Mod 1 DBMS

Uploaded by

prasheelkarkera
Copyright
© © All Rights Reserved
DATABASE MANAGEMENT SYSTEM
MODULE 1
COURSE OUTCOMES
• Understand the basic concepts of database systems.
• Develop appropriate databases by applying the various concepts of the relational model.
• Understand and apply Structured Query Language (SQL) to perform various database operations.
• Design standard databases for various real-world problems.
• Illustrate the basic concepts of transaction processing in database systems.
CHAPTER 1
Introduction to Databases
Topics
• Introduction
• Example of a Database
• Main Characteristics of Database Technology
• Additional Benefits of Database Technology
• When Not to Use a DBMS
• Data Models
• - History of data Models
• - Network Data Model
• - Hierarchical Data Model
TOPICS
• Schemas versus Instances
• Three-Schema Architecture
• Data Independence
• DBMS Languages
• DBMS Interfaces
• DBMS Component Modules
• Database System Utilities
• Classification of DBMSs
INTRODUCTION
• A database is a collection of related data.

• Data means known facts that can be recorded and that have implicit meaning.
• Structured data – text and numeric formats
• Unstructured data – audio, video, images
• Traditional databases handle structured data.
• Consider the names, telephone numbers, and addresses
of the people you know. You may have recorded this
data in an indexed address book or you may have stored
it on a hard drive, using a personal computer and
software such as Microsoft Access or Excel.

• This collection of related data with an implicit meaning is a database.
• Student database: information about student details
• University database: information about a university
• Company database: information about a company
• A database has the following implicit properties:

• A database is designed, built, and populated with data for a specific purpose.
• A database represents some aspect of the real world, sometimes called the miniworld or the universe of discourse (UoD).
• A database can be of any size and complexity.
• For example, the list of names and addresses referred to earlier
may consist of only a few hundred records, each with a simple
structure.
• An example of a large commercial database is Amazon.com.
• A database may be generated and maintained manually or it
may be computerized.
• For example, a library card catalog is a database that may be
created and maintained manually.
• A computerized database may be created and maintained either
by a group of application programs written specifically for that
task or by a database management system.
• A database management system (DBMS) is a collection of
programs that enables users to create and maintain a database.

• The DBMS is a general-purpose software system that facilitates the processes of defining, constructing, manipulating, and sharing databases among various users and applications.
• Defining a database involves specifying the data types, structures, and constraints of the data to be stored in the database.
• The database definition or descriptive information is also stored by the DBMS in the form of a database catalog or dictionary; it is called meta-data.
• Constructing the database is the process of storing the data
on some storage medium that is controlled by the DBMS.

• Manipulating a database includes functions such as querying the database to retrieve specific data, updating the database to reflect changes in the miniworld, and generating reports from the data.
• Sharing a database allows multiple users and programs to access the database simultaneously.
• Other important functions provided by the DBMS:
• Protecting the database:
• system protection against hardware or software malfunction (or crashes), and
• security protection against unauthorized or malicious access.
• Maintaining it over a long period of time.
• The database and DBMS software together form a database system.

A Simple Database Environment
EXAMPLE - University database
• We must also specify a data type for each data element
within a record. For example, we can specify that
• Name of STUDENT is a string of alphabetic characters,
• Student_number of STUDENT is an integer, and
• Grade of GRADE_REPORT is a single character from the set
{‘A’, ‘B’, ‘C’, ‘D’, ‘F’, ‘I’}.
• Database manipulation involves querying and updating.
Examples of queries are as follows:
■ Retrieve the transcript—a list of all courses and grades—of
‘Smith’
■ List the names of students who took the section of the
‘Database’ course offered in fall 2008 and their grades in that
section
■ List the prerequisites of the ‘Database’ course
• Examples of updates include the following:
■ Change the class of ‘Smith’ to sophomore
■ Create a new section for the ‘Database’ course for this
semester
■ Enter a grade of ‘A’ for ‘Smith’ in the ‘Database’ section of last
semester
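The queries and updates listed above can be tried out in SQL. The following is a minimal sketch using Python's built-in sqlite3 module. STUDENT and GRADE_REPORT are named in the example; the COURSE, SECTION, and PREREQUISITE tables, all column names, and the sample rows are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class TEXT);
CREATE TABLE COURSE (Course_number TEXT PRIMARY KEY, Course_name TEXT);
CREATE TABLE SECTION (Section_id INTEGER PRIMARY KEY, Course_number TEXT,
                      Semester TEXT, Year INTEGER);
CREATE TABLE GRADE_REPORT (Student_number INTEGER, Section_id INTEGER,
    Grade TEXT CHECK (Grade IN ('A', 'B', 'C', 'D', 'F', 'I')));
CREATE TABLE PREREQUISITE (Course_number TEXT, Prerequisite_number TEXT);

INSERT INTO STUDENT VALUES (17, 'Smith', 'freshman'), (8, 'Brown', 'sophomore');
INSERT INTO COURSE VALUES ('CS3380', 'Database'), ('CS3320', 'Data Structures');
INSERT INTO SECTION VALUES (135, 'CS3380', 'Fall', 2008);
INSERT INTO GRADE_REPORT VALUES (8, 135, 'A');
INSERT INTO PREREQUISITE VALUES ('CS3380', 'CS3320');
""")

# Query: students who took the Fall 2008 'Database' section, with their grades.
took_database = conn.execute("""
    SELECT s.Name, g.Grade
    FROM STUDENT s
    JOIN GRADE_REPORT g ON g.Student_number = s.Student_number
    JOIN SECTION sec ON sec.Section_id = g.Section_id
    JOIN COURSE c ON c.Course_number = sec.Course_number
    WHERE c.Course_name = 'Database' AND sec.Semester = 'Fall' AND sec.Year = 2008
""").fetchall()

# Query: prerequisites of the 'Database' course.
prerequisites = conn.execute("""
    SELECT p.Prerequisite_number
    FROM PREREQUISITE p JOIN COURSE c ON c.Course_number = p.Course_number
    WHERE c.Course_name = 'Database'
""").fetchall()

# Updates: change Smith's class and enter a grade for Smith.
conn.execute("UPDATE STUDENT SET Class = 'sophomore' WHERE Name = 'Smith'")
conn.execute("INSERT INTO GRADE_REPORT VALUES (17, 135, 'A')")

# Query: Smith's transcript (courses and grades).
smith_transcript = conn.execute("""
    SELECT c.Course_name, g.Grade
    FROM STUDENT s
    JOIN GRADE_REPORT g ON g.Student_number = s.Student_number
    JOIN SECTION sec ON sec.Section_id = g.Section_id
    JOIN COURSE c ON c.Course_number = sec.Course_number
    WHERE s.Name = 'Smith'
""").fetchall()
smith_class = conn.execute(
    "SELECT Class FROM STUDENT WHERE Name = 'Smith'").fetchone()[0]
```

The CHECK clause on Grade is one way to express the "single character from a fixed set" data type described earlier.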
Characteristics of the Database Approach

• The main characteristics of the database approach versus the file-processing approach are the following:
✔ Self-describing nature of a database system
✔ Insulation between programs and data, and data abstraction
✔ Support of multiple views of the data
✔ Sharing of data and multiuser transaction processing
Self-Describing Nature of a Database System
DB approach
• Database system contains
• the database and
• a complete definition or description of the database structure and constraints 🡪
stored in a database catalog.
• DBMS catalog contains information such as
🡪 the structure of each file
🡪 the type and storage format of each data item
🡪 and various constraints on the data.
• The information stored in the catalog is called meta-data
• Whenever a request is made to access, say, the Name of a STUDENT record, the
DBMS software refers to the catalog to determine the structure of the STUDENT
file and the position and size of the Name data item within a STUDENT record
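The idea that the catalog is stored alongside the data can be observed in most systems. A minimal sketch, assuming SQLite (through Python's sqlite3 module) as the DBMS; the STUDENT columns shown are illustrative, and other systems expose their catalog through different tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT NOT NULL)")

# The catalog is itself queryable: sqlite_master holds every table definition.
catalog_tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default_value, pk)
student_columns = [(row[1], row[2], bool(row[3]))
                   for row in conn.execute("PRAGMA table_info(STUDENT)")]
```

Here `catalog_tables` and `student_columns` are meta-data: they describe the STUDENT file without reading any student record.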
Traditional file processing

• In traditional file processing, data definition is typically part of the application programs themselves.
• Hence, these programs are constrained to work with only one specific database, whose structure is declared in the application programs.
• Consider a college where student data is needed by more than one department.
• For example, CSE student data is also needed by the accounts department, the transport department, etc.
• In the database approach, a single repository of data is maintained and used by all the departments in the organization.
• In the traditional file approach, each department maintains its own set of data. This causes data redundancy, and changes made by one department are not reflected in the files maintained by other departments.
Insulation between programs and data, and data abstraction
• In traditional file processing, the structure of data files is embedded in the application programs.
• Any change to the structure of a file may therefore require changing all programs that access that file.
• Ex.: to add a piece of data (e.g., DoB) in a DB system, we simply add another item to the catalog; programs that do not use the new item need not be changed.
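A minimal sketch of this program-data independence, assuming SQLite via Python's sqlite3 module; the `student_names` function stands in for an existing application program and is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("INSERT INTO STUDENT VALUES (17, 'Smith')")

def student_names(connection):
    """An 'application program': it names the items it needs, not the file layout."""
    return [row[0] for row in connection.execute(
        "SELECT Name FROM STUDENT ORDER BY Student_number")]

names_before = student_names(conn)

# Structural change: a DoB item is added; only the catalog entry changes.
conn.execute("ALTER TABLE STUDENT ADD COLUMN DoB TEXT")

names_after = student_names(conn)  # the program itself was not edited
```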
Support of Multiple Views of the Data
• Different types of users require different perspectives or views of the database.
• A view may be a subset of the database or it may contain virtual data that is derived from the database files but is not explicitly stored.
• A multiuser DBMS whose users have a variety of distinct applications must provide facilities for defining multiple views.
• For example, one user of the database may be interested only in accessing and printing the marks of each student. A second user, who is interested only in checking that students have taken all the prerequisites of each course for which they register, requires a different view.
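Views of this kind can be sketched with SQL's CREATE VIEW. The example below assumes SQLite via Python's sqlite3 module; the table layout, view names, and marks are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE GRADE_REPORT (Student_number INTEGER, Course TEXT, Marks INTEGER);
INSERT INTO STUDENT VALUES (17, 'Smith'), (8, 'Brown');
INSERT INTO GRADE_REPORT VALUES (17, 'Database', 80), (17, 'Math', 90),
                                (8, 'Database', 70);

-- A view that is a subset of the database: marks of each student.
CREATE VIEW STUDENT_MARKS AS
    SELECT s.Name, g.Course, g.Marks
    FROM STUDENT s JOIN GRADE_REPORT g ON g.Student_number = s.Student_number;

-- A view with virtual (derived) data: the average is computed, not stored.
CREATE VIEW STUDENT_AVERAGE AS
    SELECT s.Name, AVG(g.Marks) AS Average_marks
    FROM STUDENT s JOIN GRADE_REPORT g ON g.Student_number = s.Student_number
    GROUP BY s.Student_number;
""")

marks_rows = conn.execute("SELECT * FROM STUDENT_MARKS").fetchall()
averages = dict(conn.execute("SELECT Name, Average_marks FROM STUDENT_AVERAGE"))
```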
Sharing of Data and Multiuser Transaction Processing
• A multiuser DBMS, as its name implies, must allow multiple users to access the database at the same time.
• This is essential if data for multiple applications is to be integrated and maintained in a single database.
• A fundamental role of multiuser DBMS software is to ensure that concurrent transactions operate correctly and efficiently.
• A transaction is an executing program or process that includes one or more database accesses, such as reading or updating of database records.
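A small sketch of two users sharing one database, assuming SQLite via Python's sqlite3 module. Each connection plays the role of one user; the SEAT table and its contents are hypothetical. It shows only one aspect of correct concurrent operation: an uncommitted change made by one user is not visible to the other.

```python
import os
import sqlite3
import tempfile

def run_two_users():
    """Two connections (two 'users') share one database file."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "shared.db")
        user_a = sqlite3.connect(path)
        user_b = sqlite3.connect(path)
        try:
            user_a.execute(
                "CREATE TABLE SEAT (Seat_no INTEGER PRIMARY KEY, Passenger TEXT)")
            user_a.commit()

            # User A's transaction has written a row but has not committed yet.
            user_a.execute("INSERT INTO SEAT VALUES (1, 'Smith')")
            seen_before_commit = user_b.execute(
                "SELECT COUNT(*) FROM SEAT").fetchall()[0][0]

            # After the commit, the change becomes visible to user B.
            user_a.commit()
            seen_after_commit = user_b.execute(
                "SELECT COUNT(*) FROM SEAT").fetchall()[0][0]
            return seen_before_commit, seen_after_commit
        finally:
            user_a.close()
            user_b.close()

before, after = run_two_users()
```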
DATABASE USERS

Actors on the Scene
• Their job involves using a large database every day.
1. Database Administrators
2. Database Designers
3. End Users
4. System Analysts, Application Programmers, Software Engineers

Workers behind the Scene
• Their job involves maintaining the database system environment.
1. DBMS system designers and implementers
2. Tool developers
3. Operators and maintenance personnel
Actors on the Scene
Database Administrators
• In a database environment, the primary resource ->the database and the
secondary resource ->DBMS and related software.
• Database administrator (DBA) responsibilities:
1. Administering primary/secondary resources
2. Authorizing access to the database
3. Coordinating and monitoring use of database
4. Acquiring software and hardware resources as needed.
5. Monitoring the database for security breaches and poor system response times
Actors on the Scene
Database Designers
• They are involved before the database is actually
implemented and populated with data. They
1. Identify the data to be stored in the database and
2. Choose appropriate structures to represent and store the data.
3. Communicate with all prospective database users -> understand their requirements and create a design that meets these requirements.
4. Develop views of the database that meet the data and processing requirements of user groups.
Actors on the Scene
End Users :
• People whose jobs require access to the database -> for querying, updating, and generating reports.
Several categories of end users:
• Casual end users: access the database occasionally -> typically middle- or high-level managers or other occasional browsers.
• Naive or parametric end users: constantly query and update the database, using standard types of queries and updates called canned transactions.
• Sophisticated end users: engineers, scientists, and business analysts who are familiar with the DBMS and use it to meet their complex requirements.
• Standalone users: maintain personal databases by using ready-made program packages.
System Analysts, Application Programmers, Software Engineers:
• System Analysts: develop specifications for canned transactions that meet the needs of naive end users.
• Application Programmers: implement, test, document, and maintain programs that satisfy the specifications mentioned above.
• Software Engineers: should be familiar with the full range of capabilities provided by the DBMS.
Workers behind the Scene

• People who are typically not interested in the database content itself are known as workers behind the scene.
• DBMS system designers and implementers design and implement the DBMS modules and interfaces as a software package.
• Tool developers design and implement tools: software packages (e.g., packages for database design, performance monitoring, and graphical interfaces) that facilitate database modeling and design, database system design, and improved performance.
• Operators and maintenance personnel are responsible for the actual running and maintenance of the hardware and software environment for the database system.
Advantages of Using the DBMS Approach
1. Controlling Redundancy
2. Restricting Unauthorized Access
3. Providing Persistent Storage for Program Objects
4. Providing Storage Structures and Search Techniques for Efficient Query Processing
5. Providing Backup and Recovery
6. Providing Multiple User Interfaces
7. Representing Complex Relationships Among Data
8. Enforcing Integrity Constraints
9. Permitting Inferencing and Actions Via Rules
Advantages of Using the DBMS Approach
Controlling Redundancy

• Controlling redundancy means avoiding storing the same data multiple times.
• Why must redundancy be controlled?
• Wastage of storage space
• Duplication of effort
• Inconsistent data
Restricting Unauthorized Access:

• When multiple users share a large database, the type of access operations each user may perform must be controlled.
• A DBMS should provide a security and authorization subsystem, which the DBA uses to create accounts and to specify account restrictions.
• The DBMS should then enforce these restrictions automatically.
• Ex.: parametric users may be allowed to access the database only through predefined apps or canned transactions.
Providing Persistent Storage for Program Objects:

Traditional Programs
• Once the program terminates -> values of program variables -> discarded
• To keep the data, it must be transferred to files.
• While reading the data, the file contents must be converted back to program variables or objects.

DBMS
• Once the program terminates -> values of program variables -> not discarded
• The DBMS stores objects permanently. Such an object is said to be persistent.
Providing Storage Structures and Search
Techniques for Efficient Query Processing:

• Database systems must provide capabilities for efficiently executing queries and updates.
• Since the database is stored on disk, the DBMS must provide specialized data structures (indexes) to speed up disk search.
• The DBMS often has a buffering or caching module that maintains parts of the database in main memory buffers.
• The query processing and optimization module of the DBMS is responsible for choosing an efficient query execution plan.
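The effect of an index on the optimizer's choice can be observed directly. A minimal sketch, assuming SQLite via Python's sqlite3 module; the table contents and the index name `idx_student_name` are made up, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class TEXT)")
conn.executemany("INSERT INTO STUDENT VALUES (?, ?, ?)",
                 [(i, f"Student{i}", "freshman") for i in range(1000)])

def plan(connection, sql, params=()):
    """Return the optimizer's plan as text; the last column of each row describes a step."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT * FROM STUDENT WHERE Name = ?"

plan_without_index = plan(conn, query, ("Student500",))  # full scan of the table
conn.execute("CREATE INDEX idx_student_name ON STUDENT (Name)")
plan_with_index = plan(conn, query, ("Student500",))     # search using the index
```

The query text is identical in both cases; only the access path chosen by the query optimizer changes.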
Providing Backup and Recovery:

• The DBMS must provide facilities for recovering from hardware or software failures.
• The backup and recovery subsystem of the DBMS is responsible for recovery.
• Ex: if the computer crashes during a complex transaction -> the recovery subsystem -> responsible for ensuring that the transaction resumes from the point at which it was interrupted, or at least that the database is restored to the state it was in before the transaction started executing.
• Disk backup is also necessary in case of a catastrophic disk failure.
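The "restore to the state before the transaction started" behavior can be imitated on a small scale. This is only a sketch, assuming SQLite via Python's sqlite3 module: a raised exception stands in for the crash, and the ACCOUNT table and `transfer` function are hypothetical. Real crash recovery relies on the DBMS log and is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ACCOUNT (Account_no INTEGER PRIMARY KEY, Balance INTEGER)")
conn.execute("INSERT INTO ACCOUNT VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(connection, source, target, amount, fail_midway=False):
    """Move money between two accounts as one transaction."""
    with connection:  # commits on success, rolls back if an exception escapes
        connection.execute(
            "UPDATE ACCOUNT SET Balance = Balance - ? WHERE Account_no = ?",
            (amount, source))
        if fail_midway:
            raise RuntimeError("simulated crash between the two updates")
        connection.execute(
            "UPDATE ACCOUNT SET Balance = Balance + ? WHERE Account_no = ?",
            (amount, target))

def balances(connection):
    return dict(connection.execute("SELECT Account_no, Balance FROM ACCOUNT"))

try:
    transfer(conn, 1, 2, 30, fail_midway=True)
except RuntimeError:
    pass
balances_after_failure = balances(conn)  # the half-finished transfer was undone

transfer(conn, 1, 2, 30)
balances_after_success = balances(conn)
```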


Providing Multiple User Interfaces:

• Multiple users 🡪 different levels of technical knowledge 🡪 DBMS should provide a variety of user interfaces.
• Ex:
• query languages 🡪 casual users,
• programming language interfaces 🡪 application programmers,
• forms and/or command codes 🡪 parametric users,
• menu-driven interfaces (GUIs) 🡪 stand-alone users.


Representing Complex Relationships Among Data:

• A database may hold a variety of data 🡪 interrelated in many ways
• The DBMS must be capable of:
🡪 Representing complex relationships among data
🡪 Retrieving and updating related data easily and efficiently
Enforcing Integrity Constraints
• Most database applications have certain integrity constraints that
must hold for the data.
• Constraints are restrictions imposed on the data.
• The simplest type of integrity constraint involves specifying a data type for each data item.
• When a record in one file must be related to records in other files, a referential integrity constraint must be maintained.
• Another type of constraint specifies uniqueness on data item values; this is known as a key or uniqueness constraint.
• Responsibility of database designers 🡪 identifying integrity constraints during database design.
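A minimal sketch of how a DBMS enforces declared constraints, assuming SQLite via Python's sqlite3 module. The DEPARTMENT and STUDENT layouts and the helper `is_rejected` are hypothetical; note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE DEPARTMENT (Dept_no INTEGER PRIMARY KEY, Dept_name TEXT NOT NULL UNIQUE);
CREATE TABLE STUDENT (
    Student_number INTEGER PRIMARY KEY,              -- key (uniqueness) constraint
    Name TEXT NOT NULL,
    Dept_no INTEGER REFERENCES DEPARTMENT (Dept_no)  -- referential integrity
);
INSERT INTO DEPARTMENT VALUES (1, 'CSE');
INSERT INTO STUDENT VALUES (17, 'Smith', 1);
""")

def is_rejected(connection, sql):
    """Return True if the DBMS refuses the statement because of a constraint."""
    try:
        with connection:
            connection.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_key = is_rejected(conn, "INSERT INTO STUDENT VALUES (17, 'Brown', 1)")
missing_department = is_rejected(conn, "INSERT INTO STUDENT VALUES (8, 'Brown', 99)")
valid_row = is_rejected(conn, "INSERT INTO STUDENT VALUES (8, 'Brown', 1)")
```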
Permitting Inferencing and Actions Via Rules and triggers

• Database systems sometimes provide capabilities for defining deduction rules for inferencing new information.
• Such systems 🡪 deductive database systems
• Triggers 🡪 rules activated by updates to a table, which result in performing additional operations on other tables, sending messages, etc.
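A trigger of this kind can be sketched as follows, assuming SQLite via Python's sqlite3 module; the EMPLOYEE and SALARY_LOG tables and the trigger name are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Emp_no INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER);
CREATE TABLE SALARY_LOG (Emp_no INTEGER, Old_salary INTEGER, New_salary INTEGER);

-- Rule activated by an update to EMPLOYEE; it performs an extra insert elsewhere.
CREATE TRIGGER log_salary_change AFTER UPDATE OF Salary ON EMPLOYEE
BEGIN
    INSERT INTO SALARY_LOG VALUES (OLD.Emp_no, OLD.Salary, NEW.Salary);
END;

INSERT INTO EMPLOYEE VALUES (1, 'Smith', 30000);
UPDATE EMPLOYEE SET Salary = 35000 WHERE Emp_no = 1;
""")

salary_log = conn.execute("SELECT * FROM SALARY_LOG").fetchall()
```

The application issued only the UPDATE; the row in SALARY_LOG was written by the DBMS as a consequence of the rule.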
A Brief History of Database Applications

• Early Database Applications:
✔ The Hierarchical and Network Models were introduced in the mid-1960s and dominated during the 1970s.
✔ A bulk of the worldwide database processing still occurs using these models.
• Relational Model based Systems:
✔ The relational model was originally introduced in 1970 and was heavily researched and experimented with at IBM Research and several universities.
• Object-oriented and emerging applications:
✔ Object-Oriented Database Management Systems (OODBMSs) were introduced in the late 1980s and early 1990s to cater to the need for complex data processing in CAD and other applications.
CHAPTER 2
OVERVIEW OF DATABASE LANGUAGES AND ARCHITECTURES
DATA MODELS
• Data abstraction 🡪 suppression of details of data organization and storage 🡪 highlighting of the essential features for an improved understanding of data.
• It is a technique for presenting data to different users in the way they perceive it, according to their knowledge or requirements.
• A data model 🡪 collection of concepts 🡪 used to describe the structure of a database 🡪 helps to achieve data abstraction
Categories of Data Models

1. High-level or conceptual data models
2. Low-level or physical data models
3. Representational (or implementation) data models
1. High-level or conceptual data models

• Provide concepts that are close to the way many users perceive data.
• Use concepts such as entities, attributes, and relationships.
• Entity 🡪 represents a real-world object or concept.
• Attribute 🡪 further describes an entity, such as the employee’s name or salary.
• Relationship 🡪 association among two or more entities, for example, a works-on relationship between an employee and a project.
• The entity-relationship model is a popular high-level conceptual data model.
2. Low-level or physical data models

• Describe how data is stored on the computer storage media, typically magnetic disks.
• Concepts provided by low-level data models are generally meant for computer specialists, not for end users.
3. Representational (or implementation) data models

• Provide concepts that may be easily understood by end users but that are not too far removed from the way data is organized in computer storage.
• Hide many details of data storage on disk.
• The relational data model is an example of a representational data model.
Schemas

• The description of a database is called the database schema, which is specified during database design and is not expected to change frequently.
• A displayed schema is called a schema diagram.
• Each object in the schema, such as STUDENT or COURSE, is called a schema construct.
• Database state (or instance or snapshot): the data stored in the database at a particular moment.
• Defining a new database = specifying its database schema to the DBMS.
• At this point, database state = empty state with no data.
• On populating or loading with the initial data 🡪 initial state.
• From then on, every time an update operation is applied to the database, we get another database state.
• At any point in time, the database has a current state.
• The DBMS is partly responsible for ensuring that every state of the database is a valid state, that is, a state that satisfies the structure and constraints specified in the schema.
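The difference between the schema and the successive database states can be traced in a few lines. A minimal sketch, assuming SQLite via Python's sqlite3 module; the STUDENT columns and the helper functions `schema` and `state` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Class TEXT)")

def schema(connection):
    """The stored description of the database (what the catalog holds)."""
    return [row[0] for row in connection.execute("SELECT sql FROM sqlite_master")]

def state(connection):
    """The data in the database at this moment."""
    return connection.execute(
        "SELECT * FROM STUDENT ORDER BY Student_number").fetchall()

schema_at_start = schema(conn)
empty_state = state(conn)                      # schema defined, no data yet

conn.execute("INSERT INTO STUDENT VALUES (17, 'Smith', 'freshman')")
initial_state = state(conn)                    # after loading the initial data

conn.execute("UPDATE STUDENT SET Class = 'sophomore' WHERE Name = 'Smith'")
current_state = state(conn)                    # every update yields a new state

schema_at_end = schema(conn)                   # the schema did not change
```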
Three schema architecture and Data Independence

The Three-Schema Architecture


• The goal of the three-schema architecture is to separate the user applications from the physical database.
• The schemas are descriptions of data.
• The actual data is stored at the physical level only.
In this architecture, schemas can be defined at the following three levels:
1. The internal level has an internal schema, which
• describes the physical storage structure of the database,
• specifies how records are stored and how they are ordered in the database,
• uses a physical data model and describes the complete details of data storage and access paths for the database.
2. The conceptual level has a conceptual schema, which
• describes the structure of the whole database for a community of users,
• hides the details of physical storage structures and concentrates on describing entities, data types, relationships, user operations, and constraints.
• A representational data model is used to describe the conceptual schema when a database system is implemented.
3. The external or view level includes a number of external schemas or user views. Each external schema
• describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group,
• is implemented using a representational data model, possibly based on an external schema design in a high-level data model.
• A different external schema is described for each user group.
• A request specified on an external schema must be transformed into a request against the conceptual schema, and then into a request on the internal schema.
• The transformation of requests between the external level and the conceptual level is called the external/conceptual mapping.
• The transformation between the conceptual level and the internal level, which puts the request in a form understood by the physical level, is called the conceptual/internal mapping.
Data Independence

• It is the capacity to change the schema at one level of a database system without having to change the schema at the next higher level.
• We can define two types of data independence:
1. Logical Data Independence
2. Physical Data Independence
1. Logical data independence

• It is the capacity to change the conceptual schema without having to change external schemas or application programs.
• The conceptual schema may be changed to:
• expand the database,
• change constraints, or
• reduce the database.
2. Physical data independence
• It is the capacity to change the internal schema without having to change the conceptual schema.
• The external schemas need not be changed either.
• The internal schema may be changed to improve the performance of retrieval or update.
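Both kinds of data independence can be sketched with a view acting as the external schema. The example assumes SQLite via Python's sqlite3 module; all table, view, and index names are hypothetical. The application query is written once and never edited, while first the conceptual schema and then the internal schema change underneath it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (Student_number INTEGER PRIMARY KEY, Name TEXT, Phone TEXT);
INSERT INTO STUDENT VALUES (17, 'Smith', '555-0100');
CREATE VIEW STUDENT_CONTACT AS SELECT Name, Phone FROM STUDENT;  -- external schema
""")

def application_query(connection):
    """The application knows only the external view, never the base tables."""
    return connection.execute("SELECT Name, Phone FROM STUDENT_CONTACT").fetchall()

result_before = application_query(conn)

# Logical data independence: the conceptual schema is restructured into two
# tables; only the external/conceptual mapping (the view definition) is rewritten.
conn.executescript("""
CREATE TABLE STUDENT_INFO (Student_number INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE STUDENT_PHONE (Student_number INTEGER, Phone TEXT);
INSERT INTO STUDENT_INFO SELECT Student_number, Name FROM STUDENT;
INSERT INTO STUDENT_PHONE SELECT Student_number, Phone FROM STUDENT;
DROP VIEW STUDENT_CONTACT;
DROP TABLE STUDENT;
CREATE VIEW STUDENT_CONTACT AS
    SELECT i.Name, p.Phone
    FROM STUDENT_INFO i JOIN STUDENT_PHONE p ON p.Student_number = i.Student_number;
""")
result_after_restructure = application_query(conn)

# Physical data independence: a new access path is added at the internal level;
# neither the tables, the view, nor the query changes.
conn.execute("CREATE INDEX idx_phone_student ON STUDENT_PHONE (Student_number)")
result_after_index = application_query(conn)
```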
DBMS Languages
• DBMS packages provide an integrated form of the languages described here in a single language called Structured Query Language (SQL).

• Data definition language (DDL):
• If no strict separation is maintained between the conceptual and internal levels:
• used to define both schemas,
• used by the DBA and database designers.
• The DDL compiler processes DDL statements and stores the schema descriptions in the DBMS catalog.
• Where a clear separation is maintained between the conceptual and internal levels:
• DDL is used to specify the conceptual schema only.
• View definition language (VDL): used to specify user views and their mappings to the conceptual schema.
• Data manipulation language (DML): provides a set of operations such as retrieval, insertion, deletion, and modification of the data.
• In practice, a comprehensive integrated language is used that includes constructs for
• conceptual schema definition,
• view definition, and
• data manipulation.
• Storage definition is typically kept separate, since it is used for defining physical storage structures to fine-tune the performance of the database system, which is usually done by the DBA staff.
Two main types of DMLs

A high-level or nonprocedural DML
• Used to specify complex database operations concisely
• Can specify and retrieve many records in a single DML statement
• Such DMLs are called set-at-a-time or set-oriented DMLs

A low-level or procedural DML
• Embedded in a general-purpose programming language
• Used to retrieve and process each record from a set of records
• Low-level DMLs are also called record-at-a-time DMLs
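The two styles can be contrasted on the same task. This is only a sketch, assuming SQLite via Python's sqlite3 module, with a hypothetical EMPLOYEE table. The second variant imitates record-at-a-time processing with a host-language loop; it still uses SQL for each individual access, so it is an analogy rather than a true procedural DML.

```python
import sqlite3

def fresh_database():
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE EMPLOYEE (Emp_no INTEGER PRIMARY KEY, Salary INTEGER)")
    connection.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)",
                           [(1, 1000), (2, 2000), (3, 3000)])
    return connection

def salaries(connection):
    return [row[0] for row in connection.execute(
        "SELECT Salary FROM EMPLOYEE ORDER BY Emp_no")]

# Set-at-a-time: one declarative statement touches every qualifying record.
set_db = fresh_database()
set_db.execute("UPDATE EMPLOYEE SET Salary = Salary + 100 WHERE Salary < 3000")
set_result = salaries(set_db)

# Record-at-a-time: the host program fetches each record and processes it in a loop.
record_db = fresh_database()
for emp_no, salary in record_db.execute(
        "SELECT Emp_no, Salary FROM EMPLOYEE").fetchall():
    if salary < 3000:
        record_db.execute("UPDATE EMPLOYEE SET Salary = ? WHERE Emp_no = ?",
                          (salary + 100, emp_no))
record_result = salaries(record_db)
```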
DBMS Interfaces
• Menu-Based Interfaces for Web Clients or Browsing present the user with lists of options (called menus) that lead the user through the formulation of a request, using menu and submenu options.
• Forms-Based Interfaces display a form to each user. Users can fill out all of the form entries to insert new data, or they can fill out only certain entries, in which case the DBMS will retrieve matching data for the remaining entries.
• Graphical User Interfaces
• display a schema to the user in diagrammatic form.
• The user can then specify a query by manipulating the diagram.
• GUIs utilize both menus and forms. Most GUIs use a pointing device.
• Natural Language Interfaces accept requests written in English or some other language and attempt to understand them.
• Keyword-based Database Search is similar to Web search engines, which accept strings of natural language words and match them with documents at specific sites.
• Speech Input and Output use speech as an input query and speech as an answer to a question or result. The speech input is detected using a library of predefined words and used to set up the parameters that are supplied to the queries.
• Interfaces for Parametric Users: parametric users, such as bank tellers, often have a small set of operations that they must perform repeatedly.
• Interfaces for the DBA: the DBA uses privileged commands. These include commands for
• creating accounts,
• setting system parameters,
• granting account authorization,
• changing a schema, and
[Figure: DBMS component modules. Top part: various users of the database environment and their interfaces. Lower part: internal modules of the DBMS responsible for storage of data and processing of transactions.]
• Application programmers 🡪 develop application programs
• DBA 🡪 has the privilege to modify the database and to grant access permission to the database using privileged commands.
• Casual users 🡪 manager level 🡪 interact with the database using queries.
• Parametric users 🡪 perform the same operations frequently
• The database and the DBMS catalog are usually stored on disk.
• Access to the disk is controlled by the operating system, which schedules disk input/output.
• Stored data manager 🡪 controls access to DBMS information that is stored on disk, whether it is part of the database or the catalog 🡪 data transfer
• The top part of the figure refers to the various users of the database environment and their interfaces.
• The DDL compiler 🡪 processes schema definitions, specified in the DDL, 🡪 and stores descriptions of the schemas (meta-data) in the DBMS catalog
• The catalog includes information such as the names and sizes of files, names and data types of data items, storage details of each file, mapping information among schemas, and constraints.
• Query compiler 🡪 handles high-level queries
• Precompiler 🡪 extracts DML commands from an application program
• DML compiler 🡪 compilation of DML commands into object code
• Runtime database processor 🡪 handles database access at run time
• Stored data manager 🡪 data transfer between disk and main memory
• Application programmers
✔ write programs in host languages such as Java, C, or C++ that are submitted to a precompiler
✔ the precompiler extracts DML commands from the application program
✔ these commands are sent to the DML compiler for compilation
✔ the rest of the program is sent to the host language compiler
✔ the object code for the DML commands and the rest of the program are linked, forming a canned transaction
✔ An example is a bank withdrawal transaction where the account number and the amount may be supplied as parameters
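A canned transaction of this kind can be sketched as a function with fixed logic and runtime parameters. The example assumes SQLite via Python's sqlite3 module rather than a precompiled host-language program; the ACCOUNT table, the `withdraw` function, and its validation rules are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ACCOUNT (Account_no INTEGER PRIMARY KEY, Balance INTEGER)")
conn.execute("INSERT INTO ACCOUNT VALUES (1001, 500)")
conn.commit()

def withdraw(connection, account_no, amount):
    """Canned transaction: the logic is fixed, only the parameter values vary per call."""
    with connection:  # one transaction: commit on success, roll back on error
        row = connection.execute(
            "SELECT Balance FROM ACCOUNT WHERE Account_no = ?",
            (account_no,)).fetchone()
        if row is None:
            raise ValueError("unknown account")
        if amount <= 0 or amount > row[0]:
            raise ValueError("invalid amount")
        connection.execute(
            "UPDATE ACCOUNT SET Balance = Balance - ? WHERE Account_no = ?",
            (amount, account_no))
        return row[0] - amount

new_balance = withdraw(conn, 1001, 200)
```

A parametric user such as a bank teller would supply only the account number and the amount; the SQL text itself never changes.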
In the lower part of the figure,
✔ the runtime database processor executes
(1) the privileged commands,
(2) the executable query plans, and
(3) the canned transactions with runtime parameters.
✔ It works with the system catalog and may update it with statistics.
✔ It also works with the stored data manager, which in turn uses basic operating system services for carrying out low-level input/output (read/write) operations between the disk and main memory.
✔ The runtime database processor handles other aspects of data transfer, such as management of buffers in main memory. The concurrency control and backup and recovery systems are integrated into the working of the runtime database processor for purposes of transaction management.
Chapter 6
BASIC SQL
•SQL stands for Structured Query Language
•SQL is now the standard language for commercial relational DBMSs
•SQL is a comprehensive database language:
• It has statements for data definitions, queries, and updates.
• Hence, it is both a DDL and a DML.
•SQL has facilities for
• defining views on the database,
• specifying security and authorization,
• defining integrity constraints, and
• specifying transaction controls.
SQL Data Definition and Data Types

• SQL terms:
• Table 🡪 relation
• Row 🡪 tuple
• Column 🡪 attribute
Schema and Catalog Concepts in SQL
• Schema consists of tables and other constructs that belong to the same
database application

• An SQL schema is identified by a schema name and consists of
• an authorization identifier to indicate the user or account who owns the schema,
• descriptors for each element in the schema.
• Schema elements include tables, types, constraints, views, domains, and other constructs (such as authorization grants) that describe the schema.
CREATE SCHEMA COMPANY AUTHORIZATION 'Jsmith';
• The privilege to create schemas, tables, and other constructs must be explicitly granted to the relevant user accounts by the system administrator or DBA.
CATALOG
• SQL uses the concept of a catalog: a named collection of schemas.
• A catalog always contains a special schema called INFORMATION_SCHEMA, which provides information on all the schemas in the catalog and all the element descriptors in these schemas.
• Schemas within the same catalog can also share certain elements, such as type and domain definitions.
CREATE TABLE Command in SQL
• specify a new relation by
• giving it a name and
• specifying its attributes and
• initial constraints.
• The attributes are specified first, and each attribute is given
• a name,
• a data type to specify its domain of values, and
• possibly attribute constraints, such as NOT NULL
• The key, entity integrity, and referential integrity constraints can be
specified within the CREATE TABLE statement after the attributes are
declared, or they can be added later using the ALTER TABLE command
In an SQL table, the attributes are ordered, but rows (tuples) are not considered to be ordered.
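A minimal CREATE TABLE sketch with attribute constraints, a key, and a referential integrity constraint, run through Python's sqlite3 module. The DEPARTMENT and EMPLOYEE tables and their columns are assumptions for illustration. SQLite accepts type names such as VARCHAR(15) but does not enforce the declared length, so other systems may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE DEPARTMENT (
    Dnumber INTEGER     NOT NULL,
    Dname   VARCHAR(15) NOT NULL,
    PRIMARY KEY (Dnumber),
    UNIQUE (Dname)
);
CREATE TABLE EMPLOYEE (
    Ssn  CHAR(9)     NOT NULL,
    Name VARCHAR(30) NOT NULL,
    Dno  INTEGER     NOT NULL,
    PRIMARY KEY (Ssn),
    FOREIGN KEY (Dno) REFERENCES DEPARTMENT (Dnumber)
);
""")
conn.execute("INSERT INTO DEPARTMENT VALUES (5, 'Research')")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                 [("222", "Wong", 5), ("111", "Smith", 5)])

# Attributes keep the order in which they were declared.
cursor = conn.execute("SELECT * FROM EMPLOYEE")
employee_columns = [column[0] for column in cursor.description]

# Rows have no inherent order; ask for one explicitly with ORDER BY.
names_sorted = [row[0] for row in conn.execute(
    "SELECT Name FROM EMPLOYEE ORDER BY Name")]
```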
Attribute Data Types and Domains in SQL

The basic data types available for attributes include numeric, character string, bit string, Boolean, date, and time.
• Numeric
• Character-string
• Bit-string
• Boolean
• Date
• Time
Basic Data Types

Numeric data types include
• integer numbers of various sizes (INTEGER or INT, and SMALLINT)
• floating-point (real) numbers of various precision (FLOAT or REAL, and DOUBLE PRECISION)
• formatted numbers, declared by using DECIMAL(i,j), DEC(i,j), or NUMERIC(i,j), where i is the precision (the total number of decimal digits) and j is the scale (the number of digits after the decimal point)
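The meaning of precision and scale can be checked with a small calculation. The sketch below uses Python's decimal module, not a DBMS; the function `store_decimal` is hypothetical, and the choice to round half up and to reject oversized values is an assumption, since systems differ in how they handle values that do not fit.

```python
from decimal import Decimal, ROUND_HALF_UP

def store_decimal(value, precision, scale):
    """Round value to `scale` fractional digits and check it fits DECIMAL(precision, scale)."""
    quantum = Decimal(1).scaleb(-scale)  # e.g. scale 2 gives a step of 0.01
    rounded = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    # At most (precision - scale) digits may appear before the decimal point.
    if abs(rounded) >= Decimal(10) ** (precision - scale):
        raise ValueError(f"{value} does not fit DECIMAL({precision},{scale})")
    return rounded

largest_value = store_decimal("999.99", 5, 2)   # 5 digits in total, 2 after the point
rounded_value = store_decimal("12.345", 5, 2)   # extra fractional digits are rounded
```

With DECIMAL(5,2), a value such as 1000 needs four digits before the decimal point and is rejected by this sketch.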
Character-string data types
• fixed length—CHAR(n) or CHARACTER(n), where n is the number of
characters
• varying length—VARCHAR(n), where n is the maximum number of
characters
• When specifying a literal string value, it is placed between single quotation
marks (apostrophes), and it is case sensitive
• For fixed length strings, a shorter string is padded with blank characters to
the right
• For example, if the value 'Smith' is for an attribute of type CHAR(10), it is padded with five blank characters to become 'Smith     ' if needed
• Padded blanks are generally ignored when strings are compared
• Another variable-length string data type called CHARACTER LARGE OBJECT
or CLOB is also available to specify columns that have large text values, such
as documents
• The CLOB maximum length can be specified in kilobytes (K), megabytes (M),
or gigabytes (G)
• For example, CLOB(20M) specifies a maximum length of 20 megabytes
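The padding and comparison rules for fixed-length strings can be imitated in a few lines. This is a plain Python sketch, not DBMS behavior; the helper names `to_char` and `char_equal` are hypothetical.

```python
def to_char(value, n):
    """Store value in a CHAR(n) column: pad a shorter string with blanks on the right."""
    if len(value) > n:
        raise ValueError(f"{value!r} is longer than CHAR({n})")
    return value.ljust(n)

def char_equal(left, right):
    """Compare two strings while ignoring padded (trailing) blanks; case still matters."""
    return left.rstrip(" ") == right.rstrip(" ")

stored = to_char("Smith", 10)                   # 'Smith' followed by five blanks
same_value = char_equal(stored, "Smith")        # padding is ignored
different_case = char_equal(stored, "smith")    # string literals are case sensitive
```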
Bit-string data types are either of
• fixed length n—BIT(n)—or
• varying length—BIT VARYING(n), where n is the maximum number of bits.
• The default for n, the length of a character string or bit string, is 1
• Literal bit strings are placed between single quotes but preceded by a B to
distinguish them from character strings; for example, B'10101'
• Another variable-length bitstring data type called BINARY LARGE OBJECT
or BLOB is also available to specify columns that have large binary values,
such as images.
• The maximum length of a BLOB can be specified in kilobits (K), megabits
(M), or gigabits (G)
• For example, BLOB(30G) specifies a maximum length of 30 gigabits.
A Boolean data type has the traditional values of TRUE or FALSE.
• In SQL, because of the presence of NULL values, a three-valued logic is
used, so a third possible value for a Boolean data type is UNKNOWN
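The effect of three-valued logic is easiest to see in a WHERE clause, where only rows whose condition evaluates to TRUE are returned. The following is a minimal sketch; the table Emp_demo and its rows are made up for illustration.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE Emp_demo (
    Name      VARCHAR(20) NOT NULL,
    Super_ssn CHAR(9)               -- NULL means no supervisor is recorded
);

INSERT INTO Emp_demo VALUES ('Borg', NULL);
INSERT INTO Emp_demo VALUES ('Wong', '888665555');

-- Returns no rows: a comparison with NULL evaluates to UNKNOWN, never TRUE.
SELECT Name FROM Emp_demo WHERE Super_ssn = NULL;

-- Returns Borg: IS NULL is the test that yields TRUE for a NULL value.
SELECT Name FROM Emp_demo WHERE Super_ssn IS NULL;

-- Returns only Wong: for Borg the comparison is UNKNOWN, so the row is filtered out.
SELECT Name FROM Emp_demo WHERE Super_ssn <> '123456789';
```

The practical rule is to test for missing values with IS NULL or IS NOT NULL rather than with = or <>.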

• The DATE data type has ten positions, and its components are YEAR,
MONTH, and DAY in the form YYYY-MM-DD
• The TIME data type has at least eight positions, with the components HOUR,
MINUTE, and SECOND in the form HH:MM:SS.
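As a rough sketch, the basic data types above could be combined in one table as follows. The table and column names are invented for illustration, and type support varies between systems: some do not provide BOOLEAN or TIME, and some do not accept the DATE '...' and TIME '...' literal forms.

```sql
-- Hypothetical table; every name here is illustrative.
CREATE TABLE Type_demo (
    Id          INT,
    Room_count  SMALLINT,
    Ratio       DOUBLE PRECISION,
    Price       DECIMAL(7,2),     -- 7 digits in total, 2 after the point: at most 99999.99
    State_code  CHAR(2),          -- fixed length
    Title       VARCHAR(50),      -- varying length, up to 50 characters
    Is_active   BOOLEAN,
    Created_on  DATE,             -- YYYY-MM-DD
    Opens_at    TIME              -- HH:MM:SS
);

INSERT INTO Type_demo
VALUES (1, 3, 0.75, 12345.67, 'TX', 'Sample row', TRUE,
        DATE '2024-01-31', TIME '09:30:00');
```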
Data type                  Description

INT(size)                  A medium integer. Signed range is from -2147483648 to 2147483647.
                           Unsigned range is from 0 to 4294967295. The size parameter specifies
                           the maximum display width (which is 255)

INTEGER(size)              Equal to INT(size)

SMALLINT(size)             A small integer. Signed range is from -32768 to 32767. Unsigned range
                           is from 0 to 65535. The size parameter specifies the maximum display
                           width (which is 255)

FLOAT(p)                   A floating point number. MySQL uses the p value to determine whether
                           to use FLOAT or DOUBLE for the resulting data type. If p is from 0 to
                           24, the data type becomes FLOAT(). If p is from 25 to 53, the data
                           type becomes DOUBLE()

DOUBLE PRECISION(size, d)  Synonym for DOUBLE(size, d)

DECIMAL(size, d)           An exact fixed-point number. The total number of digits is specified
                           in size. The number of digits after the decimal point is specified in
                           the d parameter. The maximum number for size is 65. The maximum number
                           for d is 30. The default value for size is 10. The default value for
                           d is 0.

DEC(size, d)               Equal to DECIMAL(size,d)

Data type                  Description

CHAR(size)                 A FIXED length string (can contain letters, numbers, and special
                           characters). The size parameter specifies the column length in
                           characters - can be from 0 to 255. Default is 1

VARCHAR(size)              A VARIABLE length string (can contain letters, numbers, and special
                           characters). The size parameter specifies the maximum string length
                           in characters - can be from 0 to 65535

BIT(size)                  A bit-value type. The number of bits per value is specified in size.
                           The size parameter can hold a value from 1 to 64. The default value
                           for size is 1.

Data type                  Description

DATE                       A date. Format: YYYY-MM-DD. The supported range is from '1000-01-01'
                           to '9999-12-31'

DATETIME(fsp)              A date and time combination. Format: YYYY-MM-DD hh:mm:ss. The
                           supported range is from '1000-01-01 00:00:00' to '9999-12-31
                           23:59:59'. Add DEFAULT and ON UPDATE to the column definition to get
                           automatic initialization and updating to the current date and time

TIME(fsp)                  A time. Format: hh:mm:ss. The supported range is from '-838:59:59' to
                           '838:59:59'

TIMESTAMP(fsp)             A timestamp. TIMESTAMP values are stored as the number of seconds
                           since the Unix epoch ('1970-01-01 00:00:00' UTC). Format: YYYY-MM-DD
                           hh:mm:ss. The supported range is from '1970-01-01 00:00:01' UTC to
                           '2038-01-19 03:14:07' UTC. Automatic initialization and updating to
                           the current date and time can be specified using DEFAULT
                           CURRENT_TIMESTAMP and ON UPDATE CURRENT_TIMESTAMP in the column
                           definition
Specifying Constraints in SQL

The basic constraints that can be specified as part of table creation include
• key and referential integrity constraints,
• restrictions on attribute domains and NULLs, and
• constraints on individual tuples within a relation using the CHECK clause.
The following constraints are commonly used in SQL:

● NOT NULL - Ensures that a column cannot have a NULL value
● UNIQUE - Ensures that all values in a column are different
● PRIMARY KEY - A combination of NOT NULL and UNIQUE. Uniquely
identifies each row in a table
● FOREIGN KEY - Prevents actions that would destroy links between tables
● CHECK - Ensures that the values in a column satisfy a specific condition
● DEFAULT - Sets a default value for a column if no value is specified
● CREATE INDEX - Not a constraint in itself; creates an index so that data
can be retrieved from the database more quickly
Specifying Attribute Constraints and Attribute
Defaults
NOT NULL
• Specified if NULL is not permitted for a particular attribute.
• Generally used for attributes that are
• part of the primary key, OR
• any other attributes whose values cannot be NULL
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255) NOT NULL,
Age int
);
DEFAULT
• to give a default value for an attribute
• The default value is included in any new tuple if an explicit value
is not provided for that attribute
• If no default clause is specified, the default default value is
NULL for attributes that do not have the NOT NULL constraint.

CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
City varchar(255) DEFAULT 'Bengaluru'
);
CHECK
Another type of constraint can restrict attribute or domain values using the
CHECK clause following an attribute or domain definition.

Dnumber INT NOT NULL CHECK (Dnumber > 0 AND Dnumber < 21);

-- CHECK attached to the column definition
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int CHECK (Age>=18)
);

-- The same CHECK written as a table constraint
CREATE TABLE Persons (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
CHECK (Age>=18)
);
Specifying Constraints on Tuples Using CHECK
In addition to key and referential integrity constraints, which are specified by
special keywords, other table constraints can be specified through additional
CHECK clauses at the end of a CREATE TABLE statement.
These can be called row-based constraints because they apply to each row
individually and are checked whenever a row is inserted or modified.
For example, suppose that the DEPARTMENT table in Figure 6.1 had an
additional attribute Dept_create_date, which stores the date when the
department was created.
Then we could add the following CHECK clause at the end of the CREATE
TABLE statement for the DEPARTMENT table to make sure that a manager’s
start date is later than the department creation date
CHECK (Dept_create_date <= Mgr_start_date)
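Putting the pieces together, a full statement might look like the sketch below. Dept_create_date is the additional, hypothetical attribute described above; the foreign key from Mgr_ssn to EMPLOYEE is left out so that the statement can be run on its own. Note that some systems (for example, older MySQL versions) parse CHECK clauses but do not enforce them, and some require DATE '...' literals instead of plain strings.

```sql
-- Sketch only: DEPARTMENT with the extra Dept_create_date attribute.
CREATE TABLE DEPARTMENT (
    Dname            VARCHAR(15) NOT NULL,
    Dnumber          INT         NOT NULL CHECK (Dnumber > 0 AND Dnumber < 21),
    Mgr_ssn          CHAR(9)     NOT NULL,
    Mgr_start_date   DATE,
    Dept_create_date DATE,
    PRIMARY KEY (Dnumber),
    UNIQUE (Dname),
    -- Row-based constraint: compares two attributes of the same tuple.
    CHECK (Dept_create_date <= Mgr_start_date)
);

-- Accepted: the manager starts after the department was created.
INSERT INTO DEPARTMENT
VALUES ('Research', 5, '333445555', '1988-05-22', '1980-01-01');

-- Rejected where CHECK is enforced: the manager would start before the department exists.
INSERT INTO DEPARTMENT
VALUES ('Audit', 6, '987654321', '1975-01-01', '1980-01-01');
```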
Specifying Key and Referential Integrity Constraints
PRIMARY KEY
• The PRIMARY KEY clause specifies one or more attributes that make up the
primary key of a relation.
• If a primary key has a single attribute, the clause can follow the attribute
directly.
• For example, the primary key of DEPARTMENT can be specified as follows
Dnumber INT PRIMARY KEY

-- PRIMARY KEY attached to the column definition
CREATE TABLE Persons (
PersonID int NOT NULL PRIMARY KEY,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);

-- PRIMARY KEY written as a table constraint
CREATE TABLE Persons (
PersonID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
PRIMARY KEY (PersonID)
);
UNIQUE
• The UNIQUE clause specifies alternate (unique) keys, also known as
candidate keys
• The UNIQUE clause can also be specified directly for a unique key if it is a
single attribute, as in the following example:

Dname VARCHAR(15) UNIQUE

-- UNIQUE attached to the column definition
CREATE TABLE Persons (
PersonID int NOT NULL UNIQUE,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int
);

-- UNIQUE written as a table constraint
CREATE TABLE Persons (
PersonID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
UNIQUE (PersonID)
);
FOREIGN KEY

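-- FOREIGN KEY written as a table constraint
CREATE TABLE Orders (
OrderID int NOT NULL,
OrderNumber int NOT NULL,
PersonID int,
PRIMARY KEY (OrderID),
FOREIGN KEY (PersonID) REFERENCES Persons(PersonID)
);

-- Column-level form (SQL Server syntax; standard SQL writes the column constraint
-- as PersonID int REFERENCES Persons(PersonID), without the FOREIGN KEY keywords)
CREATE TABLE Orders (
OrderID int NOT NULL PRIMARY KEY,
OrderNumber int NOT NULL,
PersonID int FOREIGN KEY REFERENCES Persons(PersonID)
);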
FOREIGN KEY

Persons (referenced table)
PersonID  LastName   FirstName  Age
1         Hansen     Ola        30
2         Svendson   Tove       23
3         Pettersen  Kari       20

Orders (referencing table)
OrderID  OrderNumber  PersonID
1        77895        3
2        44678        3
3        22456        2
4        24562        1

FOREIGN KEY
• Used to specify referential integrity
• A referential integrity constraint can be violated when
• tuples are inserted or deleted, or
• when a foreign key or primary key attribute value is updated.
• The default action for integrity violation:
• reject the update operation that will cause a violation
Referenced table
• Insert new row - No violation
• Delete/Update - May cause a violation if the tuple is referenced in the
referencing table; use ON DELETE CASCADE or ON DELETE SET NULL

Referencing table
• Insert/Update - May cause a violation
• Delete - No violation
An alternative action can be specified by attaching a referential
triggered action clause to any foreign key constraint.
The options include SET NULL, CASCADE, and SET DEFAULT.
An option must be qualified with either ON DELETE or ON UPDATE.

SET NULL → the value of the referencing attribute(s) changes to NULL
SET DEFAULT → the value of the referencing attribute(s) changes to the default value
ON DELETE CASCADE → delete all the referencing tuples
ON UPDATE CASCADE → change the value of the referencing foreign key
attribute(s) to the updated (new) primary key value for all the referencing
tuples
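A minimal sketch of a referential triggered action clause on the Persons and Orders tables follows. The tables are cut down to a few columns, and the choice of SET NULL on delete and CASCADE on update is only one possible combination. Not every DBMS supports every action (some do not offer ON UPDATE CASCADE, and some need foreign key enforcement switched on explicitly).

```sql
CREATE TABLE Persons (
    PersonID  INT          NOT NULL,
    LastName  VARCHAR(255) NOT NULL,
    PRIMARY KEY (PersonID)
);

CREATE TABLE Orders (
    OrderID     INT NOT NULL,
    OrderNumber INT NOT NULL,
    PersonID    INT,                 -- nullable, so SET NULL is possible
    PRIMARY KEY (OrderID),
    FOREIGN KEY (PersonID) REFERENCES Persons(PersonID)
        ON DELETE SET NULL
        ON UPDATE CASCADE
);

INSERT INTO Persons VALUES (3, 'Pettersen');
INSERT INTO Orders  VALUES (1, 77895, 3);

-- ON UPDATE CASCADE: order 1 now references PersonID 30.
UPDATE Persons SET PersonID = 30 WHERE PersonID = 3;

-- ON DELETE SET NULL: order 1 is kept, but its PersonID becomes NULL.
DELETE FROM Persons WHERE PersonID = 30;
```

Without the two ON clauses, both the UPDATE and the DELETE would be rejected, which is the default action described earlier.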
Giving Names to Constraints

• A constraint may be given a constraint name, following the keyword CONSTRAINT.
• The names of all constraints within a particular schema must be unique.
• A constraint name is used to identify a particular constraint in case the
constraint must be dropped later and replaced with another constraint
• Giving names to constraints is optional.

ALTER TABLE table_name ADD CONSTRAINT constraint_name constraint_clause
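For instance, constraints could be named at table creation and one of them replaced later, as in the sketch below. The constraint names Persons_pk and Age_adult are invented for illustration, and the exact syntax for dropping a constraint varies between systems.

```sql
CREATE TABLE Persons (
    PersonID INT          NOT NULL,
    LastName VARCHAR(255) NOT NULL,
    Age      INT,
    CONSTRAINT Persons_pk PRIMARY KEY (PersonID),
    CONSTRAINT Age_adult  CHECK (Age >= 18)
);

-- Replace the rule later: drop the named constraint, then add a new one.
ALTER TABLE Persons DROP CONSTRAINT Age_adult;
ALTER TABLE Persons ADD CONSTRAINT Age_adult CHECK (Age >= 21);
```

Without a name, the constraint would receive a system-generated identifier, which is harder to refer to in a later ALTER TABLE statement.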

Identify the constraint names in EMPLOYEE and DEPARTMENT Tables


Basic Retrieval Queries in SQL

The SELECT-FROM-WHERE Structure of Basic SQL Queries


SELECT <attribute list>
FROM <table list>
WHERE <condition>;
where
■ <attribute list> is a list of attribute names whose values are to be retrieved by
the query.
■ <table list> is a list of the relation names required to process the query.
■ <condition> is a conditional (Boolean) expression that identifies the tuples
to be retrieved by the query.
EXAMPLES
1. Retrieve the birth date and address of the employee(s) whose name is
‘John B. Smith’.

SELECT Bdate, Address
FROM EMPLOYEE
WHERE Fname='John' AND Minit='B' AND Lname='Smith';
2. Retrieve the name and address of all employees who work for the ‘Research’
department
SELECT Fname, Lname, Address
FROM EMPLOYEE, DEPARTMENT
WHERE Dname='Research' AND Dnumber=Dno;

1. Dname = 'Research' is a selection condition on the DEPARTMENT table
2. Dnumber = Dno is called a join condition, because it combines two
tuples: one from DEPARTMENT and one from EMPLOYEE
3. For every project located in ‘Stafford’, list the project number, the controlling
department number, and the department manager’s last name, address, and
birth date.

SELECT Pnumber, Dnum, Lname, Address, Bdate
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Plocation='Stafford';

Each tuple in the result will be a combination of one project, one department,
and one employee that satisfies the join conditions.
Ambiguous Attribute Names, Aliasing,
Renaming, and Tuple Variables

• In SQL, the same name can be used for two or more attributes as long
as the attributes are in different relations.
• If this is the case, and a multi-table query refers to two or more
attributes with the same name, we must qualify the attribute name
with the relation name to prevent ambiguity.
• This is done by prefixing the relation name to the attribute name
and separating the two by a period.
Examples
Retrieve the name and address of all students who belong to the ‘CSE’
department

STUDENT
Name USN Bdate Address Sex CR_USN DNo

DEPARTMENT
Name DNo CR_USN

SELECT STUDENT.Name, Address
FROM STUDENT, DEPARTMENT
WHERE DEPARTMENT.Name='CSE' AND DEPARTMENT.DNo=STUDENT.DNo;
For each employee, retrieve the employee’s first and last name and the first
and last name of his or her immediate supervisor.

SELECT E.Fname, E.Lname, S.Fname, S.Lname
FROM EMPLOYEE AS E, EMPLOYEE AS S
WHERE E.Super_ssn = S.Ssn;
In this case, we are required to declare alternative relation names E and S, called
aliases or tuple variables, for the EMPLOYEE relation.
An alias can follow the keyword AS.
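To see what the two tuple variables do, the query can be tried on a few rows. The sketch below uses a cut-down EMPLOYEE table with three sample rows; the output column names (Emp_first and so on) are invented here to tell the two copies of Fname and Lname apart.

```sql
-- Cut-down EMPLOYEE table for illustration; only the columns the query needs.
CREATE TABLE EMPLOYEE (
    Fname     VARCHAR(15) NOT NULL,
    Lname     VARCHAR(15) NOT NULL,
    Ssn       CHAR(9)     NOT NULL PRIMARY KEY,
    Super_ssn CHAR(9)
);

INSERT INTO EMPLOYEE VALUES ('James',    'Borg',  '888665555', NULL);
INSERT INTO EMPLOYEE VALUES ('Franklin', 'Wong',  '333445555', '888665555');
INSERT INTO EMPLOYEE VALUES ('John',     'Smith', '123456789', '333445555');

-- E ranges over employees, S over the same table in the role of supervisor.
SELECT E.Fname AS Emp_first, E.Lname AS Emp_last,
       S.Fname AS Sup_first, S.Lname AS Sup_last
FROM   EMPLOYEE AS E, EMPLOYEE AS S
WHERE  E.Super_ssn = S.Ssn;
-- Two rows result, in no guaranteed order:
--   Franklin Wong, supervised by James Borg
--   John Smith, supervised by Franklin Wong
-- James Borg is not listed as an employee because his Super_ssn is NULL.
```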
Retrieve all the attribute values of any EMPLOYEE who works in
DEPARTMENT number 5

SELECT *
FROM EMPLOYEE
WHERE Dno = 5;

Retrieve all the attributes of an EMPLOYEE and the attributes of the
DEPARTMENT in which he or she works for every employee of the ‘Research’
department

SELECT *
FROM EMPLOYEE, DEPARTMENT
WHERE Dname = 'Research' AND Dno = Dnumber;
• SQL has directly incorporated some of the set operations from
mathematical set theory
• UNION
• EXCEPT
• INTERSECT
• The relations resulting from these set operations are sets of
tuples; that is, duplicate tuples are eliminated from the result
• These set operations apply only to type-compatible relations:
• the two relations on which we apply the operation must have the same
attributes, and the attributes must appear in the same order in
both relations
Make a list of all project numbers for projects that involve an employee whose
last name is ‘Smith’, either as a worker or as a manager of the department that
controls the project.

( SELECT DISTINCT Pnumber
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE Dnum = Dnumber AND Mgr_ssn = Ssn
AND Lname = 'Smith' )
UNION
( SELECT DISTINCT Pnumber
FROM PROJECT, WORKS_ON, EMPLOYEE
WHERE Pnumber = Pno AND Essn = Ssn AND
Lname = 'Smith' );
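The three set operations can be compared side by side on a small example. The two single-column tables below are hypothetical and exist only for this sketch; UNION ALL, which keeps duplicates, is included for contrast. Support varies: some systems spell EXCEPT as MINUS, and some older versions lack INTERSECT and EXCEPT.

```sql
-- Two hypothetical, type-compatible tables of project numbers.
CREATE TABLE Managed_projects (Pnumber INT);
CREATE TABLE Worked_projects  (Pnumber INT);

INSERT INTO Managed_projects VALUES (1);
INSERT INTO Managed_projects VALUES (2);
INSERT INTO Worked_projects  VALUES (2);
INSERT INTO Worked_projects  VALUES (2);
INSERT INTO Worked_projects  VALUES (3);

-- UNION: 1, 2, 3 (duplicates eliminated)
SELECT Pnumber FROM Managed_projects
UNION
SELECT Pnumber FROM Worked_projects;

-- UNION ALL: 1, 2, 2, 2, 3 (duplicates kept)
SELECT Pnumber FROM Managed_projects
UNION ALL
SELECT Pnumber FROM Worked_projects;

-- INTERSECT: 2 (in both tables)
SELECT Pnumber FROM Managed_projects
INTERSECT
SELECT Pnumber FROM Worked_projects;

-- EXCEPT: 1 (managed but not worked on)
SELECT Pnumber FROM Managed_projects
EXCEPT
SELECT Pnumber FROM Worked_projects;
```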
Substring Pattern Matching and Arithmetic Operators

• The first feature allows comparison conditions on only parts of a
character string, using the LIKE comparison operator.
• This can be used for string pattern matching.
• Partial strings are specified using two reserved characters:
• % replaces an arbitrary number of zero or more characters,
• underscore (_) replaces a single character.

Example:
Retrieve all employees whose address is in Houston, Texas
SELECT Fname, Lname
FROM EMPLOYEE
WHERE Address LIKE '%Houston,TX%';
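The underscore wildcard can be illustrated in the same way. The sketch below uses a hypothetical table in which the birth date is stored as a 10-character string of the form YYYY-MM-DD; with a true DATE column, some systems require a conversion to a character string before LIKE can be applied.

```sql
-- Hypothetical table for illustration; Bdate is kept as text on purpose.
CREATE TABLE Emp_like_demo (
    Lname   VARCHAR(15),
    Bdate   CHAR(10),          -- YYYY-MM-DD stored as a string
    Address VARCHAR(30)
);

INSERT INTO Emp_like_demo VALUES ('Wong',   '1955-12-08', '638 Voss, Houston, TX');
INSERT INTO Emp_like_demo VALUES ('Smith',  '1965-01-09', '731 Fondren, Houston, TX');
INSERT INTO Emp_like_demo VALUES ('Zelaya', '1968-01-19', '3321 Castle, Spring, TX');

-- % matches any run of characters: returns Wong and Smith.
SELECT Lname FROM Emp_like_demo WHERE Address LIKE '%Houston%';

-- _ matches exactly one character. The third character must be 5,
-- so only birth dates in the 1950s match: returns Wong.
SELECT Lname FROM Emp_like_demo WHERE Bdate LIKE '__5_______';
```

A pattern must match the whole string, so spacing and punctuation inside the pattern have to agree with the stored values.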
Show the resulting salaries if every employee
working on the ‘ProductX’ project is given a 10%
raise.

SELECT E.Fname, E.Lname, 1.1 * E.Salary AS Increased_sal
FROM EMPLOYEE AS E, WORKS_ON AS W, PROJECT AS P
WHERE E.Ssn = W.Essn AND W.Pno = P.Pnumber AND P.Pname = 'ProductX';
Retrieve all employees in department 5 whose salary is between $30,000 and
$40,000

SELECT *
FROM EMPLOYEE
WHERE (Salary BETWEEN 30000 AND 40000) AND Dno = 5;

The condition (Salary BETWEEN 30000 AND 40000) is equivalent to the condition
(Salary >= 30000) AND (Salary <= 40000)


Retrieve a list of employees and the projects they are working on, ordered by
department and, within each department, ordered alphabetically by last
name, then first name.

SELECT D.Dname, E.Lname, E.Fname, P.Pname
FROM DEPARTMENT AS D, EMPLOYEE AS E, WORKS_ON AS W, PROJECT AS P
WHERE D.Dnumber = E.Dno AND E.Ssn = W.Essn AND W.Pno = P.Pnumber
ORDER BY D.Dname, E.Lname, E.Fname;

The default order is ascending order of values. We can specify the
keyword DESC if we want to see the result in descending order of values.
ORDER BY D.Dname DESC, E.Lname ASC, E.Fname ASC
INSERT, DELETE, and UPDATE Statements in SQL

INSERT INTO EMPLOYEE
VALUES ('Richard', 'K', 'Marini', '653298653', '1962-12-30',
'98 Oak Forest, Katy, TX', 'M', 37000, '653298653', 4);

INSERT INTO EMPLOYEE (Fname, Lname, Dno, Ssn)
VALUES ('Richard', 'Marini', 4, '653298653');
The DELETE command removes tuples from a relation.
The deletion may propagate to tuples in other relations if referential triggered
actions are specified in the referential integrity constraints of the DDL.
DELETE FROM EMPLOYEE WHERE Lname='Wong';
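How many tuples a DELETE removes depends entirely on its WHERE clause. The following sketch shows three cases on a cut-down, hypothetical version of EMPLOYEE with made-up rows.

```sql
-- Cut-down, hypothetical version of EMPLOYEE for this sketch.
CREATE TABLE EMPLOYEE (
    Lname VARCHAR(15) NOT NULL,
    Ssn   CHAR(9)     NOT NULL PRIMARY KEY,
    Dno   INT         NOT NULL
);

INSERT INTO EMPLOYEE VALUES ('Smith',   '123456789', 5);
INSERT INTO EMPLOYEE VALUES ('Wong',    '333445555', 5);
INSERT INTO EMPLOYEE VALUES ('Wallace', '987654321', 4);

-- Ssn is the key, so at most one tuple is removed (Smith).
DELETE FROM EMPLOYEE WHERE Ssn = '123456789';

-- Removes every remaining tuple of department 5 (Wong).
DELETE FROM EMPLOYEE WHERE Dno = 5;

-- No WHERE clause: removes all tuples. The table itself stays, now empty.
DELETE FROM EMPLOYEE;
```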
The UPDATE command is used to modify attribute values of one or more
selected tuples.
The SET clause in the UPDATE command specifies the attributes to be modified
and their new values.
UPDATE EMPLOYEE
SET Salary = Salary * 1.1
WHERE Dno = 5;

UPDATE PROJECT
SET Plocation = 'Bellaire', Dnum = 5
WHERE Pnumber=10;
CHAPTER 3

CONCEPTUAL DATA MODELLING USING ENTITIES AND RELATIONSHIPS
Entity Types, Entity Sets, Attributes, and Keys
• The ER model describes data as entities, relationships, and attributes.
Entities and Attributes
• An entity is a thing or object in the real world with an independent existence.
• An entity may be
  - an object with a physical existence (for example, a particular person, car, house, or
    employee), or
  - an object with a conceptual existence (for instance, a company, a job, or a
    university course)
• Each entity has attributes: the particular properties that describe it.
For example, an EMPLOYEE entity may be described by the employee’s
name, age, address, salary, and job.
• The figure above shows two entities and the values of their attributes.
• The EMPLOYEE entity e1 has four attributes: Name, Address, Age,
and Home_phone; their values are ‘John Smith,’ ‘2311 Kirby, Houston,
Texas 77001’, ‘55’, and ‘713-749-2630’, respectively.
• The COMPANY entity c1 has three attributes: Name, Headquarters,
and President; their values are ‘Sunco Oil’, ‘Houston’, and ‘John Smith’,
respectively.
Attributes

Composite attributes
• Can be divided into smaller subparts.
• Ex: Name → First Name, Middle Name, Last Name

Simple attributes
• Attributes that are not divisible.
• Ex: Weight → cannot be further divided
Attributes

Single-valued
• Have a single value for a particular entity.
• Ex: Age → single-valued attribute of a person

Multivalued
• An entity can have multiple values for that attribute.
• Ex: College Degree, languages known → multivalued attributes of a person
Attributes
Derived attribute
• Can be derived from other attributes.
• Ex: Age → can be derived from date of birth

Stored attribute
• An attribute from which the values of other attributes are derived.
• Ex: BirthDate of a person
Complex Attributes

• Have composite and multivalued components in them.
• Multivalued attributes → represented within ‘{ }’
• Composite attributes → represented within ‘( )’
• Ex: {College Degrees (College, Year, Degree, Field)}
NULL Values
• A NULL value is used when an attribute value is not applicable or
unknown
Entity Types, Entity Sets, Keys, and Value Sets

• Entity Types
• A collection (or set) of entities that have the same attributes .
• Ex: STUDENT
Entity set or Entity collection

• The collection of all entities of a particular entity type in the database at any
point in time
Key Attributes of an Entity Type

• An attribute that is capable of identifying each entity uniquely.
• Ex: USN of a student
• The Name attribute is a key of the COMPANY entity type because no two
companies are allowed to have the same name
• For the PERSON entity type, a typical key attribute is Ssn


Value Sets (Domains) of Attributes

• The set of values that may be assigned to that attribute for each
individual entity

• If the range of ages allowed for employees is between 16 and 70, we
can specify the value set of the Age attribute of EMPLOYEE to be the
set of integer numbers between 16 and 70
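When such an entity type is later implemented as an SQL table, a value set of this kind could be enforced with a CHECK clause. This is only a sketch; the table name and rows are hypothetical.

```sql
-- Hypothetical table: the value set of Age is the integers from 16 to 70.
CREATE TABLE Employee_age_demo (
    Name VARCHAR(30) NOT NULL,
    Age  INT CHECK (Age BETWEEN 16 AND 70)
);

INSERT INTO Employee_age_demo VALUES ('John Smith', 55);   -- accepted
INSERT INTO Employee_age_demo VALUES ('Too Young', 12);    -- rejected: outside the value set
```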
Weak Entity Types

• Entity types that do not have key attributes of their own are called weak entity types
• In contrast, regular entity types that do have a key attribute are called strong entity
types
• Entities of a weak entity type are identified by being related to specific entities of
another entity type, in combination with one of their own attribute values
• We call this other entity type the identifying or owner entity type, and we call the
relationship type that relates a weak entity type to its owner the identifying relationship of
the weak entity type
• A weak entity type normally has a partial key, which is
the attribute that can uniquely identify weak entities
that are related to the same owner entity

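One common way to represent a weak entity type in SQL is sketched below, assuming a hypothetical DEPENDENT weak entity type owned by EMPLOYEE (neither table definition comes from these slides). The primary key combines the owner's key with the partial key.

```sql
-- Owner (strong) entity type, cut down to its key for this sketch.
CREATE TABLE EMPLOYEE (
    Ssn CHAR(9) NOT NULL,
    PRIMARY KEY (Ssn)
);

-- Weak entity type: Dependent_name alone does not identify a row.
CREATE TABLE DEPENDENT (
    Essn           CHAR(9)     NOT NULL,   -- key of the owner entity
    Dependent_name VARCHAR(15) NOT NULL,   -- partial key
    Relationship   VARCHAR(8),
    PRIMARY KEY (Essn, Dependent_name),
    -- A dependent cannot exist without its owner.
    FOREIGN KEY (Essn) REFERENCES EMPLOYEE(Ssn) ON DELETE CASCADE
);
```

Two employees may each have a dependent with the same name; the rows stay distinct because Essn differs.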
RELATIONSHIPs

• Relationship → an association among two or more entities
• Ex: Teacher teaches Student.
• Degree of a relationship: the number of entity types that
participate in the relationship.
• Binary relationship → an association between two entity types.
• Ternary relationship → an association among three entity types.
• Ex: In the WORKS_FOR relationship between EMPLOYEE and DEPARTMENT,
the employees e1, e3, and e6 work for department d1;
the employees e2 and e4 work for department d2; and the employees e5 and e7 work for
department d3

• Each entity type that participates in a relationship type
plays a particular role in the relationship
• For example, in the WORKS_FOR relationship type,
EMPLOYEE plays the role of employee or worker and
DEPARTMENT plays the role of department or employer
• When the same entity type participates more than once in a
relationship type, in different roles, the relationship type
is called a recursive relationship
• 1 → the employee entity plays the role of supervisor
• 2 → the employee entity plays the role of supervisee
(subordinate)
• Ex. e1 is the supervisor of e2
• e2 is the subordinate of e1
• e1 is the subordinate of e5
• e5 is the supervisor of e1
Structural Constraints

• Two main types of binary relationship constraints:


• Cardinality ratio
• Participation constraint
Cardinality Ratios for Binary Relationships

• The cardinality ratio for a binary relationship specifies the
maximum number of relationship instances that an entity can
participate in.

• The possible cardinality ratios for binary relationship types are
1:1, 1:N, N:1, and M:N
Participation Constraints

• Specifies the minimum number of relationship instances that each
entity can participate in; it is sometimes called the minimum
cardinality constraint

• There are two types of participation constraints: total and partial
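As a rough sketch, these structural constraints could be carried over into SQL tables as follows. The tables are cut down to their keys, and the assumption that every employee must work for a department (total participation of EMPLOYEE in WORKS_FOR) is made only for this illustration.

```sql
CREATE TABLE DEPARTMENT (
    Dnumber INT NOT NULL,
    PRIMARY KEY (Dnumber)
);

-- WORKS_FOR as N:1 from EMPLOYEE to DEPARTMENT: each employee row holds
-- one Dno, while many employee rows may share the same Dno.
CREATE TABLE EMPLOYEE (
    Ssn CHAR(9) NOT NULL,
    Dno INT     NOT NULL,    -- NOT NULL: total participation of EMPLOYEE
    PRIMARY KEY (Ssn),
    FOREIGN KEY (Dno) REFERENCES DEPARTMENT(Dnumber)
);

CREATE TABLE PROJECT (
    Pnumber INT NOT NULL,
    PRIMARY KEY (Pnumber)
);

-- An M:N relationship needs its own table: one row per (employee, project) pair.
CREATE TABLE WORKS_ON (
    Essn CHAR(9) NOT NULL,
    Pno  INT     NOT NULL,
    PRIMARY KEY (Essn, Pno),
    FOREIGN KEY (Essn) REFERENCES EMPLOYEE(Ssn),
    FOREIGN KEY (Pno)  REFERENCES PROJECT(Pnumber)
);
```

Partial participation corresponds to a foreign key column that allows NULL. Total participation on the "one" side (every department has at least one employee) cannot be stated with a simple column constraint and usually needs application logic or triggers.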


[Figure: ER diagram notation for a multivalued attribute and a composite attribute]
ER Diagram
• Consider the ER diagram shown in the figure for
part of a BANK database. Each bank can have
multiple branches, and each branch can have
multiple accounts and loans.

• A. List the non-weak (strong) entity types in the ER
diagram.
• Is there a weak entity type? If so, give its name,
partial key, and identifying relationship.

• Sol: Yes.
• Weak Entity Type:
• Partial Key: Branch No
• Identifying Relationship: Branches
• Constraints on the identifying relationship:
i) The weak entity set must have total participation in the
identifying relationship set, Branches
ii) The identifying relationship between the Bank and Bank
Branches must be one-to-many, which means that each bank
branch has only one Bank as its owner
These are the constraints on the partial key and identifying
relationship
• List the names of all relationship types and
specify the constraint on each participation of an
entity type in a relationship type.
SPECIALIZATION AND GENERALIZATION
• Specialization is the process of defining a set of subclasses of an entity
type; this entity type is called the superclass of the specialization.

• The set of subclasses that forms a specialization is defined on the basis
of some distinguishing characteristic of the entities in the superclass.

• For example, the set of subclasses {SECRETARY, ENGINEER,
TECHNICIAN} is a specialization of the superclass EMPLOYEE that
distinguishes among employee entities based on the job type of each
employee.
Generalization
• One can think of a reverse process of abstraction in which we suppress the differences
among several entity types, identify their common features, and generalize them into a
single superclass of which the original entity types are special subclasses.

• For example, consider the entity types CAR and TRUCK shown in the figure below. Because
they have several common attributes, they can be generalized into the entity type
VEHICLE, as shown in the figure.

• Both CAR and TRUCK are now subclasses of the generalized superclass VEHICLE. We
use the term generalization to refer to the process of defining a generalized entity type
from the given entity types.
