Databases-Draft 1

The document discusses the differences between files and databases. It explains that a file contains records about a single entity, like a student file containing records of individual students. A database is a collection of interrelated files managed by a database management system. This allows for integration of data across files, easier sharing of data, and more flexible reporting compared to separate files. The document gives examples of how databases are used by large organizations to manage large amounts of integrated data.

Uploaded by

malvin muthoni

FILE VS. DATABASES
• Let’s examine some basic principles about how data are
stored in computer systems.
– An entity is anything about which the organization wishes to
store data. At your college or university, one entity would be the
student.

STUDENTS

Student ID    Last Name  First Name  Phone Number  Birth Date
333-33-3333   Simpson    Alice       333-3333      10/11/84
111-11-1111   Sanders    Ned         444-4444      11/24/86
123-45-6789   Moore      Artie       555-5555      04/20/85

– Information about the attributes of an entity (e.g., the
student’s ID number and birth date) is stored in
fields.

– All the fields containing data about one entity (e.g.,
one student) form a record.
– In the STUDENTS table above, the last row is the record for
Artie Moore.
– A set of all related records forms a file (e.g., the
student file).
– If this university only had three students and five
fields for each student, then the STUDENTS table above
would be the entire file.
– A set of interrelated, centrally coordinated files forms
a database.

[Figure: a database made up of three interrelated files: the Student File, the Class File, and the Advisor File.]

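The vocabulary introduced so far (entity, field, record, file) can be sketched in a few lines of Python. This is only an illustration: the class name `Student` and the variable names are assumptions, while the data come from the STUDENTS table above.

```python
from dataclasses import dataclass, fields


@dataclass
class Student:
    """One record: all the fields that describe a single student (the entity)."""
    student_id: str
    last_name: str
    first_name: str
    phone_number: str
    birth_date: str


# The file: the set of all related records.
student_file = [
    Student("333-33-3333", "Simpson", "Alice", "333-3333", "10/11/84"),
    Student("111-11-1111", "Sanders", "Ned", "444-4444", "11/24/86"),
    Student("123-45-6789", "Moore", "Artie", "555-5555", "04/20/85"),
]

# The fields are the attributes stored for every record.
field_names = [f.name for f in fields(Student)]

# The record for Artie Moore is the third entry in the file.
artie = student_file[2]
print(field_names)
print(artie)
```

A database would hold several such files (students, classes, advisors) and coordinate them centrally, which a plain Python list does not do.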
• Database systems were developed to
address the problems associated with the
proliferation of master files.
– For years, each time a new information need
arose, companies created new files and
programs.
– The result: a significant increase in the
number of master files.

• This proliferation of master files created problems:
– Often the same information was stored in multiple master files.
– Made it more difficult to effectively integrate data and
obtain an organization-wide view of the data.
– Also, the same information may not have been consistent
between files.
• If a student changed his phone number, it may have
been updated in one master file but not another.

[Figure: three application programs, each with its own master file. The Enrollment Program uses Master File 1 (Facts A, B, C), the Financial Aid Program uses Master File 2 (Facts A, D, F), and the Grades Program uses a third master file (Facts A, B, F).]

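The inconsistency problem can be reproduced with two small in-memory "master files". This is a minimal sketch: the dictionary layout, the function names, and the new phone number 555-0000 are assumptions made for the illustration.

```python
# Two separate master files, each holding its own copy of the phone number.
enrollment_file = {"123-45-6789": {"name": "Artie Moore", "phone": "555-5555"}}
financial_aid_file = {"123-45-6789": {"name": "Artie Moore", "phone": "555-5555"}}


def update_phone(master_file, student_id, new_phone):
    """Change the phone number in one master file only."""
    master_file[student_id]["phone"] = new_phone


def inconsistent_ids(file_a, file_b):
    """List students whose phone numbers differ between the two files."""
    return [
        student_id
        for student_id in file_a
        if student_id in file_b
        and file_a[student_id]["phone"] != file_b[student_id]["phone"]
    ]


# Only the enrollment program learns about the change (555-0000 is made up).
update_phone(enrollment_file, "123-45-6789", "555-0000")

conflicts = inconsistent_ids(enrollment_file, financial_aid_file)
print(conflicts)
```

Because each program owns its own copy of the fact, nothing forces the second file to be updated.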
• A database is a set of inter-related, centrally
coordinated files.

[Figure: a single database holding Facts A through F. The Enrollment, Financial Aid, and Grades programs all reach it through the database management system.]

• The database approach treats data as an
organizational resource that should be used by
and managed for the entire organization, not
just a particular department.
• A database management system (DBMS) serves
as the interface between the database and the
various application programs.

• The combination of the database, the DBMS, and the
application programs that access the database is
referred to as the database system.

• The person responsible for the database is the
database administrator.
• As technology improves, many large companies are
developing very large databases called data
warehouses.
– Hewlett-Packard is replacing 784 databases with
a single, company-wide database.

IMPORTANCE AND ADVANTAGES OF
DATABASE SYSTEMS
• Database technology is everywhere.
– Virtually all mainframe computer sites use
database technology.
– Use of databases with PCs is growing also.
– Cloud storage also relies on databases.

• Database technology provides the
following benefits to organizations:
– Data integration
• Achieved by combining master files into larger
pools of data accessible by many programs.
– Data sharing
• It’s easier to share data that’s integrated. The FBI
is planning an 8-year, $400 million database project
to make data more available to agency users.
– Reporting flexibility
• Reports can be revised easily and
generated as needed.
• The database can easily be browsed to
research problems or obtain detailed
information.
– Minimal data redundancy and inconsistencies
• Because data items are usually stored only once.
– Data independence
• Data items are independent of the programs that
use them.
• Consequently, a data item can be changed
without changing the program and vice versa.
• Makes programming easier and simplifies data
management.
– Central management of data
• Data management is more efficient because the
database administrator is responsible for
coordinating, controlling, and managing data.
– Cross-functional analysis
• Relationships can be explicitly defined and used
in the preparation of management reports.
• EXAMPLE: Relationship between selling costs and
promotional campaigns.

• The importance of good data:
– Bad data leads to:
• Bad decisions
• Embarrassment
• Angry users
– The Data Warehousing Institute estimates that
dirty data costs $600 billion per year in
unnecessary postage, marketing costs, and
lost customer credibility.

DATABASE SYSTEMS

• Logical and physical views of data


– In file-oriented systems, programmers must
know the physical location and layout of
records used by a program.
• They must reference the location, length, and
format of every field they utilize.
• When data is used from several files, this process
becomes more complex.

• Database systems overcome this problem
by separating the storage and use of data
elements.
– Two separate views of the data are provided:
• Logical view
– How the user or programmer conceptually
organizes and understands the data.
• Physical view
– How and where the data are physically
arranged and stored.
– Separating these views facilitates
application development, because
programmers can focus on coding the
logic and not be concerned with storage
details.

[Figure: two users' logical views of the same database, titled "Enrollment by Class" and "Scholarship Distribution", one of them drawn as a pie chart (Fr. 5%, Soph. 24%, Jr. 38%, Sr. 33%). Both views sit above the DBMS, the operating system, and the database.]

• The DBMS translates users’ logical views
into instructions as to which data should be
retrieved from the database.
• The operating system translates DBMS
requests into instructions to physically retrieve
data from various disks.

• The DBMS handles the link between the
physical and logical views of the data.
– Allows the user to access, query, and update
data without reference to how or where it is
physically stored.
– The user only needs to define the logical data
requirements.

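The contrast between file-oriented access and access through a DBMS can be sketched as follows. The fixed-width record layout, the function `read_field`, and the table name `students` are assumptions made for the illustration; SQLite (through Python's standard `sqlite3` module) stands in for the DBMS.

```python
import sqlite3

# File-oriented access: the program must know where each field starts and
# how long it is. This fixed-width layout is an assumption for the sketch.
LAYOUT = {
    "student_id": (0, 11),
    "last_name": (11, 10),
    "first_name": (21, 10),
    "phone": (31, 8),
}
RECORD = "123-45-6789" + "Moore".ljust(10) + "Artie".ljust(10) + "555-5555"


def read_field(record, name):
    """Slice one field out of a fixed-width record using its offset and length."""
    start, length = LAYOUT[name]
    return record[start:start + length].strip()


phone_from_file = read_field(RECORD, "phone")

# Database access: the program names the data it wants (the logical view)
# and the DBMS works out where and how it is physically stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students "
    "(student_id TEXT, last_name TEXT, first_name TEXT, phone TEXT)"
)
conn.execute(
    "INSERT INTO students VALUES ('123-45-6789', 'Moore', 'Artie', '555-5555')"
)
phone_from_db = conn.execute(
    "SELECT phone FROM students WHERE student_id = ?", ("123-45-6789",)
).fetchone()[0]

print(phone_from_file, phone_from_db)
```

If the record layout changes, `LAYOUT` and every program that copies it must change too; the query in the second half would keep working unchanged.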
• Separating the logical and physical views of data
also means users can change their
conceptualizations of the data relationships
without making changes in the physical storage.
• The database administrator can also change the
physical storage of the data without affecting
users or application programs.

• Schemas
– A schema describes the logical structure of a
database.
– There are three levels of schema.
• Conceptual level
– The organization-wide view of the entire
database, i.e., the big picture.
– Lists all data elements and the relationships
between them.
• External level
– A set of individual user views of portions of
the database, i.e., how each user sees the
portion of the system with which he
interacts.
– These individual views are referred to as
subschemas.
• Internal level
– A low-level view of the database.
– It describes how the data are actually
stored and accessed, including:
• Record layouts
• Definitions
• Addresses
• Indexes

[Figure: the three schema levels and the mappings between them]

External level: Subschema (User A), Subschema (User B), Subschema (User C)
Sample user view: Smith . . . A, Jones . . . B, Arnold . . . D

Mapping external-level views to conceptual-level schema

Conceptual level: Classes, Enroll, Student, Cash Receipt

Mapping conceptual-level items to internal-level descriptions

Internal level:
Student Record
Student No.    character [9]
Student Name   character [26]
SAT Score      integer [2], non-null, index=itemx
Class Record
Class Name     character [9]
Dept No.       integer [4], non-null, index=itemx
Course No.     integer [4], non-null, index=itemx

The bidirectional arrows in the figure represent mappings between the schemas.

• The DBMS uses the mappings to translate
a request by a user or program for data
(expressed in logical names and
relationships) into the indexes and
addresses needed to physically access
the data.

• An employee’s access to data should be
limited to the subschema of data that is
relevant to the performance of his job.

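One common way to realize an external-level subschema is a view. The sketch below uses SQLite through Python's `sqlite3` module; the view name `directory_view` and the idea of a directory clerk are assumptions. SQLite has no user accounts, so this only shows the shape of a subschema; a multi-user DBMS would additionally grant each employee access to the view rather than to the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full STUDENTS table.
conn.execute(
    """CREATE TABLE students (
           student_id TEXT PRIMARY KEY,
           last_name  TEXT,
           first_name TEXT,
           phone      TEXT,
           birth_date TEXT)"""
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?, ?)",
    [
        ("333-33-3333", "Simpson", "Alice", "333-3333", "10/11/84"),
        ("111-11-1111", "Sanders", "Ned", "444-4444", "11/24/86"),
        ("123-45-6789", "Moore", "Artie", "555-5555", "04/20/85"),
    ],
)

# External level: a hypothetical subschema for a directory clerk, who needs
# names and phone numbers but not student IDs or birth dates.
conn.execute(
    """CREATE VIEW directory_view AS
           SELECT last_name, first_name, phone FROM students"""
)

cursor = conn.execute("SELECT * FROM directory_view ORDER BY last_name")
view_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()
print(view_columns)
print(directory_rows)
```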

• The data dictionary


– A key component of a DBMS is the data
dictionary.
• Contains information about the structure of the
database.
• For each data element, there is a corresponding
record in the data dictionary describing that
element.


• Information provided for each element includes:


– A description or explanation of the element.
– The records in which it is contained.
– Its source.
– The length and type of the field in which it is stored.
– The programs in which it is used.
– The outputs in which it is contained.
– The authorized users of the element.
– Other names for the element.

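Most DBMSs expose part of their data dictionary through catalog queries. As a rough sketch, SQLite's `PRAGMA table_info` reports the name, type, and constraints of each column; the function `describe` and the keys of the dictionaries it returns are assumptions. A full data dictionary would also record sources, authorized users, and the programs and outputs that use each element, which SQLite does not track.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE students (
           student_id TEXT NOT NULL PRIMARY KEY,
           last_name  TEXT NOT NULL,
           phone      TEXT)"""
)


def describe(connection, table):
    """Return one dictionary entry per data element (column) of a table.

    The table name is interpolated directly, so it must come from trusted code.
    """
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return [
        {
            "element": name,
            "type": column_type,
            "required": bool(not_null),
            "primary_key": bool(pk),
        }
        for _cid, name, column_type, not_null, _default, pk in rows
    ]


entries = describe(conn, "students")
for entry in entries:
    print(entry)
```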
• The DBMS usually maintains the data dictionary.
– It is often one of the first applications of a newly
implemented database system.
– Inputs to the dictionary include:
• Records of new or deleted data elements.
• Changes in names, descriptions, or uses of existing
elements.
– Outputs include:
• Reports that are useful to programmers, database designers,
and IS users in:
– Designing and implementing the system.
– Documenting the system.
– Creating an audit trail.


• DBMS Languages
– Every DBMS must provide a means of
performing the three basic functions of:
• Creating a database
• Changing a database
• Querying a database


• Creating a database:
– The set of commands used to create the
database is known as data definition
language (DDL). DDL is used to:
• Build the data dictionary
• Initialize or create the database
• Describe the logical views for each individual user
or programmer
• Specify any limitations or constraints on security
imposed on database records or fields

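In SQL-based systems, DDL takes the form of statements such as CREATE TABLE. The sketch below runs two of them in SQLite through Python's `sqlite3` module; the column names are assumptions modeled on the STUDENTS and COURSES examples in this deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the tables, their fields, and the constraints on those fields.
conn.executescript(
    """
    CREATE TABLE students (
        student_id TEXT NOT NULL PRIMARY KEY,
        last_name  TEXT NOT NULL,
        first_name TEXT NOT NULL,
        phone      TEXT
    );
    CREATE TABLE courses (
        course_id INTEGER PRIMARY KEY,
        course    TEXT NOT NULL,
        section   INTEGER NOT NULL
    );
    """
)

# The DBMS records the new definitions in its own catalog.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(table_names)
```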

• Changing a database
– The set of commands used to change the
database is known as data manipulation
language (DML). DML is used for
maintaining the data including:
• Updating data
• Inserting data
• Deleting portions of the database

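In SQL, the DML operations listed above correspond to INSERT, UPDATE, and DELETE. A minimal sketch in SQLite follows; the three-column table and the new phone number 333-0000 are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (student_id TEXT PRIMARY KEY, last_name TEXT, phone TEXT)"
)

# Inserting data.
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        ("333-33-3333", "Simpson", "333-3333"),
        ("111-11-1111", "Sanders", "444-4444"),
        ("123-45-6789", "Moore", "555-5555"),
    ],
)

# Updating data (333-0000 is a made-up new number).
conn.execute(
    "UPDATE students SET phone = ? WHERE student_id = ?",
    ("333-0000", "333-33-3333"),
)

# Deleting a portion of the database.
conn.execute("DELETE FROM students WHERE student_id = ?", ("111-11-1111",))

remaining = conn.execute(
    "SELECT student_id, phone FROM students ORDER BY student_id"
).fetchall()
print(remaining)
```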

• Querying a database:
– The set of commands used to query the database is
known as data query language (DQL). DQL is used
to interrogate the database, including:
• Retrieving records
• Sorting records
• Ordering records
• Presenting subsets of the database
– The DQL usually contains easy-to-use, powerful
commands that enable users to satisfy their own
information needs.

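In SQL, querying is done with SELECT. The sketch below retrieves a subset of the COURSES data used later in this deck and sorts it; the column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE courses (
           course_id INTEGER PRIMARY KEY,
           course    TEXT,
           section   INTEGER,
           day       TEXT,
           time      TEXT)"""
)
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?, ?, ?)",
    [
        (1234, "ACCT-3603", 1, "MWF", "8:30"),
        (1235, "ACCT-3603", 2, "TR", "9:30"),
        (1236, "MGMT-2103", 1, "MW", "8:30"),
    ],
)

# Retrieve a subset of the records (WHERE) and sort it (ORDER BY).
acct_sections = conn.execute(
    "SELECT course_id, section, day FROM courses "
    "WHERE course = ? ORDER BY section DESC",
    ("ACCT-3603",),
).fetchall()
print(acct_sections)
```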

• Report Writer
– Many DBMS packages also include a report writer, a
language that simplifies the creation of reports.
– Users typically specify:
• What elements they want printed
• How the report should be formatted
– The report writer then:
• Searches the database
• Extracts specified data
• Prints them out according to specified format

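The three steps a report writer performs can be imitated in a few lines. This is a toy sketch, not how any particular DBMS package implements its report writer: the function `write_report` and its parameters are assumptions, and the table and column names are interpolated into the query, so they must come from trusted code.

```python
import sqlite3


def write_report(conn, table, columns, width=12):
    """Search the table, extract the requested columns, and format them."""
    header = "".join(column.ljust(width) for column in columns).rstrip()
    query = f"SELECT {', '.join(columns)} FROM {table} ORDER BY 1"
    lines = [header, "-" * (width * len(columns))]
    for row in conn.execute(query):
        lines.append("".join(str(value).ljust(width) for value in row).rstrip())
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id TEXT, last_name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        ("333-33-3333", "Simpson", "333-3333"),
        ("111-11-1111", "Sanders", "444-4444"),
        ("123-45-6789", "Moore", "555-5555"),
    ],
)

# The user specifies which elements to print; the format here is fixed-width.
report = write_report(conn, "students", ["last_name", "phone"])
print(report)
```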
• Users typically have access to both DQL and
the report writer.
• Access to DDL and DML is typically restricted
to employees with administrative and
programming responsibilities.

RELATIONAL DATABASES

• A DBMS is characterized by the type of
logical data model on which it is based.
– A data model is an abstract representation of
the contents of a database.
– Most new DBMSs are called relational
databases because they use the relational
model developed by E. F. Codd in 1970.

• The relational data model represents
everything in the database as being stored
in the form of tables (also known as relations).

STUDENTS (each table is a relation)
Student ID    Last Name  First Name  Phone No.
333-33-3333   Simpson    Alice       333-3333
111-11-1111   Sanders    Ned         444-4444
123-45-6789   Moore      Artie       555-5555

COURSES
Course ID  Course     Section  Day  Time
1234       ACCT-3603  1        MWF  8:30
1235       ACCT-3603  2        TR   9:30
1236       MGMT-2103  1        MW   8:30

STUDENT x COURSE
SCID            Student ID   Course
333333333-1234  333-33-3333  1234
333333333-1236  333-33-3333  1236
111111111-1235  111-11-1111  1235
111111111-1236  111-11-1111  1236

• This model only describes how the data
appear in the conceptual- and external-level
schemas.
• The data are physically stored according
to the description in the internal-level
schema.

• Each row is called a tuple.
• Each row contains data about a specific occurrence of
the type of entity in the table.
• Each column in a table contains information about a
specific attribute of the entity.
• A primary key is the attribute or combination of
attributes that uniquely identifies a specific row in a table.
• In some tables, two or more attributes may be joined to
form the primary key. In the STUDENT x COURSE table above,
SCID combines the student ID and the course ID.

STUDENTS
Student ID    Last Name  First Name  Phone No.  Advisor No.
333-33-3333   Simpson    Alice       333-3333   1418
111-11-1111   Sanders    Ned         444-4444   1418
123-45-6789   Moore      Artie       555-5555   1503

ADVISORS
Advisor No.  Last Name  First Name  Office No.
1418         Howard     Glen        420
1419         Melton     Amy         316
1503         Zhang      Xi          202
1506         Radowski   J.D.        203

• A foreign key is an attribute in one table that is a primary key in
another table.
• Foreign keys are used to link tables together.
• Other non-key attributes in each table store important
information about the entity.

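How a foreign key links two tables can be seen in a join. The sketch below loads the STUDENTS and ADVISORS data into SQLite and follows Advisor No. from each student to the matching advisor; the column names are assumptions and some columns are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE advisors (
        advisor_no INTEGER PRIMARY KEY,
        last_name  TEXT,
        office_no  INTEGER
    );
    CREATE TABLE students (
        student_id TEXT PRIMARY KEY,
        last_name  TEXT,
        advisor_no INTEGER REFERENCES advisors(advisor_no)  -- foreign key
    );
    INSERT INTO advisors VALUES
        (1418, 'Howard', 420), (1419, 'Melton', 316),
        (1503, 'Zhang', 202), (1506, 'Radowski', 203);
    INSERT INTO students VALUES
        ('333-33-3333', 'Simpson', 1418),
        ('111-11-1111', 'Sanders', 1418),
        ('123-45-6789', 'Moore', 1503);
    """
)

# The join matches each student's foreign key to the advisor's primary key.
pairs = conn.execute(
    """SELECT s.last_name, a.last_name, a.office_no
       FROM students AS s
       JOIN advisors AS a ON s.advisor_no = a.advisor_no
       ORDER BY s.last_name"""
).fetchall()
print(pairs)
```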
• Alternatives for storing data


– One possible alternate approach would be to
store all data in one uniform table.
– For example, instead of separate tables for
students and classes, we could store all data
in one table and have a separate line for each
student x class combination.

Student ID    Last Name  First Name  Phone No.  Course No.  Section  Day  Time
333-33-3333   Simpson    Alice       333-3333   ACCT-3603   1        M    9:00 AM
333-33-3333   Simpson    Alice       333-3333   FIN-3213    3        Th   11:00 AM
333-33-3333   Simpson    Alice       333-3333   MGMT-3021   11       Th   12:00 PM
111-11-1111   Sanders    Ned         444-4444   ACCT-3433   2        T    10:00 AM
111-11-1111   Sanders    Ned         444-4444   MGMT-3021   5        W    8:00 AM
111-11-1111   Sanders    Ned         444-4444   ANSI-1422   7        F    9:00 AM
123-45-6789   Moore      Artie       555-5555   ACCT-3433   2        T    10:00 AM
123-45-6789   Moore      Artie       555-5555   FIN-3213    3        Th   11:00 AM

• Using the suggested approach, a student taking three classes
would need three rows in the table.
• In the above, simplified example, a number of problems arise.

• Suppose Alice Simpson changes her phone number. You need to
make the change in three places. If you fail to change it in all three
places or change it incorrectly in one place, then the records for
Alice will be inconsistent.
• This problem is referred to as an update anomaly.

• What happens if you have a new student to add, but he hasn’t
signed up for any courses yet?
• Or what if there is a new class to add, but there are no students
enrolled in it yet? In either case, the record will be partially blank.
• This problem is referred to as an insert anomaly.


• If Ned withdraws from all his classes and you eliminate all three of
his rows from the table, then you will no longer have a record of
Ned. If Ned is planning to take classes next semester, then you
probably didn’t really want to delete all records of him.
• This problem is referred to as a delete anomaly.

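The three anomalies can be reproduced on a small in-memory copy of the single-table design. This is a sketch under stated assumptions: the list-of-dictionaries layout is a simplification, and the new phone number 333-0000 and the student 999-99-9999 are made up for the illustration.

```python
# Each row repeats the student's details for every class (single-table design).
flat = [
    {"student_id": "333-33-3333", "name": "Alice Simpson", "phone": "333-3333", "course": "ACCT-3603"},
    {"student_id": "333-33-3333", "name": "Alice Simpson", "phone": "333-3333", "course": "FIN-3213"},
    {"student_id": "333-33-3333", "name": "Alice Simpson", "phone": "333-3333", "course": "MGMT-3021"},
    {"student_id": "111-11-1111", "name": "Ned Sanders", "phone": "444-4444", "course": "ACCT-3433"},
    {"student_id": "111-11-1111", "name": "Ned Sanders", "phone": "444-4444", "course": "MGMT-3021"},
    {"student_id": "111-11-1111", "name": "Ned Sanders", "phone": "444-4444", "course": "ANSI-1422"},
]

# Update anomaly: the new (made-up) number reaches only one of Alice's rows,
# so the table now holds two different phone numbers for her.
flat[0]["phone"] = "333-0000"
alice_phones = {row["phone"] for row in flat if row["student_id"] == "333-33-3333"}

# Insert anomaly: a (hypothetical) new student with no classes yet can only
# be stored as a partially blank row.
flat.append(
    {"student_id": "999-99-9999", "name": "New Student", "phone": "000-0000", "course": None}
)
blank_course_rows = [row for row in flat if row["course"] is None]

# Delete anomaly: removing all of Ned's class rows removes every trace of Ned.
flat = [row for row in flat if row["student_id"] != "111-11-1111"]
ned_known = any(row["student_id"] == "111-11-1111" for row in flat)

print(alice_phones, len(blank_course_rows), ned_known)
```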

• Alternatives for storing data


– Another possible approach would be to store
each student in one row of the table and
create multiple columns to accommodate
each class that he is taking.

Student ID    Last Name  First Name  Phone No.  Class 1    Class 2    Class 3    Class 4
333-33-3333   Simpson    Alice       333-3333   ACCT-3603  FIN-3213   MGMT-3021
111-11-1111   Sanders    Ned         444-4444   ACCT-3433  MGMT-3021  ANSI-1422
123-45-6789   Moore      Artie       555-5555   ACCT-3433  FIN-3213

• This approach is also fraught with problems:
– How many classes should you allow in building the table?
– The above table is quite simplified. In reality, you might need to
allow for 20 or more classes (assuming a student could take
many 1-hour classes). Also, more information than just the
course number would be stored for each class. There would be
a great deal of wasted space for all the students taking fewer
than the maximum possible number of classes.
– Also, if you wanted a list of every student taking MGMT-3021,
notice that you would have to search multiple attributes.

• The solution to the preceding problems is to use a set of
tables in a relational database, such as the STUDENTS, COURSES,
and STUDENT x COURSE tables shown earlier.
• Each entity is stored in a separate table, and separate
tables or foreign keys can be used to link the entities together.

• Basic requirements of a relational database


– Every column in a row must be single valued.
• In other words, every cell can have one and only
one value.
• In the student table, you couldn’t have an attribute
named “Phone Number” if a student could have
multiple phone numbers.
• There might be an attribute named “local phone
number” and an attribute named “permanent
phone number.”
• You could not have an attribute named “Class” in
the student table, because a student could take
multiple classes.
• Basic requirements of a relational database
– The primary key cannot be null.
• The primary key uniquely identifies a specific row
in the table, so it cannot be null, and it must be
unique for every record.
• This rule is referred to as the entity integrity rule.

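A DBMS can enforce the entity integrity rule itself. The sketch below declares a primary key in SQLite and tries to violate the rule; the helper `try_insert` is an assumption. NOT NULL is spelled out because SQLite, unlike most DBMSs, otherwise accepts NULL in a non-integer PRIMARY KEY column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (student_id TEXT NOT NULL PRIMARY KEY, last_name TEXT)"
)
conn.execute("INSERT INTO students VALUES ('333-33-3333', 'Simpson')")


def try_insert(student_id, last_name):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO students VALUES (?, ?)", (student_id, last_name))
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


duplicate_result = try_insert("333-33-3333", "Sanders")  # key already in use
null_result = try_insert(None, "Moore")                  # null primary key
new_result = try_insert("123-45-6789", "Moore")          # unique, non-null key

print(duplicate_result, null_result, new_result)
```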
• Note that within each of the STUDENTS, COURSES, and STUDENT x COURSE
tables shown earlier, there are no duplicate primary keys and no null
primary keys.
• This is consistent with the entity integrity rule.

• Basic requirements of a relational database
– A foreign key must either be null or
correspond to the value of a primary key in
another table.
• This rule is referred to as the referential integrity
rule.
• The rule is necessary because foreign keys are
used to link rows in one table to rows in another
table.

• In the STUDENTS and ADVISORS tables shown earlier, Advisor No. is a
foreign key in the STUDENTS table. Every instance of Advisor No. in the
STUDENTS table either matches an instance of the primary key in the
ADVISORS table or is null.

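The referential integrity rule can also be enforced by the DBMS. The sketch below uses SQLite, which checks foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection. The helper `try_insert` and the advisor number 9999 are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE advisors (advisor_no INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE students (
        student_id TEXT NOT NULL PRIMARY KEY,
        last_name  TEXT,
        advisor_no INTEGER REFERENCES advisors(advisor_no)
    );
    INSERT INTO advisors VALUES (1418, 'Howard'), (1503, 'Zhang');
    """
)


def try_insert(student_id, last_name, advisor_no):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute(
            "INSERT INTO students VALUES (?, ?, ?)",
            (student_id, last_name, advisor_no),
        )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


matching = try_insert("333-33-3333", "Simpson", 1418)  # matches a primary key
null_fk = try_insert("111-11-1111", "Sanders", None)   # null foreign key is allowed
dangling = try_insert("123-45-6789", "Moore", 9999)    # 9999 is not an advisor

print(matching, null_fk, dangling)
```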
• Basic requirements of a relational database
– All non-key attributes in a table should
describe a characteristic of the object
identified by the primary key.
• Could nationality be a non-key attribute in the
student table?
• Could advisor’s nationality be a non-key attribute
in the student table?

• The preceding four constraints produce a
well-structured (normalized) database in which:
– Data are consistent.
– Redundancy is minimized and controlled.
• In a normalized database, attributes appear
multiple times only when they function as foreign
keys.
• The referential integrity rule ensures there will
be no update anomaly problem with foreign
keys.

• An important feature is that data about various things of
interest (entities) are stored in separate tables.
– Makes it easier to add new data to the system.
• You add a new student by adding a row to the student
table.
• You add a new course by adding a row to the course
table.
• Means you can add a student even if he hasn’t signed
up for any courses.
• And you can add a class even if no students are yet
enrolled in it.
– Makes it easy to avoid the insert anomaly.
• Space is also used more efficiently than in the other
schemes. There should be no blank rows or attributes.

• Using the STUDENTS, COURSES, and STUDENT x COURSE tables shown earlier:
– Add a student by adding a row to the STUDENTS table. This leaves no
blank spaces.
– Add a course by adding a row to the COURSES table. This leaves no
blank spaces.
– When a particular student enrolls for a particular course, add that
information to the STUDENT x COURSE table.

• Deletion of a class for a student would
cause the elimination of one record in the
student x class table.
– The student still exists in the student table.
– The class still exists in the class table.
– Avoids the delete anomaly.

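With separate tables, the insert and delete anomalies do not arise. The sketch below is an illustration in SQLite with assumed table and column names. It also uses a composite primary key (student ID plus course ID) in the linking table, where the slides use a single SCID attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE students (student_id TEXT NOT NULL PRIMARY KEY, last_name TEXT);
    CREATE TABLE courses  (course_id INTEGER PRIMARY KEY, course TEXT);
    CREATE TABLE student_course (
        student_id TEXT    NOT NULL REFERENCES students(student_id),
        course_id  INTEGER NOT NULL REFERENCES courses(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES ('111-11-1111', 'Sanders');
    INSERT INTO courses VALUES (1235, 'ACCT-3603');
    INSERT INTO student_course VALUES ('111-11-1111', 1235);
    """
)

# Insert: a student who has not enrolled in anything needs only a students row.
conn.execute("INSERT INTO students VALUES ('123-45-6789', 'Moore')")

# Delete: Ned drops his only class, so only the linking row is removed.
conn.execute(
    "DELETE FROM student_course WHERE student_id = ? AND course_id = ?",
    ("111-11-1111", 1235),
)


def count(table):
    """Count the rows of a table (the name must come from trusted code)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


counts = {table: count(table) for table in ("students", "courses", "student_course")}
print(counts)
```

Ned and ACCT-3603 both survive the deletion, and the new student is stored without any blank course fields.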
• If Ned Sanders drops ACCT-3603, remove Ned’s class from the
STUDENT x COURSE table.
– Ned still exists in the student table.
– Even if Ned was the only student in the class, ACCT-3603 still
exists in the course table.
• There are two basic ways to design
well-structured relational databases.
– Normalization
– Semantic data modeling


• Normalization
– Starts with the assumption that everything is
initially stored in one large table.
– A set of rules is followed to decompose that
initial table into a set of normalized tables.
– Objective is to produce a set of tables in third-
normal form (3NF) because such tables are
free of update, insert, and delete anomalies.

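The idea of decomposing one large table can be sketched informally in Python. This is not the formal normalization procedure (which works through functional dependencies and the successive normal forms); it only splits a simplified flat table by entity. The reduced column set, and the simplification that the meeting day depends on the course number alone, are assumptions.

```python
# One large table: student details repeated on every enrollment row.
# Columns: student_id, last_name, course_no, day
flat_rows = [
    ("333-33-3333", "Simpson", "ACCT-3603", "M"),
    ("333-33-3333", "Simpson", "FIN-3213", "Th"),
    ("111-11-1111", "Sanders", "ACCT-3433", "T"),
    ("123-45-6789", "Moore", "ACCT-3433", "T"),
]

students = {}        # student_id -> last_name (facts about the student)
courses = {}         # course_no  -> day       (facts about the course)
enrollments = set()  # (student_id, course_no) pairs linking the two

for student_id, last_name, course_no, day in flat_rows:
    students[student_id] = last_name  # stored once per student
    courses[course_no] = day          # stored once per course
    enrollments.add((student_id, course_no))

print(students)
print(courses)
print(sorted(enrollments))
```

Each fact now lives in exactly one place, which is what removes the update, insert, and delete anomalies.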

• Semantic data modeling
– The database designer uses knowledge about how
business processes typically work and the
information needs associated with
transaction processing to draw a graphical
picture of what should be included in the
database.
– The resulting graphic is used to create a set
of relational tables that are in 3NF.

• Advantages over simply following
normalization rules:
– Semantic data modeling uses the designer’s
knowledge about business processes and
practices; it therefore facilitates efficient
design of transaction processing databases.
– The resulting graphical model explicitly
represents information about the
organization’s business processes and
policies and facilitates communication with
intended users.


• Creating relational database queries


– Databases store data for people and
organizations.
– To retrieve the data, you query the database
and its tables.
