CSI 11 Tim
DATABASES
FAL2024
CONTENT
•Introduction
•Database architecture
•Database model
•Database design
Objectives
Introduction: Data and Database
Data
• Data is a collection of distinct small units of information.
o Person: name, age, email, dob
o Online purchase: order number, description, order quantity, date, customer’s email
• Data is crucial for both individuals & businesses.
Databases
• Informally, a database is an organized collection of data, structured so that it can be
accessed and managed easily.
Introduction: Flat File Systems
Example: In a university, each department might have its own set of files:
• the Record office kept a file about the student information and their grades,
• the Financial Aid office kept its own file about students that needed financial aid
to continue their education,
• the Scheduling office kept the name of the professors and the courses they were
teaching,
• the Payroll department kept its own file about the whole staff (including
professors).
Today, however, all these flat files can be combined into a single entity: the
database for the whole organisation (the university, in the example above).
Advantages of databases
• Less redundancy
• Inconsistency avoidance
• Efficiency
• Data integrity
• Confidentiality
Database Management Systems (DBMS)
A database management system (DBMS) defines, creates, and maintains a database. The
DBMS also allows controlled access to the data in the database.
• Hardware: the physical computer system that allows access to data.
• Software: the actual program that allows users to access, maintain,
and update data.
• Data: data in a database is stored physically on the storage devices. In a database, data
is a separate entity from the software that accesses it.
• Procedures: general rules and instructions that help to design
the database and to use a DBMS.
• Users: (1) the people who control and manage the databases and
perform different types of operations on the databases in
the database management system; and (2) application programs.
• Database access language: a language used to write commands to
access, update, and delete data stored in a database.
Figure 11.1. DBMS components
The 3-schema DBMS Architecture
1. Introduction
2. The hierarchical model
• In the hierarchical model, data is organized as an inverted tree. Each entity has only
one parent but can have several children. At the top of the hierarchy, there is one
entity, which is called the root.
• As the hierarchical model is obsolete, no further discussion of this model is necessary.
3. The network model
• In the network model, the entities are organized in a graph, in which some
entities can be accessed through several paths. There is no hierarchy. This
model is also obsolete and needs no further discussion.
4. The relational model
RELATION
A relation, in appearance, is a two-dimensional table. The relational database management system
(RDBMS) organizes the data so that its external view is a set of relations, or tables. This does not
mean that the data is stored as tables: the physical storage of the data is independent of the way in
which the data is logically organized.
• In a relational database we can define several operations that create new relations based on
existing ones. This section covers nine operations: insert, delete, update, select, project, join, union,
intersection, and difference.
• Structured Query Language (SQL) is the language standardized by the American National
Standards Institute (ANSI) and the International Organization for Standardization (ISO) for
use on relational databases.
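Figure: parts of an SQL statement. In the statement
UPDATE country
SET population = population + 1
the first line is the UPDATE clause, the second line is the SET clause, and
population + 1 is an expression.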
The insert operation is a unary operation; that is, it is applied to a single relation.
The operation inserts a new tuple into the relation. The insert operation uses the
following format:
INSERT INTO RELATION-NAME
VALUES (value1, value2, …)
The delete operation is a unary operation. The operation deletes the tuples selected
by a criterion from the relation. The delete operation uses the following format:
DELETE FROM RELATION-NAME
WHERE criteria
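The insert and delete operations can be tried out with a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The Courses rows are taken from the tables in these slides; the course CIS20 is a made-up value used only for illustration.

```python
import sqlite3

# In-memory database; Courses mirrors the relation used in these slides.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Courses (course_id TEXT PRIMARY KEY, course_name TEXT, unit INTEGER)"
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [("CIS15", "Intro to C", 5), ("CIS17", "Intro to Java", 5), ("CIS19", "UNIX", 4)],
)

# Insert: add one new tuple (CIS20 is a hypothetical course).
conn.execute("INSERT INTO Courses VALUES ('CIS20', 'Intro to SQL', 4)")

# Delete: remove every tuple that satisfies the criterion.
conn.execute("DELETE FROM Courses WHERE course_id = 'CIS19'")

course_ids = [row[0] for row in conn.execute("SELECT course_id FROM Courses ORDER BY course_id")]
print(course_ids)
```

Both statements act on a single relation, which is what makes them unary operations.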
The update operation is a unary operation that is applied to a single relation. The
operation changes the value of some attributes of a tuple. The update operation uses
the following format:
UPDATE RELATION-NAME
SET attribute1 = value1, attribute2 = value2, …
WHERE criteria
Courses (before)
course_id   course_name     unit
CIS15       Intro to C      5
CIS17       Intro to Java   5
CIS19       UNIX            4
CIS51       Networking      5
CIS52       TCP/IP          6

UPDATE Courses SET unit=6
WHERE course_id='CIS51';

Courses (after)
course_id   course_name     unit
CIS15       Intro to C      5
CIS17       Intro to Java   5
CIS19       UNIX            4
CIS51       Networking      6
CIS52       TCP/IP          6
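The update example can be reproduced as a small runnable sketch, assuming Python's built-in sqlite3 module stands in for the DBMS. The table contents are the Courses rows from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Courses (course_id TEXT PRIMARY KEY, course_name TEXT, unit INTEGER)"
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [
        ("CIS15", "Intro to C", 5),
        ("CIS17", "Intro to Java", 5),
        ("CIS19", "UNIX", 4),
        ("CIS51", "Networking", 5),
        ("CIS52", "TCP/IP", 6),
    ],
)

# Update: change the unit attribute of the tuples matching the criterion.
cursor = conn.execute("UPDATE Courses SET unit = 6 WHERE course_id = 'CIS51'")
changed_rows = cursor.rowcount  # number of tuples the statement modified

units = dict(conn.execute("SELECT course_id, unit FROM Courses"))
print(changed_rows, units["CIS51"])
```

Only the tuple selected by the WHERE criterion changes; all other tuples keep their values.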
The select operation is a unary operation. The tuples (rows) in the resulting relation
are a subset of the tuples in the original relation.

Courses
course_id   course_name     unit
CIS15       Intro to C      5
CIS17       Intro to Java   5
CIS19       UNIX            4
CIS51       Networking      5
CIS52       TCP/IP          6

SELECT * FROM Courses
WHERE unit=5;

Result
course_id   course_name     unit
CIS15       Intro to C      5
CIS17       Intro to Java   5
CIS51       Networking      5
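The select example can be checked with a short sketch, again assuming Python's built-in sqlite3 module as a stand-in DBMS. The ORDER BY clause is an addition that makes the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Courses (course_id TEXT PRIMARY KEY, course_name TEXT, unit INTEGER)"
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [
        ("CIS15", "Intro to C", 5),
        ("CIS17", "Intro to Java", 5),
        ("CIS19", "UNIX", 4),
        ("CIS51", "Networking", 5),
        ("CIS52", "TCP/IP", 6),
    ],
)

# Select: keep all attributes, but only the tuples that satisfy the criterion.
five_unit_courses = conn.execute(
    "SELECT * FROM Courses WHERE unit = 5 ORDER BY course_id"
).fetchall()
for row in five_unit_courses:
    print(row)
```

Every returned tuple still has all three attributes; select narrows the rows, not the columns.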
The project operation is a unary operation and creates another relation. The attributes
(columns) in the resulting relation are a subset of the attributes in the original relation. The
project operation creates a relation in which each tuple has fewer attributes.
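In SQL, the project operation is written by listing the wanted attributes after SELECT. The sketch below assumes Python's built-in sqlite3 module and the Courses rows from these slides. It also shows one practical detail: SQL keeps duplicate rows after a projection unless DISTINCT is requested.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Courses (course_id TEXT PRIMARY KEY, course_name TEXT, unit INTEGER)"
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [
        ("CIS15", "Intro to C", 5),
        ("CIS17", "Intro to Java", 5),
        ("CIS19", "UNIX", 4),
        ("CIS51", "Networking", 5),
        ("CIS52", "TCP/IP", 6),
    ],
)

# Project: keep every tuple, but only two of the three attributes.
names = conn.execute(
    "SELECT course_id, course_name FROM Courses ORDER BY course_id"
).fetchall()

# Projecting onto unit alone produces repeated values; DISTINCT removes them.
distinct_units = [row[0] for row in conn.execute("SELECT DISTINCT unit FROM Courses ORDER BY unit")]
print(names[0], distinct_units)
```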
The join operation is a binary operation that combines two relations on common
attributes.

Courses
course_id   course_name     unit
CIS15       Intro to C      5
CIS17       Intro to Java   5
CIS19       UNIX            4
CIS51       Networking      5
CIS52       TCP/IP          6

Taught-by
course_id   prof
CIS15       Lee
CIS17       Liu
CIS19       Walter
CIS51       Liu
CIS52       Lee

SELECT Courses.course_id, course_name, unit, prof
FROM Courses
JOIN "Taught-by"
ON Courses.course_id = "Taught-by".course_id;
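The join can be run as a minimal sketch with Python's built-in sqlite3 module, using the Courses and Taught-by rows from the slide. Two details are worth noting: the table name Taught-by contains a hyphen, so it has to be written as a quoted identifier, and course_id exists in both relations, so it has to be qualified with a table name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Courses (course_id TEXT PRIMARY KEY, course_name TEXT, unit INTEGER)"
)
# The hyphen in the name requires double quotes around the identifier.
conn.execute('CREATE TABLE "Taught-by" (course_id TEXT, prof TEXT)')
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [
        ("CIS15", "Intro to C", 5),
        ("CIS17", "Intro to Java", 5),
        ("CIS19", "UNIX", 4),
        ("CIS51", "Networking", 5),
        ("CIS52", "TCP/IP", 6),
    ],
)
conn.executemany(
    'INSERT INTO "Taught-by" VALUES (?, ?)',
    [("CIS15", "Lee"), ("CIS17", "Liu"), ("CIS19", "Walter"), ("CIS51", "Liu"), ("CIS52", "Lee")],
)

# Join: pair each course with the tuple in Taught-by that has the same course_id.
joined = conn.execute(
    """
    SELECT Courses.course_id, course_name, unit, prof
    FROM Courses
    JOIN "Taught-by" ON Courses.course_id = "Taught-by".course_id
    ORDER BY Courses.course_id
    """
).fetchall()
for row in joined:
    print(row)
```

The result has one tuple per course, carrying the three Courses attributes plus prof.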
The union operation takes two relations with the same set of attributes. It creates a
new relation in which each tuple is either in the first relation, in the second, or in both.
SELECT *
FROM RELATION1
UNION
SELECT *
FROM RELATION2
The intersection operation takes two relations with the same set of attributes. It creates
a new relation in which each tuple is a member of both relations.
SELECT *
FROM RELATION1
INTERSECT
SELECT *
FROM RELATION2
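The set operations can be compared side by side in a small sketch, assuming Python's built-in sqlite3 module. The relations Fall and Spring and their contents are hypothetical, invented only for this illustration. The difference operation, listed earlier among the nine operations, is written EXCEPT in standard SQL and in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two hypothetical relations with the same single attribute.
conn.execute("CREATE TABLE Fall (course_id TEXT)")
conn.execute("CREATE TABLE Spring (course_id TEXT)")
conn.executemany("INSERT INTO Fall VALUES (?)", [("CIS15",), ("CIS17",), ("CIS19",)])
conn.executemany("INSERT INTO Spring VALUES (?)", [("CIS17",), ("CIS19",), ("CIS51",)])


def combine(operator):
    """Apply a set operator (UNION, INTERSECT, or EXCEPT) to Fall and Spring."""
    sql = (
        f"SELECT course_id FROM Fall {operator} "
        "SELECT course_id FROM Spring ORDER BY course_id"
    )
    return [row[0] for row in conn.execute(sql)]


union_ids = combine("UNION")             # in Fall, in Spring, or in both
intersection_ids = combine("INTERSECT")  # in both relations
difference_ids = combine("EXCEPT")       # in Fall but not in Spring
print(union_ids, intersection_ids, difference_ids)
```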
1. Introduction
The design of any database is a lengthy and involved task that can only be done
through a step-by-step process.
• The first step normally involves extensive interviews with potential users of the
database (for example, in a university) to collect the information that needs to be stored
and the access requirements of each department.
• The second step is to build an entity-relationship model (ERM) that defines the entities
for which some information must be maintained, the attributes of these entities,
and the relationships between these entities.
2. Entity-relationship model (ERM)
In this step, the database designer creates an entity-relationship diagram (ERD) to
show the entities for which information needs to be stored and the relationships
between those entities. E-R diagrams use several geometric shapes.
First normal form (1NF)
Example: two relations, teaches and takes, are not in first normal form. A professor can teach
more than one course, and a student can take more than one course, so a single row may hold
several values in one attribute. These two relations can be normalized by repeating the rows in
which this problem exists.
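The idea of repeating rows can be sketched in a few lines of Python. The contents of the teaches relation are an assumption: they reuse the professor and course pairs from the Taught-by table shown earlier, grouped so that one row holds several courses.

```python
# Not in 1NF: the second field of each row holds several course values at once.
teaches = [
    ("Lee", ["CIS15", "CIS52"]),
    ("Liu", ["CIS17", "CIS51"]),
    ("Walter", ["CIS19"]),
]

# Normalize by repeating the row once for every course the professor teaches,
# so that each attribute holds exactly one value per row.
teaches_1nf = [(prof, course) for prof, courses in teaches for course in courses]

for row in teaches_1nf:
    print(row)
```

After normalization the relation has five rows, one per professor and course pair.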
Second normal form (2NF)
• In each relation we need to have a key (called a primary key) on which all other attributes
(column values) depend.
• However, when relations are established based on the E-R diagram, we
may end up with composite keys (keys made up of two or more attributes). In this case, a relation
is in second normal form if every non-key attribute depends on the whole composite key.
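A violation of second normal form, and one way to repair it, can be sketched with Python's built-in sqlite3 module. The relation Takes, its attributes, and its rows are hypothetical: the composite key is (student_id, course_id), grade depends on the whole key, and course_name depends on course_id alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical relation that is NOT in 2NF: course_name depends only on
# course_id, which is just one part of the composite key.
conn.execute(
    """
    CREATE TABLE Takes (
        student_id INTEGER, course_id TEXT, course_name TEXT, grade TEXT,
        PRIMARY KEY (student_id, course_id)
    )
    """
)
conn.executemany(
    "INSERT INTO Takes VALUES (?, ?, ?, ?)",
    [
        (1, "CIS15", "Intro to C", "A"),
        (2, "CIS15", "Intro to C", "B"),
        (1, "CIS19", "UNIX", "B"),
    ],
)

# Decompose: move the partially dependent attribute into its own relation,
# and keep only attributes that depend on the whole key in the other one.
conn.execute("CREATE TABLE CourseNames AS SELECT DISTINCT course_id, course_name FROM Takes")
conn.execute("CREATE TABLE Grades AS SELECT student_id, course_id, grade FROM Takes")

course_names = conn.execute("SELECT * FROM CourseNames ORDER BY course_id").fetchall()
grade_count = conn.execute("SELECT COUNT(*) FROM Grades").fetchone()[0]
print(course_names, grade_count)
```

After the split, each course name is stored once instead of once per enrolled student.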
Building a Data Model (cont.)
Figure: the columns of the original table (Track, Len, Artist, Album, Genre, Rating, Count)
grouped into four entities linked by "belongs to" relationships:
• ARTIST: id, name
• ALBUM: id, name, artist_id (FK)
• TRACK: id (PK), title, len, rating, count, artist_id (FK)
• GENRE: id, name
Best practices:
• Never use a logical key as the primary key.
• Logical keys can and do change, albeit slowly.
• Relationships based on matching string fields are less efficient than relationships based on
matching integers.
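Figure: the entities ALBUM (name) and TRACK (title, len, rating, count), where a track
belongs to an album, become two tables:
• Album: album_id (primary key), title (logical key)
• Track: track_id (primary key), title (logical key), len, rating, count, album_id (foreign key)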
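Figure: the complete data model as four tables:
• Artist: artist_id (primary key), name (logical key)
• Album: album_id (primary key), title (logical key), artist_id (foreign key)
• Genre: genre_id (primary key), name (logical key)
• Track: track_id (primary key), title (logical key), len, rating, count,
album_id (foreign key), genre_id (foreign key)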
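The four-table model can be written out as a minimal sketch with Python's built-in sqlite3 module. The table and column names follow the figure. The column types, the UNIQUE constraints on the logical keys, and the sample row values are assumptions made for illustration. Integer primary keys are used in line with the best practices above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE Artist (artist_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE Genre  (genre_id  INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE Album  (album_id  INTEGER PRIMARY KEY, title TEXT UNIQUE,
                         artist_id INTEGER REFERENCES Artist(artist_id));
    CREATE TABLE Track  (track_id  INTEGER PRIMARY KEY, title TEXT UNIQUE,
                         len INTEGER, rating INTEGER, count INTEGER,
                         album_id INTEGER REFERENCES Album(album_id),
                         genre_id INTEGER REFERENCES Genre(genre_id));
    """
)

# Made-up sample data; each first insert receives the integer key 1.
conn.execute("INSERT INTO Artist (name) VALUES ('Artist A')")
conn.execute("INSERT INTO Genre (name) VALUES ('Rock')")
conn.execute("INSERT INTO Album (title, artist_id) VALUES ('Album A', 1)")
conn.execute(
    "INSERT INTO Track (title, len, rating, count, album_id, genre_id) "
    "VALUES ('Track A', 297, 5, 0, 1, 1)"
)

# Following the foreign keys rebuilds the original flat row.
flat_row = conn.execute(
    """
    SELECT Track.title, Artist.name, Album.title, Genre.name
    FROM Track
    JOIN Album  ON Track.album_id = Album.album_id
    JOIN Artist ON Album.artist_id = Artist.artist_id
    JOIN Genre  ON Track.genre_id = Genre.genre_id
    """
).fetchone()

# A track that points at a non-existent album is rejected.
try:
    conn.execute("INSERT INTO Track (title, album_id, genre_id) VALUES ('Orphan', 99, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(flat_row, orphan_rejected)
```

Because the tables are linked by integer keys, renaming an artist or album means changing one row, and every track that refers to it stays valid.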
6. Guide to practice set (database in MS SQL)
1. Install the SQL Server
• Install the SQL Server Management Studio software. This software
is available for free from Microsoft, and allows you to connect to and
manage your SQL Server from a graphical interface instead of having
to use the command line.
2. Start up SQL Server Management Studio
3. Locate the Database folder
4. Create a new database
5. Create a table
6. Create the Primary Key
7. Understand how tables are structured