0% found this document useful (0 votes)

10 views69 pages

05 Database Design Part I

The document outlines a course on database systems, focusing on concepts such as data models, database management, and GIS applications. It covers various database models, including relational, hierarchical, and object-oriented, as well as the functionality of a Database Management System (DBMS). Additionally, it discusses transaction management, querying, and the importance of data abstraction levels in database design.

Uploaded by

gegis.jkuat2021

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

10 views69 pages

05 Database Design Part I

Uploaded by

gegis.jkuat2021

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 69

CONCEPTS IN DATABASE SYSTEMS - I

Monday, October 28, 2024 Felix Mutua, Ph.D.

Course Outline
1. Review of Basic GIS concepts
2. Data models and structure: Spatial data models, topology, format conversions,
vectorization, and rasterization.
3. Database models: Relational, Hierarchical, Network, Object oriented. Data
interchange.
4. Database management: Spatial data; Editing, manipulation, analysis, updating.
5. Data analysis: Spatial analysis; Query, reclassification, overlay, coverage rebuilding,
connectivity analysis.
6. Development and maintenance of a Geographical Information System (GIS)
database.
7. GIS database security.
8. Legal issues in GIS.
9. Applications of GIS.

10/28/2024
Outline for Today’s Lecture
3

 Overview of database systems

 The Relational Data Model
 Database design process
 ER-Modelling
 In part II we will look at:
 SQL

 Spatialdatabases
 Implementation
What Is a Relational Database Management System ?
4

Database Management System = DBMS

Relational DBMS = RDBMS

 A collection of files that store the data

 A big C program written by someone else that accesses and updates

those files for you
Where are RDBMS used ?
5

 Backend for traditional “database” applications

 Backend for large Websites
 Backend for Web services
Example of a Traditional Database Application
6

Suppose we are building a system

to store the information about:
 students

 courses

 professors

 who takes what, who teaches what

Can we do it without a DBMS ?
7

Sure we can! Start by storing the data in files:

students.txt courses.txt professors.txt

Now write C or Java programs to implement specific tasks

Doing it without a DBMS...
8

 Enroll “Mary Johnson” in “CSE444”:

Write a C program to do the following:

Read ‘students.txt’
Read ‘courses.txt’
Find&update the record “Mary Johnson”
Find&update the record “CSE444”
Write “students.txt”
Write “courses.txt”
Problems without an DBMS...
9

 System crashes: Read ‘students.txt’

Read ‘courses.txt’
Find&update the record “Mary Johnson” CRASH !
Find&update the record “CSE444”
Write “students.txt”
Write “courses.txt”

 What is the problem ?

 Large data sets (say 50GB)
 What is the problem ?
 Simultaneous access by many users
 Need locks: we know them from OS, but now data on disk; and
is there any fun to re-implement them ?
Enters a DMBS
10

“Two tier database system”

connection
(ODBC, JDBC)

Database server
(someone else’s
C program) Applications
Data files
Functionality of a DBMS
11

The programmer sees SQL, which has two components:

 Data Definition Language - DDL

 Data Manipulation Language - DML

 query language

Behind the scenes the DBMS has:

 Query optimizer

 Query engine

 Storage management

 Transaction Management (concurrency, recovery)

Functionality of a DBMS
12

Two things to remember:

 Client-server architecture
 Slow, cumbersome connection
 But good for the data

 It is just someone else’s C program

 In the beginning we may be impressed by its speed
 But later we discover that it can be frustratingly slow
 We can do any particular task faster outside the DBMS
 But the DBMS is general and convenient
How the Programmer Sees the DBMS
13

 Start with DDL to create tables:

CREATE TABLE Students (
Name CHAR(30)
SSN CHAR(9) PRIMARY KEY NOT NULL,
Category CHAR(20)
) ...
 Continue with DML to populate tables:

INSERT INTO Students

VALUES(‘Charles’, ‘123456789’, ‘undergraduate’)
. . . .
How the Programmer Sees the DBMS
14

 Tables:
Students: Takes:
SSN Name Category SSN CID
123-45-6789 Charles undergrad 123-45-6789 CSE444
234-56-7890 Dan grad 123-45-6789 CSE444
… … 234-56-7890 CSE142
Courses: …
CID Name Quarter
CSE444 Databases fall
CSE541 Operating systems winter
 Still implemented as files, but behind the scenes can
be quite complex
“data independence” = separate logical view
from physical implementation
Transactions
15

 Enroll “Mary Johnson” in “CSE444”:

BEGIN TRANSACTION;

INSERT INTO Takes

SELECT Students.SSN, Courses.CID
FROM Students, Courses
WHERE Students.name = ‘Mary Johnson’ and
Courses.name = ‘CSE444’

-- More updates here....

IF everything-went-OK
THEN COMMIT;
ELSE ROLLBACK
If system crashes, the transaction is still either committed or aborted
Transactions
16

 A transaction = sequence of statements that either all succeed, or all

fail
 Transactions have the ACID properties:
A = atomicity
C = consistency
I = independence
D = durability
Queries
17

 Find all courses that “Mary” takes

SELECT C.name
FROM Students S, Takes T, Courses C
WHERE S.name=“Mary” and
S.ssn = T.ssn and T.cid = C.cid
 What happens behind the scene ?
 Query processor figures out how to answer the query efficiently.
Queries, behind the scene
18

Declarative SQL query Imperative query execution plan:

sname
SELECT C.name
FROM Students S, Takes T, Courses C
WHERE S.name=“Mary” and
S.ssn = T.ssn and T.cid = C.cid cid=cid

sid=sid

name=“Mary”

Students Takes Courses

The optimizer chooses the best execution plan for a query

Database Systems
19

 The big commercial database vendors:

 Oracle
 IBM (with DB2) bought Informix recently
 Microsoft (SQL Server)
 Sybase
 Some free database systems (Unix) :
 Postgres
 Mysql
 Predator
 In CSE444 we use SQL Server. You may use something else, but you are
on your own.
New Trends in Databases
20

 Object-relational databases
 Main memory database systems
 XML XML XML !
 Relational databases with XML support
 Middleware between XML and relational databases
 Native XML database systems
 Lots of research here at UW on XML and databases
 Data integration
 Peer to peer, stream data management – still research
 NoSQL
 Big Data
Levels of Abstraction
Levels of Abstraction: Internal Level/Schema
 Internal Level/Schema
 The internal schema defines the physical storage structure of the database.
 Facts about Internal schema:
 The internal schema is the lowest level of data abstraction
 It helps you to keeps information about the actual representation of the
entire database. Like the actual storage of the data on the disk in the form
of records
 The internal view tells us what data is stored in the database and how
 It never deals with the physical devices. Instead, internal schema views a
physical device as a collection of physical pages
Levels of Abstraction: Conceptual Schema/Level
 Conceptual Schema/Level
 The conceptual schema describes the Database structure of the whole database
for the community of users.
 This schema hides information about the physical storage structures and focuses on
describing data types, entities, relationships, etc.
 This logical level comes between the user level and physical storage view.
However, there is only single conceptual view of a single database.
 Facts about Conceptual schema:
 Defines all database entities, their attributes, and their relationships
 Security and integrity information
 In the conceptual level, the data available to a user must be contained in or
derivable from the physical level
Levels of Abstraction: External Schema/Level
24

 External Schema/Level
 An external schema describes the part of the database which specific user is interested in. It hides
the unrelated details of the database from the user. There may be "n" number of external views for
each database.
 Each external view is defined using an external schema, which consists of definitions of various
types of external record of that specific view.
 An external view is just the content of the database as it is seen by some specific particular user.
For example, a user from the sales department will see only sales related data.
 Facts about external schema:
 An external level is only related to the data which is viewed by specific end users.
 This level includes some external schemas.
 External schema level is nearest to the user
 The external schema describes the segment of the database which is needed for a certain user
group and hides the remaining details from the database from the specific user group

Felix Mutua, PhD

Goal of 3 level/schema of Database
25

 Every user should be able to access the same data but able to see a
customized view of the data.
 The user need not to deal directly with physical database storage
detail.
 The DBA should be able to change the database storage structure
without disturbing the user's views
 The internal structure of the database should remain unaffected when
changes made to the physical aspects of storage.

Felix Mutua, PhD

Database Models
26

 A database model shows the logical structure of a database, including the

relationships and constraints that determine how data can be stored and
accessed.
 Some of the most common ones include:
 Hierarchical database model
 Relational model
 Network model
 Object-oriented database model
 Entity-relationship model
 Document model
 Entity-attribute-value model
 Star schema
 The object-relational model, which combines the two that make up its name
Felix Mutua, PhD
Database Models
27

 Relational model : The most common model, the relational model

sorts data into tables, also known as relations, each of which consists
of columns and rows. Each row, also called a tuple, includes data
about a specific instance of the entity in question, such as a particular
employee.
 Hierarchical model: The hierarchical model organizes data into a
tree-like structure, where each record has a single parent or root.
Sibling records are sorted in a particular order. That order is used as
the physical order for storing the database.

Felix Mutua, PhD

Database Models
28

 Network model: The network model builds on the hierarchical model

by allowing many-to-many relationships between linked records,
implying multiple parent records. Based on mathematical set theory,
the model is constructed with sets of related records. Each set consists
of one owner or parent record and one or more member or child
records. A record can be a member or child in multiple sets, allowing
this model to convey complex relationships.
 Object-oriented database model: This model defines a database as a
collection of objects, or reusable software elements, with associated
features and methods.

Felix Mutua, PhD

Database Models
29

 Object-relational model: This hybrid database model combines the

simplicity of the relational model with some of the advanced
functionality of the object-oriented database model. In essence, it
allows designers to incorporate objects into the familiar table
structure.
 Entity-relationship model: This model captures the relationships
between real-world entities much like the network model, but it isn’t as
directly tied to the physical structure of the database. Instead, it’s
often used for designing a database conceptually.

Felix Mutua, PhD

Database Models
30

 Object-relational model: This hybrid database model combines the

Felix Mutua, PhD

Database Models
31

 Inverted file model: A database built with the inverted file structure is
designed to facilitate fast full text searches. In this model, data content
is indexed as a series of keys in a lookup table, with the values
pointing to the location of the associated files.
 Flat model: The flat model is the earliest, simplest data model. It
simply lists all the data in a single table, consisting of columns and
rows. In order to access or manipulate the data, the computer has to
read the entire flat file into memory, which makes this model inefficient
for all but the smallest data sets.

Felix Mutua, PhD

Database Models
32

Felix Mutua, PhD

NoSQL database models
33

 The graph database model, which is even more flexible than a

network model, allowing any node to connect with any other.
 The multivalue model, which breaks from the relational model by
allowing attributes to contain a list of data rather than a single data
point.
 The document model, which is designed for storing and managing
documents or semi-structured data, rather than atomic data.

Felix Mutua, PhD

34 The Relational Data Model

Felix Mutua, PhD

What is Relational Model?
35

 Relational Model (RM) represents the database as a collection of relations.

 A relation is nothing but a table of values. Every row in the table represents
a collection of related data values.
 These rows in the table denote a real-world entity or relationship.
 Some popular Relational Database management systems are:
 DB2 and Informix Dynamic Server - IBM
 Oracle and RDB – Oracle
 SQL Server and Access – Microsoft
 Postgres
 MySQL
 ArcGIS Geodatabase

Felix Mutua, PhD

Relational Model Concepts
36

 Attribute: Each column in a Table. Attributes are the properties which

define a relation. e.g., Student_Rollno, NAME,etc.
 Tables – In the Relational model the, relations are saved in the table
format. It is stored along with its entities. A table has two properties
rows and columns. Rows represent records and columns represent
attributes.
 Tuple – It is nothing but a single row of a table, which contains a
single record.
 Relation Schema: A relation schema represents the name of the
relation with its attributes.
Felix Mutua, PhD
Relational Model Concepts
37

 Degree: The total number of attributes which in the relation is called the
degree of the relation.
 Cardinality: Total number of rows present in the Table.
 Column: The column represents the set of values for a specific attribute.
 Relation instance – Relation instance is a finite set of tuples in the RDBMS
system. Relation instances never have duplicate tuples.
 Relation key - Every row has one, two or multiple attributes, which is called
relation key.
 Attribute domain – Every attribute has some pre-defined value and scope
which is known as attribute domain

Felix Mutua, PhD

Relational Model Concepts
38

Felix Mutua, PhD

Relational Integrity Constraints
39

 Relational Integrity constraints in DBMS are referred to conditions

which must be present for a valid relation. These Relational constraints
in DBMS are derived from the rules in the mini-world that the
database represents.
 There are many types of Integrity Constraints in DBMS. Constraints on
the Relational database management system is mostly divided into
three main categories are:
 Domain Constraints
 Key Constraints
 Referential Integrity Constraints

Felix Mutua, PhD

Relational Integrity Constraints
40

 Domain constraints specify that within each tuple, and the value of
each attribute must be unique. This is specified as data types which
include standard data types integers, real numbers, characters,
Booleans, variable length strings, etc.
 Key Constraints: An attribute that can uniquely identify a tuple in a
relation is called the key of the table. The value of the attribute for
different tuples in the relation has to be unique.

Felix Mutua, PhD

Referential Integrity Constraints
41

 Referential Integrity constraints in

DBMS are based on the concept of
Foreign Keys. A foreign key is an
important attribute of a relation
which should be referred to in
other relationships. Referential
integrity constraint state happens
where relation refers to a key
attribute of a different or same
relation. However, that key element
must exist in the table.
Felix Mutua, PhD
Operations in Relational Model
42

 Four basic update operations performed on relational database

model are
 Insert, update, delete and select.
 Insertis used to insert data into the relation
 Delete is used to delete tuples from the table.

 Modify allows you to change the values of some attributes in existing tuples.

 Select allows you to choose a specific range of data.

 Whenever one of these operations are applied, integrity constraints

specified on the relational database schema must never be violated.
Felix Mutua, PhD
43 Felix Mutua, PhD
44 Database Design Process

Felix Mutua, PhD

What is Database Design?
45

 Database Design is a collection of processes that facilitate the

designing, development, implementation and maintenance of
enterprise data management systems.
 Properly designed database are easy to maintain, improves data
consistency and are cost effective in terms of disk storage space. The
database designer decides how the data elements correlate and what
data must be stored.
 The main objectives of database design in DBMS are to produce
logical and physical designs models of the proposed database system.

Felix Mutua, PhD

Why Database Design is Important ?
46

 It helps produce database systems

 That meet the requirements of the users
 Have high performance.

 Database design process in DBMS is crucial for high performance database

system.

Felix Mutua, PhD

Requirements analysis
47

 Planning - This stages of database design concepts are concerned

with planning of entire Database Development Life Cycle. It takes
into consideration the Information Systems strategy of the
organization.
 System definition - This stage defines the scope and boundaries of
the proposed database system.

Felix Mutua, PhD

Database Design
48

 Logical model - This stage is concerned with developing a database

model based on requirements. The entire design is on paper without
any physical implementations or specific DBMS considerations.
 Physical model - This stage implements the logical model of the
database taking into account the DBMS and physical implementation
factors.

Felix Mutua, PhD

Implementation
49

 Data conversion and loading - this stage of relational databases

design is concerned with importing and converting data from the old
system into the new database.
 Testing - this stage is concerned with the identification of errors in the
newly implemented system. It checks the database against
requirement specifications.

Felix Mutua, PhD

Types of Database Techniques
50

 Normalization is a database design technique that reduces data

redundancy and eliminates undesirable characteristics like Insertion,
Update and Deletion Anomalies. Normalization rules divides larger
tables into smaller tables and links them using relationships. The
purpose of Normalisation in SQL is to eliminate redundant (repetitive)
data and ensure data is stored logically.
 Entity Relationship Model (ER Modeling) is a graphical approach to
database design. It is a high-level data model that defines data
elements and their relationship for a specified software system. An ER
model is used to represent real-world objects.

Felix Mutua, PhD

51 Entity Relationship Model (ER Modelling)

Felix Mutua, PhD

What is it?
52

 Entity Relationship Model (ER Modeling) is a graphical approach to

database design. It is a high-level data model that defines data elements
and their relationship for a specified software system. An ER model is used
to represent real-world objects.
 An Entity is a thing or object in real world that is distinguishable from
surrounding environment. For example, each employee of an organization is
a separate entity. Following are some of major characteristics of entities.
 An entity has a set of properties.
 Entity properties can have values.

Felix Mutua, PhD

ER Diagram Symbols and Notations
53

Felix Mutua, PhD

Entity
54

 An entity can be a person, place, event, or object that is relevant to a given

system. For example, a school system may include students, teachers, major
courses, subjects, fees, and other items. Entities are represented in ER
diagrams by a rectangle and named using singular nouns.
 Weak Entity: A weak entity is an entity that depends on the existence of
another entity. In more technical terms it can be defined as an entity that
cannot be identified by its own attributes. It uses a foreign key combined
with its attributed to form the primary key. An entity like order item is a
good example for this. The order item will be meaningless without an order
so it depends on the existence of the order.

Felix Mutua, PhD

Attribute
55

 An attribute is a property, trait,

or characteristic of an entity,
relationship, or another
attribute. For example, the
attribute Inventory Item Name
is an attribute of the entity
Inventory Item.

Felix Mutua, PhD

Multivalued Attribute
56

 If an attribute can have more than one value it is called a multi-valued

attribute. It is important to note that this is different from an attribute
having its own attributes. For example, a teacher entity can have
multiple subject values.

Felix Mutua, PhD

Derived Attribute
57

 An attribute based on another attribute. This is found rarely in ER

diagrams. For example, for a circle, the area can be derived from the
radius.

Felix Mutua, PhD

Relationship
58

 A relationship describes how entities interact. For example, the entity

“Carpenter” may be related to the entity “table” by the relationship
“builds” or “makes”.
 Relationships are represented by diamond shapes and are labeled
using verbs.

Felix Mutua, PhD

Recursive Relationship
59

 If the same entity participates more than once in a relationship it is

known as a recursive relationship. In the below example an employee
can be a supervisor and be supervised, so there is a recursive
relationship.

Felix Mutua, PhD

Cardinality and Ordinality
60

 These two further defines relationships between entities by placing the

relationship in the context of numbers. In an email system, for example,
one account can have multiple contacts. The relationship, in this case,
follows a “one to many” model. There are a number of notations used
to present cardinality in ER diagrams. Chen, UML, Crow’s foot,
Bachman are some of the popular notations

Felix Mutua, PhD

How to Draw ER Diagrams
61

 Identify all the entities in the system. An entity should appear only
once in a particular diagram. Create rectangles for all entities and
name them properly.
 Identify relationships between entities. Connect them using a line and
add a diamond in the middle describing the relationship.
 Add attributes for entities. Give meaningful attribute names so they
can be understood easily.