CSE301 Lec5

The document outlines the objectives and processes involved in physical database design, including defining terms, selecting storage formats, and understanding file organizations. It emphasizes the importance of translating logical data models into efficient physical structures while ensuring data integrity, security, and performance. Key topics include field design, handling missing data, denormalization, file organizations, and the use of indexes for optimizing query performance.

CHAPTER 5:
PHYSICAL DATABASE DESIGN AND PERFORMANCE

OBJECTIVES
• Define terms
• Describe the physical database design process
• Choose storage formats for attributes
• Select appropriate file organizations
• Describe three types of file organization
• Describe indexes and their appropriate use
• Translate a database model into efficient structures, and know when/how to denormalize

PHYSICAL DATABASE DESIGN
• Purpose: translate the logical description of data into the technical specifications for storing and retrieving data
• Goal: create a design for storing data that will provide adequate performance and ensure database integrity, security, and recoverability

PHYSICAL DESIGN PROCESS
Inputs:
• Normalized relations
• Volume estimates
• Attribute definitions
• Response time expectations
• Data security needs
• Backup/recovery needs
• Integrity expectations
• DBMS technology used
These inputs lead to the following decisions:
• Attribute data types
• Physical record descriptions (doesn’t always match logical design)
• File organizations
• Indexes and database architectures
• Query optimization
DESIGNING FIELDS
• Field: smallest unit of application data recognized by system software
• Field design
  • Choosing data type
  • Coding, compression, encryption
  • Controlling data integrity

CHOOSING DATA TYPES

Figure 5-1 Example of a code look-up table (Pine Valley Furniture Company)
Code saves space, but costs an additional lookup to obtain actual value

FIELD DATA INTEGRITY
• Default value: assumed value if no explicit value
• Range control: allowable value limitations (constraints or validation rules)
• Null value control: allowing or prohibiting empty fields
• Referential integrity: range control (and null value allowances) for foreign-key to primary-key match-ups
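The four field data integrity controls, together with a code look-up table, can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the schema from Figure 5-1: the table names Finish and Product, the codes 'A' and 'B', the price range, and the helper try_insert are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: Finish is the code look-up table.
conn.executescript("""
CREATE TABLE Finish (
    FinishCode TEXT PRIMARY KEY,
    FinishName TEXT NOT NULL
);
CREATE TABLE Product (
    ProductID   INTEGER PRIMARY KEY,
    Description TEXT NOT NULL,
    FinishCode  TEXT NOT NULL DEFAULT 'A' REFERENCES Finish(FinishCode),
    Price       REAL NOT NULL CHECK (Price BETWEEN 0 AND 10000)
);
INSERT INTO Finish VALUES ('A', 'Birch'), ('B', 'Maple');
""")


def try_insert(sql):
    """Run one INSERT; return 'ok' or the text of the integrity error."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


# Default value: FinishCode is omitted, so 'A' is assumed.
r_default = try_insert(
    "INSERT INTO Product (ProductID, Description, Price) VALUES (1, 'End Table', 175.0)")
# Range control: a negative price violates the CHECK constraint.
r_range = try_insert(
    "INSERT INTO Product (ProductID, Description, Price) VALUES (2, 'Desk', -5)")
# Null value control: Description may not be empty.
r_null = try_insert(
    "INSERT INTO Product (ProductID, Description, Price) VALUES (3, NULL, 50)")
# Referential integrity: 'Z' has no matching row in Finish.
r_fk = try_insert(
    "INSERT INTO Product VALUES (4, 'Chair', 'Z', 80)")

# The stored code is short, but reading the actual value costs one extra lookup.
finish_name = conn.execute(
    "SELECT f.FinishName FROM Product p "
    "JOIN Finish f ON f.FinishCode = p.FinishCode WHERE p.ProductID = 1"
).fetchone()[0]

print(r_default, r_range, r_null, r_fk, finish_name, sep="\n")
```

Only the first insert succeeds; the other three are rejected by the range, null, and referential checks respectively.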
HANDLING MISSING DATA
• Substitute an estimate of the missing value (e.g., using a formula)
• Construct a report listing missing values
• In programs, ignore missing data unless the value is significant (sensitivity testing)
Triggers can be used to perform these operations

DENORMALIZATION
• Transforming normalized relations into non-normalized physical record specifications
• Benefits:
  • Can improve performance (speed) by reducing number of table lookups (i.e., reduce number of necessary join queries)
• Costs (due to data duplication)
  • Wasted storage space
  • Data integrity/consistency threats
• Common denormalization opportunities
  • One-to-one relationship (Fig. 5-2)
  • Many-to-many relationship with non-key attributes (associative entity) (Fig. 5-3)
  • Reference data (1:N relationship where 1-side has data not used in any other relationship) (Fig. 5-4)
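To make the one-to-one denormalization opportunity concrete, the sketch below stores the same data twice with Python's sqlite3 module: once as two normalized relations that must be joined, and once as a single denormalized record. The table and column names (Student, Application, StudentDenorm) and the sample rows are assumptions for illustration and are not taken from Figure 5-2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: two relations in a one-to-one relationship (hypothetical names)
CREATE TABLE Student (
    StudentID     INTEGER PRIMARY KEY,
    CampusAddress TEXT
);
CREATE TABLE Application (
    ApplicationID   INTEGER PRIMARY KEY,
    ApplicationDate TEXT,
    StudentID       INTEGER UNIQUE REFERENCES Student(StudentID)
);
INSERT INTO Student VALUES (1, '12 Elm Hall'), (2, '7 Oak Hall');
INSERT INTO Application VALUES (100, '2024-01-15', 1);

-- Denormalized: one physical record per student; application fields may be NULL
CREATE TABLE StudentDenorm AS
SELECT s.StudentID, s.CampusAddress, a.ApplicationDate
FROM Student s LEFT JOIN Application a ON a.StudentID = s.StudentID;
""")

# Normalized design: answering the question needs a join (extra table access).
normalized = conn.execute("""
    SELECT s.StudentID, s.CampusAddress, a.ApplicationDate
    FROM Student s LEFT JOIN Application a ON a.StudentID = s.StudentID
    ORDER BY s.StudentID
""").fetchall()

# Denormalized design: the same answer comes from a single table.
denormalized = conn.execute("""
    SELECT StudentID, CampusAddress, ApplicationDate
    FROM StudentDenorm ORDER BY StudentID
""").fetchall()

print(normalized == denormalized)
print(denormalized)
```

Student 2 has no application, so the denormalized record carries a NULL in ApplicationDate. That wasted space is the price of avoiding the join.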

Figure 5-2 A possible denormalization situation: two entities with one-to-one relationship

Figure 5-3 A possible denormalization situation: a many-to-many relationship with non-key attributes
Figure annotations: Extra table access required; Null description possible
Figure 5-4 A possible denormalization situation: reference data
Figure annotations: Extra table access required; Data duplication

DENORMALIZE WITH CAUTION
• Denormalization can
  • Increase chance of errors and inconsistencies
  • Reintroduce anomalies
  • Force reprogramming when business rules change
• Perhaps other methods could be used to improve performance of joins
  • Organization of tables in the database (file organization and clustering)
  • Proper query design and optimization
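The inconsistency risk of denormalized reference data can be shown in a few lines. In this sketch (Python's sqlite3 module), a storage location is repeated on every item row instead of living in its own reference table; the Item table, its columns, and the sample values are hypothetical and only loosely modeled on the reference-data situation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized reference data: WhereStore is duplicated on each item row
CREATE TABLE Item (
    ItemID      INTEGER PRIMARY KEY,
    Description TEXT,
    InstrID     INTEGER,
    WhereStore  TEXT
);
INSERT INTO Item VALUES
    (1, 'Glue',  10, 'Shelf A'),
    (2, 'Paint', 10, 'Shelf A'),
    (3, 'Nails', 20, 'Bin 4');
""")

# A program changes the storage location through one item only,
# leaving the duplicate copy on item 2 untouched.
conn.execute("UPDATE Item SET WhereStore = 'Shelf C' WHERE ItemID = 1")

# Find instruction IDs that now map to more than one storage location.
conflicts = conn.execute("""
    SELECT InstrID, COUNT(DISTINCT WhereStore)
    FROM Item
    GROUP BY InstrID
    HAVING COUNT(DISTINCT WhereStore) > 1
""").fetchall()

print(conflicts)  # [(10, 2)]
```

With a separate reference table the location would be stored once, and this kind of update anomaly could not occur.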

Figure 5-5 DBMS terminology in an Oracle 11g environment

DESIGNING PHYSICAL DATABASE FILES
• Physical file: a named portion of secondary memory allocated for the purpose of storing physical records
• Tablespace: named logical storage unit in which data from multiple tables/views/objects can be stored
• Tablespace components
  • Segment: a table, index, or partition
  • Extent: contiguous section of disk space
  • Data block: smallest unit of storage
FILE ORGANIZATIONS
• Technique for physically arranging records of a file on secondary storage
• Types of file organizations
  • Sequential
  • Indexed
  • Hashed
• Factors for selecting file organization:
  • Fast data retrieval and throughput
  • Efficient storage space utilization
  • Protection from failure and data loss
  • Minimizing need for reorganization
  • Accommodating growth
  • Security from unauthorized use

Figure 5-6a Sequential file organization
• Records of the file are stored in sequence by the primary key field values
• If sorted: every insert or delete requires re-sort
• If not sorted: average time to find desired record = n/2

INDEXED FILE ORGANIZATIONS
• Storage of records sequentially or nonsequentially with an index that allows software to locate individual records
• Index: a table or other data structure used to determine in a file the location of records that satisfy some condition
• Primary keys are automatically indexed
• Other fields or combinations of fields can also be indexed; these are called secondary keys (or nonunique keys)
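The n/2 figure for an unsorted sequential file can be checked with a small simulation. The sketch below counts how many records a front-to-back scan reads before it finds each key, then averages over all keys. The function name sequential_find and the file size of 1000 records are assumptions for the example.

```python
import random


def sequential_find(records, key):
    """Scan the file front to back; return how many records were read."""
    for reads, record_key in enumerate(records, start=1):
        if record_key == key:
            return reads
    raise KeyError(key)


n = 1000
records = list(range(n))
random.Random(0).shuffle(records)  # unsorted file, fixed seed for repeatability

# Look up every key once and average the number of records read.
average_reads = sum(sequential_find(records, key) for key in range(n)) / n

print(average_reads)  # 500.5
```

The exact average is (n + 1)/2, which the slide rounds to n/2. An index or hash organization avoids this linear cost.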
Figure 5-6b Indexed file organization
• Uses a tree search
• Average time to find desired record = depth of the tree

Figure 5-6c Hashed file organization
• Hash algorithm: usually uses division-remainder to determine record position. Records with same position are grouped in lists.
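The division-remainder hashing described for Figure 5-6c can be sketched as follows. Each key is divided by the number of storage positions, the remainder picks the position, and keys that land on the same position are kept together in a list. The function names, the sample keys, and the choice of 13 positions are assumptions for the example.

```python
def build_hashed_file(keys, n_positions):
    """Place each key at position key % n_positions; collisions share a list."""
    positions = [[] for _ in range(n_positions)]
    for key in keys:
        positions[key % n_positions].append(key)
    return positions


def find(positions, key):
    """Hash straight to the position, then search only that short list."""
    return key in positions[key % len(positions)]


keys = [14, 27, 5, 40, 9]
hashed_file = build_hashed_file(keys, 13)

print(hashed_file[1])         # [14, 27, 40]  (14, 27 and 40 all leave remainder 1)
print(hashed_file[5])         # [5]
print(find(hashed_file, 40))  # True
print(find(hashed_file, 41))  # False
```

A lookup touches one position regardless of file size, as long as the collision lists stay short.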

Figure 5-7 Join indexes: speed up join operations
a) Join index for common non-key columns
b) Join index for matching foreign key (FK) and primary key (PK)
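One way to picture the join index in part b) is as a precomputed list of row-address pairs, one pair per matching FK and PK. The sketch below builds such a list in plain Python. The row IDs, customer and order values, and the function build_join_index are made up for illustration and do not reproduce the figure's data.

```python
# Hypothetical stored rows, keyed by physical row ID.
customers = {  # row ID -> (CustomerID, Name)
    10001: ("C2027", "Hadley"),
    10002: ("C1026", "Baines"),
    10003: ("C0042", "Ruskin"),
}
orders = {  # row ID -> (OrderID, CustomerID foreign key)
    30004: ("O5532", "C1026"),
    30002: ("O3478", "C0042"),
    30003: ("O8391", "C1026"),
}


def build_join_index(customers, orders):
    """Return sorted (customer row ID, order row ID) pairs where FK = PK."""
    customer_row = {cust_id: row for row, (cust_id, _) in customers.items()}
    return sorted(
        (customer_row[fk], order_row)
        for order_row, (_, fk) in orders.items()
        if fk in customer_row
    )


join_index = build_join_index(customers, orders)

# The join is now answered by following row IDs, with no key comparison at query time.
joined = [(customers[c][1], orders[o][0]) for c, o in join_index]

print(join_index)
print(joined)
```

The index must be maintained whenever either table changes, which is the cost paid for the faster join.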
USING AND SELECTING KEYS
• Creating a unique key index
  • Example: CustomerID (primary key) of Customer
  • Example: Composite primary key for OrderLine
• Creating a secondary key index
  • Example: Description field for Product (not unique)

RULES FOR USING INDEXES
1. Use on larger tables
2. Index the primary key of each table
3. Index search fields (fields frequently in WHERE clause)
4. Index fields used in SQL ORDER BY and GROUP BY clauses
5. Index a field when it has more than 100 distinct values, but not when it has fewer than 30
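The unique and secondary key indexes named under USING AND SELECTING KEYS could be created as in the sketch below, which uses Python's sqlite3 module. The column lists, the index names (idx_customer_pk, idx_orderline_pk, idx_product_description), and the sample rows are assumptions; other DBMSs use similar CREATE INDEX statements but differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer  (CustomerID INTEGER, CustomerName TEXT);
CREATE TABLE OrderLine (OrderID INTEGER, ProductID INTEGER, OrderedQuantity INTEGER);
CREATE TABLE Product   (ProductID INTEGER, Description TEXT);

-- Unique key index on a single-field primary key
CREATE UNIQUE INDEX idx_customer_pk ON Customer (CustomerID);
-- Unique key index on a composite primary key
CREATE UNIQUE INDEX idx_orderline_pk ON OrderLine (OrderID, ProductID);
-- Secondary (nonunique) key index
CREATE INDEX idx_product_description ON Product (Description);
""")

conn.executemany(
    "INSERT INTO Product VALUES (?, ?)",
    [(1, "End Table"), (2, "Coffee Table"), (3, "End Table")],  # duplicates allowed
)
conn.execute("INSERT INTO Customer VALUES (1, 'Value Furniture')")

# The unique index rejects a second row with the same CustomerID.
try:
    conn.execute("INSERT INTO Customer VALUES (1, 'Duplicate ID')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Ask SQLite how it would run a search on the indexed field.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT ProductID FROM Product WHERE Description = ?",
    ("End Table",),
).fetchall()
plan_detail = " ".join(row[3] for row in plan)  # fourth column holds the plan text

matching = conn.execute(
    "SELECT ProductID FROM Product WHERE Description = ? ORDER BY ProductID",
    ("End Table",),
).fetchall()

print(duplicate_rejected)
print(plan_detail)
print(matching)
```

The query plan text names idx_product_description, which shows the WHERE search is served by the secondary index rather than a full table scan.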

RULES FOR USING INDEXES (CONT.)
6. Avoid use of indexes for fields with long values; perhaps compress values first
7. If key to index is used to determine location of record, use surrogate (like sequence number) to allow even spread in storage area
8. DBMS may have limit on number of indexes per table and number of bytes per indexed field(s)
9. Be careful of indexing attributes with null values; many DBMSs will not recognize null values in an index search

QUERY OPTIMIZATION
• Parallel query processing: possible when working in multiprocessor systems
• Overriding automatic query optimization: allows for query writers to preempt the automated optimization
• Data warehouses are already configured for optimized query performance
