Lectures 2014 Handout1 Introduction
References
• Ramez Elmasri and Shamkant B. Navathe, Fundamentals of Database Systems, 5th Edition, 2007
• Raghu Ramakrishnan and Johannes Gehrke, Database Management Systems, 3rd Edition, McGraw-Hill, 2004
Topics
• Introduction
– Information models and systems, Database system evolution, File based systems, DBMS approach,
Database environment and components, DBMS functions, DBMS architecture, Data
independence
• Data modeling
– Motivation, Role in system development, Levels of abstraction and practice, Conceptual models;
ER and UML, Logical models; Relational and OO models
• RDBMS Concepts
– Relational algebra and relational calculus, Relational integrity, Normalization, Object oriented
extensions
• Database Query Languages
– 4GL environments, SQL, DDL, DML and DCL, Query optimization, Rule based and cost
based approaches, Embedded SQL
• Transaction Processing
– Transactions, Concurrency control, Serialization, Failure and recovery
• Distributed Databases
– Data fragmentation, Replication and allocation, Distributed query processing, Distributed
transaction model, Concurrency control, Homogeneous and heterogeneous environments
• Physical Database Design
– Storage and file structures, indexed, hashed and signature files, B-trees, Sparse and dense
indexes, Variable length records, Database tuning
1 Introduction to Database Systems
1.1 Definitions
Data
Refers to known facts that can be recorded and have an implicit meaning.
Database
A collection of related data with the following properties:
• Intended application and users, i.e., specific purpose
• Represents some aspects of the real world
• Logically organized
Database System
The DBMS software together with the data itself. Sometimes, the applications are also included.
• Define a database
– in terms of data types, structures and constraints
• Construct or Load the Database on a secondary storage medium
• Other features:
1.3 Database Environment
A simplified database system environment
1.4 Example
Example: UNIVERSITY database
data elements
Name, StudentNumber, Class, Major
data type
STUDENT (Name : string, StudentNumber : integer)
GRADE_REPORT (Grade : single character)
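As a minimal sketch of how these data elements and types might be declared, the following uses Python's built-in sqlite3 module with an in-memory database. The SQL column types, the type chosen for Class, the StudentNumber column in GRADE_REPORT and the sample rows are illustrative assumptions, not part of the handout.

```python
import sqlite3


def build_university_db():
    """Create an in-memory UNIVERSITY database with two typed record structures."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE STUDENT (
            Name          TEXT    NOT NULL,      -- string
            StudentNumber INTEGER PRIMARY KEY,   -- integer
            Class         INTEGER,               -- assumed type
            Major         TEXT                   -- assumed type
        );
        CREATE TABLE GRADE_REPORT (
            -- Assumed link back to the student the grade belongs to.
            StudentNumber INTEGER NOT NULL REFERENCES STUDENT(StudentNumber),
            Grade         TEXT CHECK (length(Grade) = 1)  -- single character
        );
    """)
    return conn


conn = build_university_db()
# Hypothetical sample data, used only to show the structures being loaded.
conn.execute("INSERT INTO STUDENT VALUES (?, ?, ?, ?)", ("Smith", 17, 1, "CS"))
conn.execute("INSERT INTO GRADE_REPORT VALUES (?, ?)", (17, "B"))
conn.commit()

student_rows = conn.execute("SELECT Name, Major FROM STUDENT").fetchall()
```

The CHECK clause is one way to state the "single character" constraint so that the DBMS, rather than each application, rejects a grade such as "AB".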
Figure 2: Example: UNIVERSITY database
• Redundant data
• Wasted storage space
• Inconsistent data
• Difficult to add/modify applications
• File structure is part of the code
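The first three points in this list can be illustrated with a small sketch in which two hypothetical applications each keep a private copy of the same student data. The "files" are modeled as Python dictionaries, and all names and values are assumptions made for the example.

```python
# Two applications, each with its own private "file" (modeled as a dict).
# The student's name and major are stored twice: redundant data.
registrar_file = {17: {"Name": "Smith", "Major": "CS"}}
accounts_file = {17: {"Name": "Smith", "Major": "CS", "Balance": 250}}


def change_major(student_file, student_number, new_major):
    """Update the major in ONE application's file only."""
    student_file[student_number]["Major"] = new_major


def find_inconsistencies(file_a, file_b, field):
    """Return the student numbers whose value for `field` differs between files."""
    return [
        number
        for number in file_a
        if number in file_b and file_a[number][field] != file_b[number][field]
    ]


# The registrar's application records a change of major; nothing propagates
# the update to the accounts application's copy.
change_major(registrar_file, 17, "MATH")
conflicts = find_inconsistencies(registrar_file, accounts_file, "Major")
```

After the update, `conflicts` contains student 17, because the two copies no longer agree on the major.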
Main Characteristics of the Database Approach
• Data Abstraction
– A data model is used to hide storage details and present the users with a conceptual view of
the database.
Database Users
• Database administrators
– responsible for authorizing access to the database, coordinating and monitoring its use,
acquiring software and hardware resources, and monitoring the efficiency of
operations.
• Database Designers
– responsible for defining the content, the structure, the constraints, and the functions or transactions
against the database. They must communicate with the end-users and understand their needs.
• End-users
– they use the data for queries and reports, and some of them also update the database content.
• System Analysts and Application programmers (Software Engineers)
– Design and implement canned transactions for parametric users.
Categories of End-users
• Casual Users
– they access the database occasionally, when needed.
• Naïve or parametric users
– they make up a large section of the end-user population. They use previously well-defined
functions in the form of canned transactions against the database.
∗ Examples are bank tellers or reservation clerks, who perform these transactions throughout
an entire shift.
• Sophisticated users
– these include business analysts, scientists, engineers and others thoroughly familiar with the system
capabilities. Many use tools in the form of software packages that work closely with the
stored database.
• Stand-alone users
– mostly maintain personal databases using ready-to-use packaged applications. An example is
a tax program user that creates his or her own internal database.
• Tool developers
– They design and implement tools that facilitate the use of the DBMS software. Tools include
design tools, performance tools, special interfaces, etc.
• Operators and maintenance personnel
– They work on running and maintaining the hardware and software environment for the
database system.
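The canned transactions used by naïve or parametric users can be pictured as predefined, parameterized operations: the user supplies only the parameter values and never writes a query. The sketch below assumes a hypothetical ACCOUNT table and a hypothetical `deposit` function, using Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ACCOUNT (AccountNo INTEGER PRIMARY KEY, Balance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO ACCOUNT VALUES (1001, 500)")  # hypothetical sample account
conn.commit()


def deposit(conn, account_no, amount):
    """Canned transaction: a bank teller supplies only the two parameters."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    with conn:  # commits on success, rolls back if an exception is raised
        cursor = conn.execute(
            "UPDATE ACCOUNT SET Balance = Balance + ? WHERE AccountNo = ?",
            (amount, account_no),
        )
        if cursor.rowcount != 1:
            raise LookupError(f"no such account: {account_no}")
    row = conn.execute(
        "SELECT Balance FROM ACCOUNT WHERE AccountNo = ?", (account_no,)
    ).fetchone()
    return row[0]


new_balance = deposit(conn, 1001, 250)
```

Designing and implementing functions of this kind is the work attributed above to system analysts and application programmers; the teller only repeats the call with different parameter values.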
Implications of Using the Database Approach
• Potential for enforcing standards : this is crucial for the success of database applications in
large organizations. Standards refer to data item names, display formats, screens, report structures,
meta-data etc.
• Reduced application development time : incremental time to add each new application is reduced.
• Flexibility to change data structures : database structure may evolve as new requirements are
defined.
• Availability of up-to-date information : very important for on-line transaction systems such as
airline, hotel and car reservations.
• Economies of scale : by consolidating data and applications across departments, wasteful overlap
of resources and personnel can be avoided.
• Object-oriented applications
– OODBs were introduced in the late 1980s and early 1990s to cater to the complex data
processing needs that arose with the emergence of OO programming languages.
– They are mainly used in applications such as engineering design, multimedia publishing and
manufacturing systems.
• Data on the Web and E-commerce Applications
– The Web contains data in HTML (HyperText Markup Language) with links among pages.
– This has given rise to a new set of applications, and E-commerce is using new standards like
XML (eXtensible Markup Language).
When not to use a DBMS
Data Model
A set of concepts to describe the structure of a database, and certain constraints that the database
should obey.
• Database Schema : The description of a database. Includes descriptions of the database structure
and the constraints that should hold on the database.
Figure 3: The three-schema architecture
Mappings among schema levels are needed to transform requests and data. Programs refer to an
external schema, and the DBMS maps their requests to the internal schema for execution.
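A rough sketch of this mapping, using Python's built-in sqlite3 module: a view plays the role of an external schema, the base table stands for the conceptual schema, and an index loosely stands in for a storage-level choice. The table, view and index names are assumptions made for the example, and SQLite itself does not label these levels explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: the full logical structure.
    CREATE TABLE STUDENT (
        StudentNumber INTEGER PRIMARY KEY,
        Name          TEXT NOT NULL,
        Class         INTEGER,
        Major         TEXT
    );
    -- Storage-level choice: an access path that programs never name.
    CREATE INDEX idx_student_major ON STUDENT (Major);
    -- External level: a view exposing only what one user group needs.
    CREATE VIEW CS_STUDENT AS
        SELECT StudentNumber, Name FROM STUDENT WHERE Major = 'CS';
    INSERT INTO STUDENT VALUES (17, 'Smith', 1, 'CS');
    INSERT INTO STUDENT VALUES (8, 'Brown', 2, 'MATH');
""")


def cs_student_names(conn):
    """A program written against the external schema (the view) only."""
    rows = conn.execute("SELECT Name FROM CS_STUDENT ORDER BY Name")
    return [row[0] for row in rows]


names_before = cs_student_names(conn)
conn.execute("DROP INDEX idx_student_major")  # change the storage-level structure
names_after = cs_student_names(conn)
```

The program returns the same result before and after the index is dropped, since it refers only to the view and the DBMS resolves the request against whatever storage structures currently exist.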
1.9 Data Independence
• Data Definition Language (DDL): Used to specify the conceptual schema of a database. In many
DBMSs, the DDL is also used to define internal and external schemas (views).
• Storage definition language (SDL): In some DBMSs, a separate language used to specify the internal schema.
• View definition language (VDL): In some DBMSs, a separate language used to define external schemas (user views).
• Data Manipulation Language (DML): Used to specify database retrievals and updates.
– DML commands (data sublanguage) can be embedded in a general-purpose programming
language (host language), such as COBOL, C or an Assembly Language.
– Alternatively, stand-alone DML commands can be applied directly (query language).
• High Level or Non-procedural Languages : e.g., SQL, are set-oriented and specify what data to
retrieve rather than how to retrieve it. Also called declarative languages.
• Low Level or Procedural Languages : record-at-a-time; they specify how to retrieve data and include
constructs such as looping.
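The difference between the two styles can be sketched with Python as the host language and SQL embedded as the data sublanguage, via the built-in sqlite3 module. The STUDENT table and its rows are assumptions made for the example; the same question is answered once declaratively and once record-at-a-time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# DDL: define the structure.
conn.execute(
    "CREATE TABLE STUDENT (StudentNumber INTEGER PRIMARY KEY, Name TEXT, Major TEXT)"
)
# DML (update): load some hypothetical rows.
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [(17, "Smith", "CS"), (8, "Brown", "CS"), (23, "Lee", "MATH")],
)

# Declarative, set-oriented: state WHAT is wanted; the DBMS decides how.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM STUDENT WHERE Major = ? ORDER BY Name", ("CS",)
    )
]

# Procedural, record-at-a-time: the program spells out HOW, with an explicit loop.
procedural = []
cursor = conn.execute("SELECT Name, Major FROM STUDENT")
while True:
    record = cursor.fetchone()  # fetch one record per iteration
    if record is None:
        break
    name, major = record
    if major == "CS":
        procedural.append(name)
procedural.sort()
```

Both lists contain the same names. In the first form the filtering and ordering are left to the DBMS, which is what makes query optimization possible; in the second the host program performs them itself.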
DBMS Interfaces
• Menu-based Interfaces
• Form-based Interfaces
Figure 4: Component modules of a DBMS and their interactions
DBMS Components
– Loading data stored in files into a database. Includes data conversion tools.
– Backing up the database periodically on tape.
– Reorganizing database file structures.
– Report generation utilities.
– Performance monitoring utilities.
– Other functions, such as sorting, user monitoring, data compression, etc.
Other Tools
• Based on cost
– Open source (free) vs. Commercial
File: Lectures 2014.tex Date: Friday 24th January, 2014 3:31pm Revision: 0.3