Introduction to Database Systems
1
Agenda
2
Who am I?
- Profile -
33 years of experience in the Information Technology Industry, including twelve years of experience working
for leading IT consulting firms such as Computer Sciences Corporation
PhD in Computer Science from University of Colorado at Boulder
Past CEO and CTO
Held senior management and technical leadership roles in many large IT Strategy and Modernization
projects for Fortune 500 corporations in the insurance, banking, investment banking, pharmaceutical, retail,
and information management industries
Contributed to several high-profile ARPA and NSF research projects
Played an active role as a member of the OMG, ODMG, and X3H2 standards committees and as a
Professor of Computer Science at Columbia initially and New York University since 1997
Proven record of delivering business solutions on time and on budget
Original designer and developer of jcrew.com and the suite of products now known as IBM InfoSphere
DataStage
Creator of the Enterprise Architecture Management Framework (EAMF) and main contributor to the creation
of various maturity assessment methodologies
Developed partnerships between several companies and New York University to incubate new
methodologies (e.g., EA maturity assessment methodology), develop proof of concept software, recruit
skilled graduates, and increase the companies’ visibility
3
How to reach me?
Email [email protected]
MSN IM [email protected]
LinkedIn http://www.linkedin.com/in/jcfranchitti
Twitter http://twitter.com/jcfranchitti
Woo hoo…find the word of the day…
Skype [email protected]
4
What is the class about?
Textbooks:
» Fundamentals of Database Systems (7th Edition)
Ramez Elmasri and Shamkant Navathe
Addison-Wesley
ISBN-10: 0133970779, ISBN-13: 978-0133970777 - 7th Edition (06/18/15)
5
Icons / Metaphors
Information
Common Realization
Knowledge/Competency Pattern
Governance
Alignment
Solution Approach
6
Course Objectives
7
Key Material Covered (1/2)
8
Key Material Covered (2/2)
Physical design of the database using various file organization and
indexing techniques for efficient query processing
Concurrency Control
Recovery
Query execution
Data warehouses
Additional topics may be covered as time allows. These topics are
covered in greater depth in other courses, but PowerPoint presentations
for them will still be provided
The course material is partially derived from the textbook slides and
material covered as part of the Database Systems course offered at
NYU Courant in previous semesters
9
Software Requirements
10
Agenda
11
Section Outline
12
Types of Databases and Database Applications
13
Recent Developments
“Paradigm Shifts”
http://vimeo.com/103246683
https://www.youtube.com/watch?v=wOXWSg_PyTQ
https://www.youtube.com/watch?v=Aesl6HeiwOg
https://www.youtube.com/watch?v=x0iRj8_9KhA&index=2&list=PLBCCA5F25EF30184C
15
Basic Definitions
Database
Collection of related data
Logically coherent collection of data with inherent meaning
Built for a specific purpose
Data
Known facts that can be recorded and have an implicit meaning
Mini-world or Universe of Discourse (UoD)
Represents some aspect of the real world about which data is stored in a database
(e.g., student grades and transcripts at a university)
Example of a large commercial database
Amazon.com
Database management system (DBMS)
» A software package/system that facilitates the creation and maintenance of a
computerized database
Database system
The DBMS software together with the data itself. Sometimes, the applications are
also included
16
Impact of Databases and Database Technology
Businesses:
» Banking, Insurance, Retail, Transportation, Healthcare,
Manufacturing
Service Industries:
» Financial, Real-estate, Legal, Electronic Commerce,
Small businesses
Education:
» Resources for content and Delivery
More recently:
» Social Networks, Environmental and Scientific
Applications, Medicine and Genetics
Personalized Applications:
» Based on smart mobile devices
17
Simplified Database System Environment
18
Typical DBMS Functionality
19
Application Activities Against a Database
20
Additional DBMS Functionality
21
Example of a Database (with a Conceptual Data Model)
22
Example of a Database (with a Conceptual Data Model)
23
Example of a Simple Database
24
Main Characteristics of the Database Approach
25
Example of a Simplified Database Catalog
26
Main Characteristics of the Database Approach (continued)
Data Abstraction:
» A data model is used to hide storage details and
present the users with a conceptual view of the
database.
» Programs refer to the data model constructs
rather than data storage details
Support of multiple views of the data:
» Each user may see a different view of the
database, which describes only the data of
interest to that user.
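The idea of multiple views over one stored database can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, its columns, and the two view names are assumptions made for the example and do not come from the slides.

```python
import sqlite3

# Hypothetical mini-world: table, column, and view names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        major          TEXT,
        gpa            REAL
    );
    INSERT INTO student VALUES (17, 'Smith', 'CS', 3.2);
    INSERT INTO student VALUES (8,  'Brown', 'CS', 3.8);

    -- View for a user who only needs a directory; the GPA column is hidden.
    CREATE VIEW student_directory AS
        SELECT student_number, name, major FROM student;

    -- View for a user interested only in students with a high GPA.
    CREATE VIEW honor_roll AS
        SELECT name, gpa FROM student WHERE gpa >= 3.5;
""")

# Each program queries its view by name and never refers to storage details.
directory_rows = conn.execute(
    "SELECT * FROM student_directory ORDER BY student_number"
).fetchall()
honor_rows = conn.execute("SELECT * FROM honor_roll").fetchall()

print(directory_rows)
print(honor_rows)
```

Both views are derived from the same stored table, so each user sees only the data of interest while the data itself is stored once.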
27
Main Characteristics of the Database Approach (continued)
28
Database Users
29
Database Users – Actors on the Scene
30
Database End Users
31
Database End Users (continued)
• Sophisticated:
– These include business analysts, scientists, engineers,
others thoroughly familiar with the system capabilities.
– Many use tools in the form of software packages that work
closely with the stored database.
• Stand-alone:
– Mostly maintain personal databases using ready-to-use
packaged applications.
– An example is the user of a tax program that creates its
own internal database.
– Another example is a user that maintains a database of
personal photos and videos.
32
Database Users – Actors on the Scene (continued)
33
Database Users – Actors Behind the Scene
34
Advantages of Using the Database Approach
35
Advantages of Using the Database Approach (continued)
36
Additional Implications of Using the Database Approach
37
Additional Implications of Using the Database Approach (continued)
39
Historical Development of Database Technology (continued)
40
Historical Development of Database Technology (continued)
41
Extending Database Capabilities
43
Extending Database Capabilities (continued)
» NOSQL (Not Only SQL, where SQL is the de facto standard language for
relational DBMSs) systems have been designed for rapid search and
retrieval from documents, for processing the huge graphs that occur in social
networks, and for other forms of unstructured data, with flexible models of
transaction processing.
44
When Not to Use a DBMS
45
When Not to Use a DBMS (continued)
46
Summary
47
Agenda
48
Section Outline
49
Data Models
Data Model:
» A set of concepts to describe the structure of a database, the
operations for manipulating these structures, and certain
constraints that the database should obey.
Data Model Structure and Constraints:
» Constructs are used to define the database structure
» Constructs typically include elements (and their data types) as
well as groups of elements (e.g. entity, record, table), and
relationships among such groups
» Constraints specify some restrictions on valid data; these
constraints must be enforced at all times
Data Model Operations:
» These operations are used for specifying database retrievals and updates
by referring to the constructs of the data model.
» Operations on the data model may include basic model operations (e.g.
generic insert, delete, update) and user-defined operations (e.g.
compute_student_gpa, update_inventory)
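The three ingredients of a data model (constructs, constraints, operations) can be sketched with Python's built-in sqlite3 module. The table layout, the column names, and the unweighted-average definition of GPA are assumptions made for this example; only the operation name compute_student_gpa is taken from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Constructs: elements with data types, grouped into tables, plus a
# relationship (grade_report refers to student). Names are hypothetical.
conn.executescript("""
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        name           TEXT NOT NULL
    );
    CREATE TABLE grade_report (
        student_number INTEGER NOT NULL REFERENCES student(student_number),
        course_number  TEXT NOT NULL,
        -- Constraint: a restriction on valid data, checked on every change.
        grade_points   REAL NOT NULL CHECK (grade_points BETWEEN 0.0 AND 4.0),
        PRIMARY KEY (student_number, course_number)
    );
""")

# Basic model operations: generic inserts.
conn.execute("INSERT INTO student VALUES (17, 'Smith')")
conn.executemany(
    "INSERT INTO grade_report VALUES (?, ?, ?)",
    [(17, "CS1310", 3.0), (17, "MATH2410", 4.0)],
)


def compute_student_gpa(conn, student_number):
    """User-defined operation: unweighted average of grade points (an assumption)."""
    row = conn.execute(
        "SELECT AVG(grade_points) FROM grade_report WHERE student_number = ?",
        (student_number,),
    ).fetchone()
    return row[0]


gpa = compute_student_gpa(conn, 17)

# An insert that violates the CHECK constraint is rejected by the DBMS.
try:
    conn.execute("INSERT INTO grade_report VALUES (17, 'CS3320', 5.0)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True

print(gpa, constraint_enforced)
```

The user-defined operation is written entirely in terms of the model's constructs (tables and columns), not in terms of how the rows are stored.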
50
Categories of Data Models
Database Schema:
» The description of a database.
» Includes descriptions of the database structure, data types, and
the constraints on the database.
Schema Diagram:
» An illustrative display of (most aspects of) a database schema.
Schema Construct:
» A component of the schema or an object within the schema, e.g.,
STUDENT, COURSE.
Database State:
» The actual data stored in a database at a particular moment in
time. This includes the collection of all the data in the database.
» Also called database instance (or occurrence or snapshot).
• The term instance is also applied to individual database components,
e.g. record instance, table instance, entity instance
52
Database Schema vs. Database State
Database State:
» Refers to the content of a database at a moment in time.
Initial Database State:
» Refers to the database state when it is initially loaded into the
system.
Valid State:
» A state that satisfies the structure and constraints of the database
Distinction
» The database schema changes very infrequently.
» The database state changes every time the database is updated.
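The distinction between schema and state can be observed directly in a small sqlite3 session. This is a minimal sketch: the course table and the helper functions read_schema and read_state are hypothetical names introduced for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course ("
    " course_number TEXT PRIMARY KEY,"
    " credit_hours INTEGER NOT NULL CHECK (credit_hours > 0))"
)


def read_schema(conn):
    """Return the stored table definitions (the description of the database)."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]


def read_state(conn):
    """Return the data held in the course table at this moment."""
    return conn.execute(
        "SELECT course_number, credit_hours FROM course ORDER BY course_number"
    ).fetchall()


schema_before = read_schema(conn)
empty_state = read_state(conn)  # no rows yet: the schema exists, the data does not

# Loading data changes the state but not the schema.
conn.execute("INSERT INTO course VALUES ('CS1310', 4)")
conn.execute("INSERT INTO course VALUES ('MATH2410', 3)")
loaded_state = read_state(conn)

# An update that would produce an invalid state is rejected, so the
# database stays in a valid state.
try:
    conn.execute("UPDATE course SET credit_hours = 0 WHERE course_number = 'CS1310'")
except sqlite3.IntegrityError:
    pass

final_state = read_state(conn)
schema_after = read_schema(conn)

print(empty_state, loaded_state, schema_before == schema_after)
```

Every insert produced a new state, while the schema read before and after the updates is identical.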
53
Example of a Database Schema
54
Example of a Database State
55
Three-Schema Architecture
56
The Three-Schema Architecture
57
Three-Schema Architecture
58
Data Independence
59
DBMS Languages
60
DBMS Languages (continued)
61
Types of DML
62
DBMS Interfaces
63
DBMS Programming Language Interfaces
64
User-Friendly DBMS Interfaces
65
Other DBMS Interfaces
66
Database Systems Utilities
67
Other Tools
68
The Database System Environment
69
Typical DBMS Component Modules
70
Centralized and Client-Server DBMS Architectures
Centralized DBMS:
» Combines everything into a single system,
including DBMS software, hardware,
application programs, and user interface
processing software.
» Users can still connect through a remote
terminal; however, all processing is done at the
centralized site.
71
A Physical Centralized DBMS Architecture
72
Basic 2-tier Client / Server Architectures
73
Logical and Physical Two-Tier Client / Server Architecture
74
DBMS Clients
75
DBMS Server
76
Two Tier Client-Server Architecture
77
Three Tier Client-Server Architecture
79
Classification of Database Management Systems
80
Variations of Distributed DBMSs (DDBMSs)
Homogeneous DDBMS
Heterogeneous DDBMS
Federated or Multidatabase Systems
» Participating databases are loosely coupled
with a high degree of autonomy.
Distributed Database Systems have now
come to be known as client-server based
database systems because:
» They do not support a totally distributed
environment, but rather a set of database
servers supporting a set of clients.
81
Cost Considerations for DBMSs
82
Other Considerations
83
History of Data Models (Additional Material)
Network Model
Hierarchical Model
Relational Model
Object-oriented Data Models
Object-Relational Models
84
History of Data Models – Network Model
Network Model:
» The first network DBMS was implemented by
Honeywell in 1964-65 (IDS System).
» Adopted heavily due to the support by
CODASYL (Conference on Data Systems
Languages) (CODASYL - DBTG report of
1971).
» Later implemented in a large variety of systems:
IDMS (Cullinet, now Computer Associates),
DMS 1100 (Unisys), IMAGE (H.P. (Hewlett-Packard)),
VAX-DBMS (Digital Equipment Corp., next COMPAQ,
now H.P.).
85
History of Data Models – Network Model
Advantages:
» Network Model is able to model complex relationships
and represents semantics of add/delete on the
relationships.
» Can handle most situations for modeling using record
types and relationship types.
» Language is navigational; uses constructs like FIND,
FIND member, FIND owner, FIND NEXT within set,
GET, etc.
• Programmers can do optimal navigation through the database.
Disadvantages:
» Navigational and procedural nature of processing
» Database contains a complex array of pointers that
thread through a set of records.
• Little scope for automated “query optimization”
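The navigational style described above can be imitated in a few lines of Python. This is not CODASYL DML: the Record class and the functions find_first_member, find_next_within_set, and find_owner are hypothetical, named loosely after the commands listed on the slide, and the single-linked set is an assumed simplification of the pointer structures a network DBMS maintains.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    data: str
    owner: Optional["Record"] = None         # member -> owner pointer
    first_member: Optional["Record"] = None  # owner -> first member of its set
    next_in_set: Optional["Record"] = None   # member -> next member in the set


def connect(owner: Record, member: Record) -> None:
    """Append member to the end of the owner's set by following pointers."""
    member.owner = owner
    if owner.first_member is None:
        owner.first_member = member
        return
    current = owner.first_member
    while current.next_in_set is not None:
        current = current.next_in_set
    current.next_in_set = member


def find_first_member(owner: Record) -> Optional[Record]:
    return owner.first_member


def find_next_within_set(member: Record) -> Optional[Record]:
    return member.next_in_set


def find_owner(member: Record) -> Optional[Record]:
    return member.owner


# Hypothetical set: a department record owning three employee records.
department = Record("CS")
for name in ("Smith", "Brown", "Wong"):
    connect(department, Record(name))

# The program, not the DBMS, decides the access path one record at a time.
names = []
current = find_first_member(department)
while current is not None:
    names.append(current.data)
    current = find_next_within_set(current)

print(names)
```

Because the traversal order is written into the program itself, the system has little room to choose a different access path, which is the limitation on query optimization noted above.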
86
History of Data Models – Network Model
87
History of Data Models – Hierarchical Model
88
History of Data Models – Hierarchical Model
Advantages:
» Simple to construct and operate
» Corresponds to a number of natural hierarchically organized
domains, e.g., organization (“org”) chart
» Language is simple:
• Uses constructs like GET, GET UNIQUE, GET NEXT, GET
NEXT WITHIN PARENT, etc.
Disadvantages:
» Navigational and procedural nature of processing
» Database is visualized as a linear arrangement of records
» Little scope for "query optimization"
89
History of Data Models – Relational Model
Relational Model:
» Proposed in 1970 by E.F. Codd (IBM), first commercial
system in 1981-82.
» Now in several commercial products (e.g. DB2, ORACLE,
MS SQL Server, SYBASE, INFORMIX).
» Several free open source implementations, e.g. MySQL,
PostgreSQL
» Currently most dominant for developing database
applications.
» SQL relational standards: SQL-89 (SQL1), SQL-92 (SQL2),
SQL-99 (SQL3), …
90
History of Data Models – Object-Oriented Data Models
91
History of Data Models – Object-Relational Models
Object-Relational Models:
» The trend to mix object models with relational was
started with Informix Universal Server.
» Relational systems incorporated concepts from object
databases leading to object-relational.
» Exemplified in versions of Oracle, DB2, SQL
Server, and other DBMSs.
» Current trend by Relational DBMS vendors is to extend
relational DBMSs with capability to process XML, Text
and other data types.
» The term “Object-relational” is receding in the
marketplace.
92
Section Summary
93
Agenda
94
Course Assignments
Individual Assignments
Reports based on case studies / class presentations
Textbook problem sets
Project-Related Assignments
All assignments (other than the individual assessments) will
correspond to milestones in the course project
95
Assignments & Readings
Readings
» Slides and Handouts posted on the course web site
» Textbook: Chapters 1 & 2
96
Next Session: Relational Data Model and Relational Database Constraints
97
Any Questions?
98