Intro Dbms
What is a Database?
A structured collection of related data
A filing cabinet, an address book, a telephone directory, a timetable, etc.
In Access, your database is your collection of related tables
A database is a storage space for content / information (data)
But what is data? And where is it now?
Data is factual information about objects and concepts, such as:
Measurements, statistics
Data - a collection of facts made up of text, numbers and dates:
Murray 35000 7/18/86
Information - the meaning given to data in the way it is interpreted:
Mr. Murray is a salesperson whose annual salary is $35,000 and whose hire date is July 18, 1986.
Data -> Field -> Record -> Table
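The step from data to information, and the Data -> Field -> Record -> Table chain, can be sketched in a few lines of Python. This is only an illustration: the field names `last_name`, `annual_salary` and `hire_date` and the helper `describe` are hypothetical, chosen to match the Murray example.

```python
from datetime import date

# Raw data: three facts with no stated meaning (values from the example above).
raw = ("Murray", 35000, date(1986, 7, 18))

# Field names (hypothetical) attach a meaning to each value.
fields = ("last_name", "annual_salary", "hire_date")

# A record is one set of field values; a table is a collection of such records.
record = dict(zip(fields, raw))
table = [record]


def describe(rec):
    """Interpret one record as a sentence, i.e. turn data into information."""
    return (f"Mr. {rec['last_name']} earns ${rec['annual_salary']:,} a year "
            f"and was hired on {rec['hire_date'].isoformat()}.")


print(describe(table[0]))
```

The same three values could be interpreted differently under other field names, which is why the field definitions matter as much as the values.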
Managing as re-organising
We often need to access and re-sort data for various uses. These may include:
Creating mailing lists
Writing management reports
Generating lists of selected news stories
Identifying various client needs
Managing as re-processing
The processing power of a database allows it to:
Sort
Match
Link
Aggregate
Skip fields
Calculate
Arrange
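A few of the operations listed above (sort, match, aggregate, calculate) can be tried with Python's built-in `sqlite3` module. The sketch below is illustrative only: the `sale` table, its columns and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (client TEXT, city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?)",
    [("Ames", "Rye", 120.0), ("Boyd", "Harrison", 80.0), ("Ames", "Rye", 50.0)],
)

# Sort: order the rows by amount, largest first.
sorted_rows = conn.execute(
    "SELECT client, amount FROM sale ORDER BY amount DESC"
).fetchall()

# Match: keep only the rows that satisfy a condition.
rye_rows = conn.execute(
    "SELECT client, amount FROM sale WHERE city = ? ORDER BY amount", ("Rye",)
).fetchall()

# Aggregate and calculate: one total per client.
totals = conn.execute(
    "SELECT client, SUM(amount) FROM sale GROUP BY client ORDER BY client"
).fetchall()

print(sorted_rows)
print(rye_rows)
print(totals)
```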
Collection of interrelated data
Set of programs to access the data
DBMS contains information about a particular enterprise
DBMS provides an environment that is both convenient and efficient to use.
Database Applications:
Banking: all transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions
Databases touch all aspects of our lives
Multiple file formats, duplication of information in different files
Need to write a new program to carry out each new task
Integrity problems
Integrity constraints (e.g. account balance > 0) become part of program code
Hard to add new constraints or change existing ones
Failures may leave database in an inconsistent state with partial updates carried out
E.g. transfer of funds from one account to another should either complete or not happen at all
Concurrent access needed for performance
Uncontrolled concurrent accesses can lead to inconsistencies
E.g. two people reading a balance and updating it at the same time
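How a DBMS addresses two of these problems can be sketched with `sqlite3`: the integrity constraint (balance > 0) is declared once in the schema instead of in program code, and the funds transfer runs as a transaction that either completes or is rolled back. The table layout, the starting balances and the `transfer` helper are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint lives in the schema, not in each application program.
conn.execute(
    "CREATE TABLE account ("
    " number TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("A-101", 500), ("A-215", 700)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return False if the transfer was undone."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE number = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE number = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit above was rolled back too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT number, balance FROM account"))


failed = transfer(conn, "A-101", "A-215", 600)   # would leave A-101 below zero
after_failed = balances(conn)
ok = transfer(conn, "A-101", "A-215", 200)
after_ok = balances(conn)
print(failed, after_failed, ok, after_ok)
```

The credit is deliberately applied before the debit so that the failing case leaves a partial update behind, which the rollback then removes.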
Security problems
problems
In short, a Database Management System (DBMS) is a set of computer programs that controls the creation, maintenance, and use of a database. It allows organizations to place control of database development in the hands of database administrators (DBAs) and other specialists. A DBMS is a system software package that supports the use of an integrated collection of data records and files known as a database. It allows different user application programs to easily access the same database. DBMSs may use any of a variety of database models, such as the network model or the relational model. In large systems, a DBMS allows users and other software to store and retrieve data in a structured way. Instead of having to write computer programs to extract information, users can ask simple questions in a query language. Thus, many DBMS packages provide fourth-generation programming languages (4GLs) and other application development features. A DBMS helps to specify the logical organization of a database and to access and use the information within it. It provides facilities for controlling data access, enforcing data integrity, managing concurrency, and restoring the database from backups. A DBMS also provides the ability to logically present database information to users.
customer) is stored.
Logical level: describes data stored in database, and the relationships among the data.
    type customer = record
        name : string;
        street : string;
        city : integer;
    end;
View level: application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes.
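The view level can be illustrated with a SQL view that exposes only some columns of a table. The sketch below uses `sqlite3`; the `employee` table, its columns and the view name `employee_public` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full description of the stored data.
conn.execute("CREATE TABLE employee (name TEXT, city TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES ('Murray', 'Rye', 35000)")

# View level: a view that leaves the salary column out.
conn.execute("CREATE VIEW employee_public AS SELECT name, city FROM employee")

cursor = conn.execute("SELECT * FROM employee_public")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)
print(view_rows)
```

A program that only queries `employee_public` never sees the salary; in a multi-user DBMS this is typically combined with access permissions on the underlying table, which SQLite itself does not provide.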
View of Data
An architecture for a database system
External level: what data do users and application programs see?
Conceptual level: what data is stored? Describes data properties such as data semantics and data relationships.
Internal level: how is the data actually stored? E.g. are we using disks? Which file system?
DBMS Languages
Data Definition Language (DDL): Used by the DBA and database designers to specify the conceptual schema of a database. In many DBMSs, the DDL is also used to define internal and external schemas (views). In some DBMSs, separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas.
DDL Compiler
Data Manipulation Language (DML): Used to specify database retrievals and updates (insertion, deletion, modifications)
- DML commands (data sublanguage) can be embedded in a general-purpose programming language (host language).
- Alternatively, stand-alone DML commands can be applied directly (query language).
Also called record-at-a-time (record-oriented) or low-level DML
Must be embedded in a programming language.
Searches for and retrieves individual database records and uses looping and other constructs of the host programming language to retrieve multiple records.
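The roles of DDL, DML and a host language can be sketched with Python as the host language and SQL issued through `sqlite3`. The `student` table and its rows are invented for the example, and the loop only imitates the record-at-a-time style: a real low-level DML would use its own navigation commands rather than a SQL cursor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the data.
conn.execute("CREATE TABLE student (name TEXT, major TEXT, year INTEGER)")

# DML: insert rows (DML also covers deletion and modification).
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Smith", "CS", 1), ("Brown", "CS", 2), ("Jones", "Math", 2)],
)

# Record-at-a-time style: the host language fetches one record per call
# and uses its own loop and tests to collect several records.
cursor = conn.execute("SELECT name, major, year FROM student ORDER BY name")
cs_students = []
while True:
    row = cursor.fetchone()
    if row is None:          # no more records
        break
    if row[1] == "CS":
        cs_students.append(row[0])

print(cs_students)
```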
Data Models
A collection of tools for describing
data
data relationships
data semantics
data constraints
Entity-Relationship model
Relational model
Other models:
object-oriented model
semi-structured data models
Older models: network model and hierarchical model
Entity-Relationship Model
Example of schema in the entity-relationship model
E.g. customers, accounts, bank branch
E.g. Account A-101 is held by customer Johnson
Relationship set depositor associates customers with accounts
Widely used for database design
Database design in E-R model usually converted to design in the relational model (coming up next), which is used for storage and processing
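One common way to convert this E-R design to the relational model is to give each entity set its own table and to turn the relationship set depositor into a table of key pairs. The sketch below assumes column names, a city for Johnson and a balance for A-101 that are not given in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Entity sets become tables.
    CREATE TABLE customer (name TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE account  (number TEXT PRIMARY KEY, balance INTEGER);

    -- The relationship set depositor becomes a table of (customer, account) pairs.
    CREATE TABLE depositor (
        customer_name  TEXT REFERENCES customer(name),
        account_number TEXT REFERENCES account(number),
        PRIMARY KEY (customer_name, account_number)
    );

    INSERT INTO customer  VALUES ('Johnson', 'Palo Alto');  -- city is assumed
    INSERT INTO account   VALUES ('A-101', 500);            -- balance is assumed
    INSERT INTO depositor VALUES ('Johnson', 'A-101');
    """
)

# "Account A-101 is held by customer Johnson" is recovered with a join.
holders = conn.execute(
    "SELECT c.name FROM customer AS c "
    "JOIN depositor AS d ON d.customer_name = c.name "
    "WHERE d.account_number = ?",
    ("A-101",),
).fetchall()

print(holders)
```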
Relational Model
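[Figure: a sample relation in the relational model. The columns are labelled as attributes (e.g. accountnumber); surviving values include account numbers A-101, A-215, A-201, A-217, cities Palo Alto, Rye, Harrison, and the customer name Smith.]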
proposed for implementing in a database system.
One set comprises models of persistent O-O Programming Languages such as C++ (e.g., in OBJECTSTORE or VERSANT), and Smalltalk (e.g., in GEMSTONE).
Additionally, systems like O2, ORION (at MCC - then ITASCA), IRIS (at H.P. - used in Open OODB).
Object-Relational Models: Most Recent Trend. Started with Informix Universal Server. Exemplified in the latest versions of Oracle-10i, DB2, and SQL Server etc. systems.
Hierarchical Model
ADVANTAGES:
Hierarchical Model is simple to construct and operate on
Corresponds to a number of natural hierarchically organized domains - e.g., assemblies in manufacturing, personnel organization in companies
Language is simple; uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT etc.
DISADVANTAGES:
Navigational and procedural nature of processing
Database is visualized as a linear arrangement of records
Little scope for "query optimization"
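The idea of a hierarchy being visited as a linear arrangement of records can be sketched as a parent-first (pre-order) walk over a tree. This is only an analogy in Python, not the hierarchical DML itself: the `Record` class, the `hierarchical_sequence` function and the car assembly data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    name: str
    children: list = field(default_factory=list)


def hierarchical_sequence(root):
    """Yield each record before its children, left to right (pre-order)."""
    yield root
    for child in root.children:
        yield from hierarchical_sequence(child)


# A small manufacturing assembly (made-up data).
engine = Record("Engine", [Record("Piston"), Record("Valve")])
car = Record("Car", [engine, Record("Wheel")])

# Stepping through the sequence one record at a time is the navigational,
# procedural style noted above: the program decides how to walk the data.
names = [record.name for record in hierarchical_sequence(car)]
print(names)
```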
represents semantics of add/delete on the relationships.
Can handle most situations for modeling using record types and relationship types.
Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET etc.
Programmers can do optimal navigation through the database.
DISADVANTAGES:
Navigational and procedural nature of processing
Database contains a complex array of pointers that thread
Data Independence
Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs.
E.g. expanding the database by adding a record type or data item, or reducing it by removing one.
Physical Data Independence: The capacity to change the internal schema without having to change the conceptual schema.
E.g. reorganizing physical files to improve the performance of a query such as "List all sections offered in Fall 1998".
When a schema at a lower level is changed, only the mappings between this schema and higher-level schemas need to be changed in a DBMS that fully supports data independence. The higher-level schemas themselves are unchanged. Hence, the application programs need not be changed since they refer to the external schemas.
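Both kinds of data independence can be tried out on a small scale with `sqlite3`. In the sketch below an application queries only a view (standing in for an external schema); the query keeps returning the same rows after an index is added (a physical change) and after a column is added to the base table (a conceptual-schema change). The `section` table, the view `fall_1998_sections` and the course codes are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE section (course TEXT, term TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO section VALUES (?, ?, ?)",
    [("CS101", "Fall", 1998), ("CS201", "Spring", 1999), ("MA110", "Fall", 1998)],
)

# External schema: the application only ever refers to this view.
conn.execute(
    "CREATE VIEW fall_1998_sections AS "
    "SELECT course FROM section WHERE term = 'Fall' AND year = 1998"
)
APP_QUERY = "SELECT course FROM fall_1998_sections ORDER BY course"

before = conn.execute(APP_QUERY).fetchall()

# Physical change: add an index to speed up the lookup.
conn.execute("CREATE INDEX idx_section_term_year ON section (term, year)")
after_index = conn.execute(APP_QUERY).fetchall()

# Conceptual change: expand the table with a new data item.
conn.execute("ALTER TABLE section ADD COLUMN room TEXT")
after_column = conn.execute(APP_QUERY).fetchall()

print(before, after_index, after_column)
```

The application query string never changes; only the definitions underneath it do.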
Disadvantages of two levels of mappings: Overhead during compilation or execution of a query or program
Components of DBMS
DBMS Engine accepts logical requests from the various other DBMS subsystems, converts them into physical equivalents, and actually accesses the database and data dictionary as they exist on a storage device.
Data Definition Subsystem helps users to create and maintain the data dictionary and define the structure of the files in a database.
Data Manipulation Subsystem helps users to add, change, and delete information in a database and query it for valuable information. Software tools within the data manipulation subsystem are most often the primary interface between the user and the information contained in a database. It allows users to specify their logical information requirements.
Application Generation Subsystem contains facilities to help users develop transaction-intensive applications, which usually require the user to perform a detailed series of tasks to process a transaction. It provides easy-to-use data entry screens, programming languages, and interfaces.
Data Administration Subsystem helps users to manage the overall database environment by providing facilities for backup and recovery, security management, query optimization, concurrency control, and change management.
[Figure: Components of a DBMS (diagram labels include disk storage)]
Teleprocessing
Traditional architecture.
Single mainframe with a number of terminals attached.
Trend is now towards downsizing.
Teleprocessing Topology
File-Server
File-server is connected to several workstations across a network.
workstation.
Disadvantages include:
Significant network traffic.
Copy of DBMS on each workstation.
Concurrency, recovery and integrity control more complex.
File-Server Architecture
Client-Server
Server holds the database and the DBMS.
Client manages user interface and runs
wider access to existing databases; increased performance; possible reduction in hardware costs; reduction in communication costs; increased consistency.
Client-Server Architecture
and servers in order to provide a consistent environment, particularly for Online Transaction Processing (OLTP).
System Catalog
Repository of information (metadata) describing
dictionary interfaces.
Objectives: extensibility of data; integrity of data; controlled access to data.
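The idea of a system catalog holding metadata can be seen in SQLite, which keeps its catalog in a built-in table named `sqlite_master` and reports column details through `PRAGMA table_info`. The sketch below reads both; the `account` table and the `account_numbers` view are made up for the example, and other DBMSs expose their catalogs through different interfaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE VIEW account_numbers AS SELECT number FROM account")

# The catalog is itself queried like data: one row per schema object.
# Names starting with 'sqlite_' are internal objects and are filtered out.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master "
    "WHERE name NOT LIKE 'sqlite_%' ORDER BY name"
).fetchall()

# Column-level metadata: each row of table_info has the column name at
# position 1 and the declared type at position 2.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]

print(catalog)
print(columns)
```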