Basics of Database System
Definition of Terms
Database:
o A logically coherent collection of related data that (i) describes the entities and their inter-relationships, and (ii) is designed, built & populated for a specific reason
Database Management System (DBMS):
o A collection of programs that enables users to perform certain actions on a particular database:
  - define the structure of database information (descriptive attributes, data types, constraints, etc.), storing this as metadata
  - populate the database with appropriate information
  - manipulate the database (for retrieval/update/removal/insertion of information)
  - protect the database contents against accidental or deliberate corruption (involves secure access by users and automatic recovery in the case of user/hardware faults)
  - share the database among multiple users, possibly concurrently
[Examples of DBMSs include Oracle, Sybase, MySQL, DB2, SQL Server, Informix, MS Access, FileMaker, ...]
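The define / populate / manipulate actions listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, its columns and its rows are hypothetical, and SQLite stands in for whichever DBMS is actually in use.

```python
import sqlite3

# In-memory SQLite database; the student table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")

# Define: structure, data types and constraints (kept by the DBMS as metadata).
conn.execute(
    """
    CREATE TABLE student (
        ssn  TEXT NOT NULL PRIMARY KEY,
        name TEXT NOT NULL,
        gpa  REAL
    )
    """
)

# Populate: load some initial rows.
conn.executemany(
    "INSERT INTO student (ssn, name, gpa) VALUES (?, ?, ?)",
    [("123456789", "Smith", 3.25), ("987654321", "Brown", 2.90)],
)

# Manipulate: update, remove and retrieve information.
conn.execute("UPDATE student SET gpa = 3.5 WHERE ssn = ?", ("123456789",))
conn.execute("DELETE FROM student WHERE ssn = ?", ("987654321",))
rows = conn.execute("SELECT ssn, name, gpa FROM student").fetchall()

# Metadata: SQLite records table definitions in its catalog table sqlite_master.
catalog = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(rows)     # [('123456789', 'Smith', 3.5)]
print(catalog)  # [('student',)]
```

Protection and concurrent sharing are not shown here; an in-memory, single-connection database has no other users to protect against or share with.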
Sample Databases
Shown below is an extract from a (relational) database that might be part of a University's Academic Information System:
Shown below is an extract from a (relational) database that might be part of a Consultancy Business Information System:
Terminology:
relation = table (file)
attribute = column (field)
tuple = row (record)
Concepts:
Domains:
o an attribute will always be chosen from some domain of valid values (e.g. SSN will be chosen from the valid 9-digit collection of Social Security Nos., GPA will be chosen from the 1.2-digit set of values)
Null Values:
o some attributes may be allowed to assume null (unspecified) values, typically where an attribute may be inapplicable or just unknown at the point of data entry
Row Ordering:
o the order of tuples within a relation is unimportant [in practice, it may be, since it can determine the order of returned data rows]
Column Ordering:
o the order of attributes within a relation is unimportant [in practice, it may be, since it can determine the order of returned data columns]
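A small sketch of domains, null values and ordering, again using SQLite through Python. The CHECK constraints are one possible way to approximate a domain; the 0.0 to 4.0 GPA range and all rows are assumptions made for the example, not values taken from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Domains approximated with CHECK constraints:
#   ssn must be exactly 9 characters, none of them a non-digit;
#   gpa, when present, must lie in an assumed 0.0-4.0 range.
# gpa has no NOT NULL constraint, so it may be left unknown (NULL).
conn.execute(
    """
    CREATE TABLE student (
        ssn  TEXT NOT NULL CHECK (length(ssn) = 9 AND ssn NOT GLOB '*[^0-9]*'),
        name TEXT NOT NULL,
        gpa  REAL CHECK (gpa BETWEEN 0.0 AND 4.0)
    )
    """
)

def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student (ssn, name, gpa) VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

accepted = [
    try_insert(("123456789", "Smith", 3.25)),  # inside both domains
    try_insert(("987654321", "Brown", None)),  # GPA unknown at data entry
    try_insert(("12345", "Jones", 3.0)),       # SSN outside its domain
    try_insert(("111223333", "Wong", 7.5)),    # GPA outside its domain
]

# Row order is only guaranteed when the query asks for it.
by_name = [r[0] for r in conn.execute("SELECT name FROM student ORDER BY name")]

# Column order is likewise chosen by the query, not by the relation.
gpa_first = conn.execute("SELECT gpa, name FROM student ORDER BY name").fetchall()

unknown_gpa = [r[0] for r in conn.execute("SELECT name FROM student WHERE gpa IS NULL")]

print(accepted)     # [True, True, False, False]
print(by_name)      # ['Brown', 'Smith']
print(gpa_first)    # [(None, 'Brown'), (3.25, 'Smith')]
print(unknown_gpa)  # ['Brown']
```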
Key:
o every relation should have an attribute (possibly composite) that behaves as a key: it is unique in value to each tuple, and can therefore act as an identifier for a particular tuple; e.g., in the STUDENT table immediately above, SSN acts as the key attribute; in the GRADE_REPORT relation seen earlier, the key is the composite attribute (StudentNumber, SectionIdentifier)
o note that combining the key with any collection of other attributes also guarantees uniqueness; e.g., in the STUDENT table, the combination (SSN, Name), or any other combination involving SSN, can act as identifier; we call these superkeys and define a key as a minimal collection of attributes that guarantees uniqueness (a minimal superkey)
o it may sometimes occur that we have more than one choice of key attribute within a relation; e.g., in the COMPANY database seen at the start of this lecture, we may have knowledge that both department numbers (DNUMBER) and names (DNAME) are unique; we declare these to be candidate keys and generally choose one to be the primary key (DNUMBER, say) and the others to be alternate keys
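The distinction between superkeys and (minimal) keys can be checked mechanically on sample data. The sketch below uses hypothetical department rows and two illustrative helper functions, is_superkey and candidate_keys. Note the limitation: a key is a rule about every legal state of the relation, so a sample can show that an attribute set is not a key, but it can only suggest that one is.

```python
from itertools import combinations

# Hypothetical DEPARTMENT rows; dnumber and dname both happen to be unique here.
rows = [
    {"dnumber": 1, "dname": "Headquarters", "location": "Houston"},
    {"dnumber": 4, "dname": "Administration", "location": "Stafford"},
    {"dnumber": 5, "dname": "Research", "location": "Houston"},
]

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal superkeys: superkeys containing no smaller superkey."""
    attrs = sorted(rows[0])
    keys = []
    # Try attribute sets from smallest to largest, so any superkey that
    # contains an already-found key is known not to be minimal.
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            contains_key = any(set(k) <= set(combo) for k in keys)
            if not contains_key and is_superkey(rows, combo):
                keys.append(combo)
    return keys

print(is_superkey(rows, ("dnumber", "location")))  # True: a superkey, but not minimal
print(is_superkey(rows, ("location",)))            # False: 'Houston' repeats
print(candidate_keys(rows))                        # [('dname',), ('dnumber',)]
```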
Key Integrity (Entity Constraints):
o no component of a candidate key should be allowed to accept null values; no duplicate key values should be allowed
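A minimal sketch of how a DBMS might enforce key integrity, using SQLite and the composite key of GRADE_REPORT. The column names and rows are illustrative. NOT NULL is written out on each key component because SQLite does not always imply it from PRIMARY KEY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Composite primary key; each component is explicitly declared NOT NULL.
conn.execute(
    """
    CREATE TABLE grade_report (
        student_number     INTEGER NOT NULL,
        section_identifier INTEGER NOT NULL,
        grade              TEXT,
        PRIMARY KEY (student_number, section_identifier)
    )
    """
)

def try_insert(row):
    """Return 'ok' if the row is stored, 'rejected' if a constraint fails."""
    try:
        conn.execute("INSERT INTO grade_report VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

outcomes = [
    try_insert((17, 112, "B")),   # first row for this key value
    try_insert((17, 119, "C")),   # same student, different section: a new key value
    try_insert((17, 112, "A")),   # duplicate key value
    try_insert((None, 85, "A")),  # null in a key component
]

print(outcomes)  # ['ok', 'ok', 'rejected', 'rejected']
```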
Foreign Key:
o if a relation R1(..., X, ...) contains an attribute X that acts as a key to another relation R2(..., X, ...), then we say that R1 references R2; attribute X is said to be a foreign key within R1; we can depict all such references & foreign keys within a database by incorporating them into the schema, as outlined below for the COMPANY database:
Referential Integrity Constraints:
o if a relation R1(..., X, ...) references another relation R2(..., X, ...) on attribute X, then every value of X appearing in R1 must also appear in R2
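Foreign keys and referential integrity can be illustrated with two small tables in SQLite. The employee and department tables and their rows are hypothetical stand-ins for the COMPANY schema. SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on. Like SQL in general, it accepts a null foreign key, since null is the absence of a value rather than a value missing from R2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

# R2: the referenced relation, keyed on dnumber.
conn.execute(
    "CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT NOT NULL)"
)
# R1: the referencing relation; dno is a foreign key to department.dnumber.
conn.execute(
    """
    CREATE TABLE employee (
        ssn  TEXT NOT NULL PRIMARY KEY,
        name TEXT NOT NULL,
        dno  INTEGER REFERENCES department (dnumber)
    )
    """
)
conn.execute("INSERT INTO department VALUES (5, 'Research')")

def attempt(sql, params=()):
    """Return 'ok' if the statement succeeds, 'rejected' on a constraint failure."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    # dno = 5 appears in department, so the reference is valid.
    attempt("INSERT INTO employee VALUES (?, ?, ?)", ("123456789", "Smith", 5)),
    # dno = 9 does not appear in department.
    attempt("INSERT INTO employee VALUES (?, ?, ?)", ("987654321", "Brown", 9)),
    # A null foreign key references nothing and is accepted.
    attempt("INSERT INTO employee VALUES (?, ?, ?)", ("111223333", "Wong", None)),
    # Removing a department that is still referenced would leave a dangling value.
    attempt("DELETE FROM department WHERE dnumber = ?", (5,)),
]

print(results)  # ['ok', 'rejected', 'ok', 'rejected']
```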
Database Users
Database Administrators (DBA):
o individual(s) that determine & implement policy regarding users, their permissions on a database and the design & construction of that database
Database Designers:
o individual(s), possibly also software engineers, who apply design techniques to produce database structures pertinent to a specific application
End Users:
o people who, from time to time, access the contents of a database:
  - casual end users may submit ad-hoc queries as the need arises, using a high-level query language
  - naive, or parametric, end users access the database through pre-written programs that effect an appropriate interface to the database
  - database programmers write code, using a relevant programming language and the high-level query language, that can later be used by parametric users
o the data control language (DCL) comprises those instructions used for specifying access permissions on the database structures & contents
Structured Query Language (SQL), originally specified as part of the System R project, is a DSL that is common to many implementations of the relational model: Oracle, Sybase, SQL Server, MySQL, DB2, ...
Beyond this command-oriented interaction, many users may benefit from having more user-friendly interfaces, partly or wholly graphical in nature; such facilities might be provided as part of the DBMS, might be coded externally by programmers or might be supplied by third-party add-on providers; interface types might include:
o Menu-based interfaces for Web clients or browsing
o Forms-based interfaces
o Graphical user interfaces
o Natural language interfaces
o Interfaces for parametric users
o Interfaces for the DBA
The DBMS uses OS features just as any other application does: for memory & CPU allocation, for I/O to/from display & keyboard, and particularly for storage of data on disk
Database Architecture
The collection of programs that comprise a DBMS can be organized into a set of component modules, each of which carries out a specific task:
[cornered rectangles represent DBMS component modules; rounded rectangles represent user commands or programs; all activities and communications within the dotted rectangle are under control of the Stored Data Manager]
These necessary components might be augmented by utility programs that provide useful services:
o for loading of data from data files or other storage formats
o for backup of database contents, totally or incrementally, onto a safe storage medium (tape, say)
o for reorganization of stored data to enhance space/time efficiency
o for performance monitoring of the DBMS by the DBA so as to optimize response time, throughput, etc.
Also, for application program development, internal/external software systems might be provided with connectivity to the DBMS (application development environments), such as JBuilder from Borland or PowerBuilder from Sybase.
Finally, many business enterprises may have several disjoint databases (and even DBMSs), may have applications using files only and may have purely manual processing of some data; in order to keep track of all data used (data item names, aliases, data types, inter-relationships, etc.), a software data dictionary system might be employed; it is useful if there are automatic import/export facilities between the database system catalog and the wider data dictionary
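As a rough illustration of exporting catalog information towards a wider data dictionary, the sketch below reads SQLite's own catalog (sqlite_master and the pragma_table_info table-valued function) into a plain Python dictionary. The export_catalog helper and the two tables are hypothetical; other DBMSs expose their catalogs through different views and commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE department (dnumber INTEGER NOT NULL PRIMARY KEY, dname TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE employee (ssn TEXT NOT NULL PRIMARY KEY, name TEXT, dno INTEGER)"
)

def export_catalog(conn):
    """Return {table: [(column, declared_type, nullable, in_primary_key), ...]}."""
    dictionary = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # pragma_table_info lists one row per column of the named table.
        columns = conn.execute(
            'SELECT name, type, "notnull", pk FROM pragma_table_info(?)', (table,)
        ).fetchall()
        dictionary[table] = [
            (name, col_type, not notnull, pk > 0)
            for name, col_type, notnull, pk in columns
        ]
    return dictionary

data_dictionary = export_catalog(conn)
for table, columns in data_dictionary.items():
    print(table, columns)
```

A real data dictionary would also record aliases, owners and inter-relationships, which the DBMS catalog alone does not supply.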