Introduction To Database Systems
Hans-Petter Halvorsen, 2016.11.01
http://home.hit.no/~hansha
Preface
This document explains the basic concepts of a database system and how to communicate
with a database system.
The main focus in this document is on relational databases and Microsoft SQL Server.
1 Database Systems
A database is an integrated collection of logically related records or files consolidated into a
common pool that provides data for one or more uses.
One way of classifying databases involves the type of content, for example: bibliographic,
full-text, numeric, and image. Other classification methods start from examining database
models or database architectures.
The data in a database is organized according to a database model. The relational model is
the most common.
A Database Management System (DBMS) consists of software that organizes the storage of
data. A DBMS controls the creation, maintenance, and use of the database storage
structures of organizations and of their end users. It allows organizations to place control of
organization-wide database development in the hands of Database Administrators (DBAs)
and other specialists. In large systems, a DBMS allows users and other software to store and
retrieve data in a structured way.
Database management systems are usually categorized according to the database model
that they support, such as the network, relational or object model. The model tends to
determine the query languages that are available to access the database. One commonly
used query language for the relational database is SQL, although SQL syntax and function
can vary from one DBMS to another. A great deal of the internal engineering of a DBMS is
independent of the data model, and is concerned with managing factors such as
performance, concurrency, integrity, and recovery from hardware failures. In these areas
there are large differences between products.
SQL engine - This component interprets and executes the SQL query. It comprises
three major components (compiler, optimizer, and execution engine).
Transaction engine - Transactions are sequences of operations that read or write
database elements, which are grouped together.
Relational engine - Relational objects such as Table, Index, and Referential integrity
constraints are implemented in this component.
Storage engine - This component stores and retrieves data records. It also provides a
mechanism to store metadata and control information such as undo logs, redo logs,
lock tables, etc.
A real-time database is a processing system designed to handle workloads whose state may
change constantly. This differs from traditional databases containing persistent data, mostly
unaffected by time. For example, a stock market changes rapidly and dynamically. Real-time
processing means that a transaction is processed fast enough for the result to come back
and be acted on right away. Real-time databases are useful for accounting, banking, law,
medical records, multi-media, process control, reservation systems, and scientific data
analysis. As computers increase in power and can store more data, real-time databases
become integrated into society and are employed in many applications.
This document will focus on Microsoft Access and Microsoft SQL Server.
1.7 MDAC
The Microsoft Data Access Components (MDAC) is the framework that makes it possible to
connect and communicate with the database. MDAC includes the following components:
MDAC also installs several data providers you can use to open a connection to a specific data
source, such as an MS Access database.
1.7.1 ODBC
Open Database Connectivity (ODBC) is a native interface that is accessed through a
programming language that can make calls into a native library. In MDAC this interface is
defined as a DLL. A separate module or driver is needed for each database that must be
accessed.
1.7.2 OLE DB
OLE DB allows MDAC applications access to different types of data stores in a uniform manner.
Microsoft has used this technology to separate the application from the data store that it
needs to access. This was done because different applications need access to different types
and sources of data, and do not necessarily need to know how to access technology-specific
functionality. The technology is conceptually divided into consumers and providers. The
consumers are the applications that need access to the data, and the provider is the
software component that exposes an OLE DB interface through the use of the Component
Object Model (or COM).
2 Relational Databases
A relational database matches data using common characteristics found within the data set.
The resulting groups of data are organized and are much easier for people to understand.
For example, a data set containing all the real-estate transactions in a town can be grouped
by the year the transaction occurred; or it can be grouped by the sale price of the
transaction; or it can be grouped by the buyer's last name; and so on.
Such a grouping uses the relational model (a technical term for this is schema). Hence, such
a database is called a "relational database."
The software used to do this grouping is called a relational database management system.
The term "relational database" often refers to this type of software.
Relational databases are currently the predominant choice in storing financial records,
manufacturing and logistical information, personnel data and much more.
2.1 Tables
The basic units in a database are tables and the relationships between them. Strictly, a
relational database is a collection of relations (frequently called tables).
Below we see how a relationship between two tables is defined using Primary Keys and
Foreign Keys.
A unique key must uniquely identify all possible rows that exist in a table and not only the
currently existing rows. Examples of unique keys are Social Security numbers or ISBNs.
A primary key is a special case of unique keys. The major difference is that for unique keys
the implicit NOT NULL constraint is not automatically enforced, while for primary keys it is
enforced. Thus, the values in unique key columns may or may not be NULL. Another
difference is that primary keys must be defined using another syntax.
If the primary key consists only of a single column, the column can be marked as such using
the following syntax:
CREATE TABLE table_name
(
id_col INT PRIMARY KEY,
col2 CHARACTER VARYING(20),
...
)
Likewise, unique keys can be defined as part of the CREATE TABLE SQL statement.
CREATE TABLE table_name
(
id_col INT,
col2 CHARACTER VARYING(20),
key_col SMALLINT,
...
CONSTRAINT key_unique UNIQUE(key_col),
...
)
Or if the unique key consists only of a single column, the column can be marked as such
using the following syntax:
CREATE TABLE table_name
(
The referencing and referenced table may be the same table, i.e. the foreign key refers back
to the same table. Such a foreign key is known as self-referencing or recursive foreign key.
A table may have multiple foreign keys, and each foreign key can have a different referenced
table. Each foreign key is enforced independently by the database system. Therefore,
cascading relationships between tables can be established using foreign keys.
Improper foreign key/primary key relationships or not enforcing those relationships are
often the source of many database and data modeling problems.
Foreign keys can be defined as part of the CREATE TABLE SQL statement.
CREATE TABLE table_name
(
id INTEGER PRIMARY KEY,
col2 CHARACTER VARYING(20),
col3 INTEGER,
...
CONSTRAINT col3_fk FOREIGN KEY(col3)
REFERENCES other_table(key_col),
...
)
If the foreign key is a single column only, the column can be marked as such using the
following syntax:
CREATE TABLE table_name
(
id INTEGER PRIMARY KEY,
col2 CHARACTER VARYING(20),
col3 INTEGER REFERENCES other_table(column_name),
...
)
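The key definitions above can be tried out in a small script. The following is a minimal sketch, not part of the original tutorial: it assumes Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server, reuses the placeholder names table_name and other_table from the syntax above, and inserts made-up values. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, which is an SQLite detail and not something SQL Server requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite stand-in. It checks foreign keys only when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE other_table (
    key_col INTEGER PRIMARY KEY
);
CREATE TABLE table_name (
    id   INTEGER PRIMARY KEY,
    col2 VARCHAR(20),
    col3 INTEGER,
    CONSTRAINT col3_fk FOREIGN KEY (col3) REFERENCES other_table (key_col)
);
""")

# One valid parent row and one valid child row (illustrative values).
conn.execute("INSERT INTO other_table (key_col) VALUES (1)")
conn.execute("INSERT INTO table_name (id, col2, col3) VALUES (10, 'ok', 1)")


def try_insert(sql):
    """Run one INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Same primary key value as an existing row.
duplicate_pk = try_insert(
    "INSERT INTO table_name (id, col2, col3) VALUES (10, 'dup', 1)")
# Foreign key value with no matching row in other_table.
dangling_fk = try_insert(
    "INSERT INTO table_name (id, col2, col3) VALUES (11, 'bad', 99)")

print(duplicate_pk)
print(dangling_fk)
```

Both extra inserts should be refused by the database itself, which is the point of declaring the keys instead of checking them in application code.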
2.4 Views
In database theory, a view consists of a stored query accessible as a virtual table composed
of the result set of a query. Unlike ordinary tables in a relational database, a view does not
form part of the physical schema: it is a dynamic, virtual table computed or collated from
data in the database. Changing the data in a table alters the data shown in subsequent
invocations of the view.
Syntax:
CREATE VIEW <ViewName>
AS
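To illustrate that a view is a stored query and not a copy of the data, here is a minimal sketch. It assumes Python's built-in sqlite3 module and an in-memory SQLite database instead of SQL Server; the Book table, its rows, and the view name ExpensiveBooks are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Book (title TEXT, price REAL);
INSERT INTO Book VALUES ('SQL Basics', 80.0), ('Database Design', 120.0);

-- The view stores the query text, not the rows it returns.
CREATE VIEW ExpensiveBooks AS
    SELECT title, price FROM Book WHERE price > 100.0;
""")

before = conn.execute(
    "SELECT title FROM ExpensiveBooks ORDER BY title").fetchall()

# Change the underlying table; the view itself is not touched.
conn.execute("INSERT INTO Book VALUES ('Query Tuning', 150.0)")

after = conn.execute(
    "SELECT title FROM ExpensiveBooks ORDER BY title").fetchall()

print(before)
print(after)
```

The second SELECT against the view picks up the new row because the view is re-evaluated on each invocation.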
2.5 Functions
In SQL databases, a user-defined function provides a mechanism for extending the
functionality of the database server by adding a function that can be evaluated in SQL
statements. The SQL standard distinguishes between scalar and table functions. A scalar
function returns only a single value (or NULL), whereas a table function returns a (relational)
table comprising zero or more rows, each row with one or more columns.
User-defined functions in SQL are declared using the CREATE FUNCTION statement.
Syntax:
CREATE FUNCTION <FunctionName>
(@Parameter1 <datatype>,
@Parameter2 <datatype>
)
RETURNS <datatype>
AS
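The idea of a scalar function that can be evaluated inside SQL statements can be sketched as follows. This is only an analogy under stated assumptions: SQLite has no CREATE FUNCTION statement, so the function is written in Python and registered with sqlite3's create_function; in SQL Server the body would be written in T-SQL instead. The function name PriceWithTax, the 25 percent rate, and the sample rows are hypothetical.

```python
import sqlite3


def price_with_tax(price, tax_percent):
    """Scalar function: returns a single value, or NULL (None) for NULL input."""
    if price is None or tax_percent is None:
        return None
    return round(price * (1 + tax_percent / 100.0), 2)


conn = sqlite3.connect(":memory:")
# Register the Python function under a SQL name that takes two arguments.
conn.create_function("PriceWithTax", 2, price_with_tax)

conn.executescript("""
CREATE TABLE PRODUCT (ProductName TEXT, Price REAL);
INSERT INTO PRODUCT VALUES ('Product A', 1000), ('Product B', NULL);
""")

# The function is used like any other expression in the select list.
rows = conn.execute("""
    SELECT ProductName, PriceWithTax(Price, 25)
    FROM PRODUCT
    ORDER BY ProductName
""").fetchall()

print(rows)
```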
Stored procedures are not part of the relational database model, but all commercial
implementations include them.
or
EXECUTE procedure()
Stored procedures can return result sets, i.e. the results of a SELECT statement. Such result
sets can be processed using cursors by other stored procedures by associating a result set
locator, or by applications. Stored procedures may also contain declared variables for
processing data and cursors that allow them to loop through multiple rows in a table. The
standard Structured Query Language provides IF, WHILE, LOOP, REPEAT, CASE statements,
and more. Stored procedures can receive variables, return results or modify variables and
return them, depending on how and where the variable is declared.
2.7 Triggers
A database trigger is procedural code that is automatically executed in response to certain
events on a particular table or view in a database. The trigger is mostly used for keeping the
integrity of the information in the database. For example, when a new record (representing
a new worker) is added to the employees table, new records should also be created in the
taxes, vacations, and salaries tables.
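The employees example can be sketched with a small trigger. This is a minimal illustration, assuming Python's built-in sqlite3 module and SQLite trigger syntax (which refers to the new row as NEW; SQL Server uses the inserted pseudo-table instead). The column names and the default of 25 vacation days are hypothetical, and only the vacations table is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE vacations (employee_id INTEGER, days_left INTEGER);

-- Runs automatically after every INSERT on employees.
CREATE TRIGGER employee_added AFTER INSERT ON employees
BEGIN
    INSERT INTO vacations (employee_id, days_left) VALUES (NEW.id, 25);
END;
""")

# Only the employees table is written to explicitly.
conn.execute("INSERT INTO employees (name) VALUES ('Per Nilsen')")

vacation_rows = conn.execute(
    "SELECT employee_id, days_left FROM vacations").fetchall()
print(vacation_rows)
```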
This document gives only a very brief overview of SQL. For a more in-depth overview,
please refer to the tutorial Structured Query Language located on my web site:
http://home.hit.no/~hansha/?tutorial=sql
3.1 Queries
The most common operation in SQL is the query, which is performed with the declarative
SELECT statement. SELECT retrieves data from one or more tables, or expressions. Standard
SELECT statements have no persistent effects on the database.
Queries allow the user to describe desired data, leaving the database management system
(DBMS) responsible for planning, optimizing, and performing the physical operations
necessary to produce that result as it chooses.
A query includes a list of columns to be included in the final result immediately following the
SELECT keyword. An asterisk ("*") can also be used to specify that the query should return all
columns of the queried tables. SELECT is the most complex statement in SQL, with optional
keywords and clauses that include:
The FROM clause indicates the table(s) from which data is to be retrieved. The
FROM clause can include optional JOIN subclauses to specify the rules for joining
tables.
The WHERE clause includes a comparison predicate, which restricts the rows
returned by the query. The WHERE clause eliminates all rows from the result set for
which the comparison predicate does not evaluate to True.
The GROUP BY clause is used to project rows having common values into a smaller
set of rows. GROUP BY is often used in conjunction with SQL aggregation functions or
to eliminate duplicate rows from a result set. The WHERE clause is applied before the
GROUP BY clause.
The HAVING clause includes a predicate used to filter rows resulting from the GROUP
BY clause. Because it acts on the results of the GROUP BY clause, aggregation
functions can be used in the HAVING clause predicate.
The ORDER BY clause identifies which columns are used to sort the resulting data,
and in which direction they should be sorted (options are ascending or descending).
Without an ORDER BY clause, the order of rows returned by an SQL query is
undefined.
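How the clauses above work together can be sketched with one query. This is a minimal illustration, assuming Python's built-in sqlite3 module and an in-memory SQLite database; the Sale table and its rows are made up, loosely following the real-estate example used earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sale (buyer TEXT, year INTEGER, price REAL);
INSERT INTO Sale VALUES
    ('Hansen', 2015, 200.0),
    ('Hansen', 2016, 300.0),
    ('Nilsen', 2016, 150.0),
    ('Nilsen', 2016, 100.0),
    ('Olsen',  2016,  50.0);
""")

query = """
SELECT buyer, COUNT(*) AS sales, SUM(price) AS total
FROM Sale                  -- source table
WHERE year = 2016          -- filters individual rows before grouping
GROUP BY buyer             -- one result row per buyer
HAVING SUM(price) > 100    -- filters the groups, so aggregates are allowed
ORDER BY total DESC        -- explicit sort order for the result
"""
result = conn.execute(query).fetchall()
print(result)
```

The 2015 sale is dropped by WHERE before grouping, while the buyer with a 2016 total of 50 is dropped afterwards by HAVING.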
Example:
The following is an example of a SELECT query that returns a list of expensive books. The
query retrieves all rows from the Book table in which the price column contains a value
greater than 100.00. The result is sorted in ascending order by title. The asterisk (*) in the
select list indicates that all columns of the Book table should be included in the result set.
SELECT *
FROM Book
WHERE price > 100.00
ORDER BY title;
The example below demonstrates a query of multiple tables, grouping, and aggregation, by
returning a list of books and the number of authors associated with each book.
SELECT Book.title, COUNT(*) AS Authors
FROM Book
JOIN Book_author ON Book.isbn = Book_author.isbn
GROUP BY Book.title;
[End of Example]
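The two queries from the example can be run against a small test database. The sketch below assumes Python's built-in sqlite3 module and SQLite instead of SQL Server; the column layout of Book and Book_author beyond isbn, title, and price, and all sample rows, are assumptions. An ORDER BY is added to the second query only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Book (isbn TEXT PRIMARY KEY, title TEXT, price REAL);
CREATE TABLE Book_author (isbn TEXT, author TEXT);

INSERT INTO Book VALUES
    ('111', 'SQL Basics', 80.0),
    ('222', 'Database Design', 120.0),
    ('333', 'Advanced Queries', 150.0);
INSERT INTO Book_author VALUES
    ('111', 'Author A'),
    ('222', 'Author B'),
    ('222', 'Author C'),
    ('333', 'Author D');
""")

# First query: expensive books, sorted by title.
expensive = conn.execute("""
    SELECT *
    FROM Book
    WHERE price > 100.00
    ORDER BY title
""").fetchall()

# Second query: number of authors per book (ORDER BY added for stable output).
author_counts = conn.execute("""
    SELECT Book.title, COUNT(*) AS Authors
    FROM Book
    JOIN Book_author ON Book.isbn = Book_author.isbn
    GROUP BY Book.title
    ORDER BY Book.title
""").fetchall()

print(expensive)
print(author_counts)
```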
The acronym CRUD refers to all of the major functions that need to be implemented in a
relational database application to consider it complete. Each letter in the acronym can be
mapped to a standard SQL statement:
Operation   SQL
Create      INSERT
Read        SELECT
Update      UPDATE
Delete      DELETE
Example:
INSERT:
UPDATE:
DELETE:
[End of Example]
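The four CRUD operations can be sketched in sequence on a single row. This is a minimal illustration, assuming Python's built-in sqlite3 module and an in-memory SQLite database; the reduced CUSTOMER table and the values are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE CUSTOMER (
        CustomerId INTEGER PRIMARY KEY,
        FirstName  TEXT,
        LastName   TEXT,
        Phone      TEXT
    )
""")

# Create: INSERT adds a new row.
conn.execute(
    "INSERT INTO CUSTOMER (FirstName, LastName, Phone) VALUES (?, ?, ?)",
    ("Per", "Nilsen", "12345678"))

# Read: SELECT retrieves it.
created = conn.execute(
    "SELECT FirstName, LastName, Phone FROM CUSTOMER WHERE CustomerId = 1"
).fetchone()

# Update: UPDATE changes an existing row.
conn.execute(
    "UPDATE CUSTOMER SET Phone = ? WHERE CustomerId = ?", ("87654321", 1))
updated = conn.execute(
    "SELECT Phone FROM CUSTOMER WHERE CustomerId = 1").fetchone()

# Delete: DELETE removes the row.
conn.execute("DELETE FROM CUSTOMER WHERE CustomerId = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM CUSTOMER").fetchone()[0]

print(created, updated, remaining)
```

The ? placeholders are sqlite3's way of passing values separately from the SQL text; other database drivers use different placeholder styles.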
Example:
CREATE:
[End of Example]
3.4.3 Numbers
INTEGER and SMALLINT
FLOAT, REAL and DOUBLE PRECISION
NUMERIC(precision, scale) or DECIMAL(precision, scale)
4 Database Modelling
4.1 ER Diagram
In software engineering, an Entity-Relationship Model (ERM) is an abstract and conceptual
representation of data. Entity-relationship modeling is a database modeling method, used to
produce a type of conceptual schema or semantic data model of a system, often a relational
database, and its requirements in a top-down fashion.
Diagrams created using this process are called entity-relationship diagrams, or ER diagrams
or ERDs for short.
There are many ER diagramming tools. Some of the proprietary ER diagramming tools are
ERwin, Enterprise Architect and Microsoft Visio.
Microsoft SQL Server also has a built-in tool for creating Database Diagrams.
In the Database menu Visio offers lots of functionality regarding your database model.
Reverse Engineering is the opposite procedure, i.e., extraction of a database schema from
an existing database into a database model in Microsoft Visio.
CUSTOMER
o CustomerId (PK)
o FirstName
o LastName
o Address
o Phone
o PostCode
o PostAddress
PRODUCT
o ProductId (PK)
o ProductName
o ProductDescription
o Price
o ProductCode
ORDER
o OrderId (PK)
o OrderNumber
o OrderDescription
o CustomerId (FK)
ORDER_DETAIL
o OrderDetailId (PK)
o OrderId (FK)
o ProductId (FK)
[End of Example]
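One possible way to turn the model above into tables is sketched below. It assumes Python's built-in sqlite3 module and SQLite as a stand-in for SQL Server; the table and column names come from the model, but every data type and length is an assumption, since the model does not specify them. ORDER is a reserved word, so it is written as [ORDER].

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (
    CustomerId  INTEGER PRIMARY KEY,
    FirstName   VARCHAR(50),
    LastName    VARCHAR(50),
    Address     VARCHAR(100),
    Phone       VARCHAR(20),
    PostCode    VARCHAR(10),
    PostAddress VARCHAR(50)
);
CREATE TABLE PRODUCT (
    ProductId          INTEGER PRIMARY KEY,
    ProductName        VARCHAR(50),
    ProductDescription VARCHAR(1000),
    Price              FLOAT,
    ProductCode        VARCHAR(50)
);
CREATE TABLE [ORDER] (
    OrderId          INTEGER PRIMARY KEY,
    OrderNumber      VARCHAR(50),
    OrderDescription VARCHAR(1000),
    CustomerId       INTEGER REFERENCES CUSTOMER (CustomerId)
);
CREATE TABLE ORDER_DETAIL (
    OrderDetailId INTEGER PRIMARY KEY,
    OrderId       INTEGER REFERENCES [ORDER] (OrderId),
    ProductId     INTEGER REFERENCES PRODUCT (ProductId)
);
""")

# List the tables that now exist in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# Column 2 of each foreign_key_list row is the referenced table.
fk_targets = sorted(
    row[2] for row in conn.execute("PRAGMA foreign_key_list(ORDER_DETAIL)"))

print(tables)
print(fk_targets)
```

ORDER_DETAIL carries two foreign keys, which is how the many-to-many relationship between orders and products is resolved into two one-to-many relationships.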
4.3 ERwin
ERwin is a professional database modelling tool. A Community edition is also available for
free. The Community edition is limited to a maximum of 25 objects.
With ERwin and other professional database modelling tools you can directly import the
database model into a database system such as SQL Server, MySQL, etc.
However, the Express edition has a number of technical restrictions which make it
undesirable for large-scale deployments, including:
Maximum database size of 4 GB. The limit applies per database (log files
excluded), but in some scenarios users can access more data through the use of
multiple interconnected databases.
Single physical CPU, multiple cores
1 GB of RAM (runs on any size RAM system, but uses only 1 GB)
SQL Server Express offers a GUI tool for database management in a separate download and
installation package, called SQL Server Management Studio Express.
A central feature of SQL Server Management Studio is the Object Explorer, which allows the
user to browse, select, and act upon any of the objects within the server. It can be used to
visually observe and analyze query plans and optimize the database performance, among
others. SQL Server Management Studio can also be used to create a new database, alter any
existing database schema by adding or modifying tables and indexes, or analyze
performance. It includes the query windows which provide a GUI based interface to write
and execute queries.
There are lots of settings you may set regarding your database, but the only information you
must fill in is the name of your database:
5.5 Backup/Restore
Database Backup and Restore:
Microsoft Access is used by programmers and non-programmers to create their own simple
database solutions.
CUSTOMER
o CustomerId (PK)
o FirstName
o LastName
o Address
o Phone
o PostCode
o PostAddress
PRODUCT
o ProductId (PK)
o ProductName
o ProductDescription
o Price
o ProductCode
ORDER
o OrderId (PK)
o OrderNumber
o OrderDescription
o CustomerId (FK)
ORDER_DETAIL
o OrderDetailId (PK)
o OrderId (FK)
o ProductId (FK)
ODBC Connection:
The following SQL Query inserts some example data into these tables:
--CUSTOMER
INSERT INTO [CUSTOMER] ([FirstName],[LastName],[Address],[Phone],[PostCode],[PostAddress])
VALUES ('Per', 'Nilsen', 'Vipeveien 12', '12345678', '1234', 'Porsgrunn')
GO
--PRODUCT
INSERT INTO [PRODUCT] ([ProductName],[ProductDescription],[Price],[ProductCode]) VALUES
('Product A', 'This is product A', 1000, 'A-1234')
GO
INSERT INTO [PRODUCT] ([ProductName],[ProductDescription],[Price],[ProductCode]) VALUES
('Product B', 'This is product B', 1000, 'B-1234')
GO
INSERT INTO [PRODUCT] ([ProductName],[ProductDescription],[Price],[ProductCode]) VALUES
('Product C', 'This is product C', 1000, 'C-1234')
GO
--ORDER
INSERT INTO [ORDER] ([OrderNumber],[OrderDescription],[CustomerId]) VALUES
('10001', 'This is Order 10001', 1)
GO
INSERT INTO [ORDER] ([OrderNumber],[OrderDescription],[CustomerId]) VALUES
('10002', 'This is Order 10002', 2)
GO
INSERT INTO [ORDER] ([OrderNumber],[OrderDescription],[CustomerId]) VALUES
('10003', 'This is Order 10003', 3)
GO
--ORDER_DETAIL
INSERT INTO [ORDER_DETAIL] ([OrderId],[ProductId]) VALUES (1, 1)
GO
INSERT INTO [ORDER_DETAIL] ([OrderId],[ProductId]) VALUES (1, 2)
GO
INSERT INTO [ORDER_DETAIL] ([OrderId],[ProductId]) VALUES (1, 3)
GO
INSERT INTO [ORDER_DETAIL] ([OrderId],[ProductId]) VALUES (2, 1)
GO
INSERT INTO [ORDER_DETAIL] ([OrderId],[ProductId]) VALUES (2, 2)
GO
INSERT INTO [ORDER_DETAIL] ([OrderId],[ProductId]) VALUES (3, 3)
GO
INSERT INTO [ORDER_DETAIL] ([OrderId],[ProductId]) VALUES (3, 1)
GO
INSERT INTO [ORDER_DETAIL] ([OrderId],[ProductId]) VALUES (3, 2)
GO
INSERT INTO [ORDER_DETAIL] ([OrderId],[ProductId]) VALUES (3, 3)
GO
select * from PRODUCT
select * from [ORDER]
select * from ORDER_DETAIL
[End of Example]
Just as functions (in programming) can provide abstraction, so database users can create
abstraction by using views. In another parallel with functions, database users can manipulate
nested views, thus one view can aggregate data from other views.
Syntax:
CREATE VIEW <ViewName>
AS
Performance: Stored procedures are usually more efficient and faster than regular SQL
queries because SQL statements are parsed for syntactical accuracy and precompiled by the
DBMS when the stored procedure is created. Also, combining a large number of SQL
statements with conditional logic and parameters into a stored procedure allows the
procedures to perform queries, make decisions, and return results without extra trips to the
database server.
Security: When creating tables in a database, the Database Administrator can set EXECUTE
permissions on stored procedures without granting SELECT, INSERT, UPDATE, and DELETE
permissions to users. Therefore, the data in these tables is protected from users who are not
using the stored procedures.
Stored procedures are similar to user-defined functions. The major difference is that
functions can be used like any other expression within SQL statements, whereas stored
procedures must be invoked using the CALL statement (EXECUTE in Microsoft SQL Server).
Example:
AS
/*-------------------------------------------------------------------------
Last Updated Date: 2009.11.03
Last Updated By: [email protected]
Description: Get Customer Information from a specific Order Number
-------------------------------------------------------------------------*/
SET NOCOUNT ON
[End of Example]
prevent changes (e.g. prevent an invoice from being changed after it's been mailed
out)
log changes (e.g. keep a copy of the old data)
audit changes (e.g. keep a log of the users and roles involved in changes)
enhance changes (e.g. ensure that every change to a record is time-stamped by the
server's clock, not the client's)
enforce business rules (e.g. require that every invoice have at least one line item)
execute business rules (e.g. notify a manager every time an employee's bank account
number changes)
replicate data (e.g. store a record of every change, to be shipped to another database
later)
enhance performance (e.g. update the account balance after every detail transaction,
for faster queries)
Microsoft SQL Server supports triggers either after or instead of an insert, update, or delete
operation.
Syntax:
CREATE TRIGGER <TriggerName> ON <TableName>
FOR INSERT, UPDATE, DELETE
AS
User-defined functions in SQL are declared using the CREATE FUNCTION statement.
Syntax:
CREATE FUNCTION <FunctionName>
(@Parameter1 <datatype>,
@Parameter2 <datatype>
)
RETURNS <datatype>
AS
E-mail: [email protected]
Blog: http://home.hit.no/~hansha/