Chapter 3 Lecture Notes With Comments New
Business Intelligence
Your organization needs business intelligence (BI) - collective information about your customers,
your competitors, your business partners, your competitive environment, and your own internal
operations. Business intelligence gives your organization the ability to make effective, important,
and often strategic business decisions.
Instructor’s Comment: We discussed business intelligence in Chapter 1 and we will continue to discuss business
intelligence throughout this course. Databases are an essential source of business intelligence, particularly larger
databases like data warehouses. You will be tested on the definition of business intelligence and on how business
intelligence is created.
Online transaction processing (OLTP) refers to systems that manage transaction-oriented tasks, often
over a network or the Internet. OLTP typically involves inserting, updating, and/or deleting
information in the day-to-day operation of the organization. Examples of OLTP systems are online
banking, online booking, and retail point of sale (POS).
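The insert, update, and delete operations that make up OLTP can be sketched in a few lines of Python using the built-in sqlite3 module. This is only an illustration: the Sale table, its fields, and the values are assumptions invented for the example, not part of the course database.

```python
import sqlite3

# Hypothetical point-of-sale table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sale (SaleID INTEGER PRIMARY KEY, Item TEXT, Quantity INTEGER)"
)

# Each day-to-day operation is a short transaction.
# "with conn" commits on success and rolls back if an error occurs.
with conn:
    conn.execute("INSERT INTO Sale (SaleID, Item, Quantity) VALUES (1, 'Coffee', 2)")
with conn:
    conn.execute("UPDATE Sale SET Quantity = 3 WHERE SaleID = 1")

sale_after_update = conn.execute(
    "SELECT Item, Quantity FROM Sale WHERE SaleID = 1"
).fetchone()

with conn:
    conn.execute("DELETE FROM Sale WHERE SaleID = 1")

remaining_sales = conn.execute("SELECT COUNT(*) FROM Sale").fetchone()[0]
print(sale_after_update, remaining_sales)
```

Each transaction is small and changes one record, which is typical of the OLTP workloads described above.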
What is OLTP?
https://database.guide/what-is-oltp/
What Is an OLTP System?
https://docs.oracle.com/database/121/VLDBG/GUID-0BC75680-5BD4-43A9-826F-CD8837D30EB2.htm#VLDBG1367
Operational database
An operational database is used to manage and store data in real time. It is designed to run the
organization's daily transactions and allows you to add, change, or delete data as those transactions
occur. Operational databases are OLTP databases. Microsoft SQL Server is an example of a system
used for operational databases.
Operational Database
https://www.educba.com/operational-database/
Online analytical processing (OLAP) tools enable users to analyze information from multiple database
systems at the same time to support decision making. OLAP is a powerful tool for data discovery and
predictive “what-if” analysis. It is the technology behind many business intelligence applications.
Data warehouses and data mining, which we will cover later in the chapter, both support OLAP.
The relational database model stores data in a two-dimensional format where tables of data are
presented in rows and columns.
The tables in a relational database are logically related by using common field names. In the school
database shown below, the relationship is made between the StudentID, which is in the student
table and the registration table, and the CourseID, which is in the course table and the registration
table. The software used to maintain relational databases is a relational database management
system (RDBMS).
The following diagram shows the table structure for the Student table. The structure of the table is
also known as the data dictionary. The table structure contains the name of the field and the data
type. A field is a single piece of information such as Surname or Givennames. Each field must have a
data type. Examples of data types are Text used for the Surname field and Number used for the
GPA field. Each data type has a property (not shown) such as the size of the Text or the format of
the Date/Time. The field called StudentID has a small key beside it because it is known as a
primary key. As we shall see, the primary key is essential in forming the relationships between
tables.
Instructor’s Comment: In our first computer assignment using the software package MS Access you created a database
containing several tables. The logical structure of each table is designed in order to create a relationship between the
tables. The midterm exam may include a written question asking you to explain the table structure and the relationships
between the tables.
The diagram below shows the two-dimensional Student table. The table has rows, one record for
each student, and columns, the field names that make up the record.
Data Dictionary
In a relational database model, a data dictionary defines the basic organization or structure of a
database. A data dictionary must be created first before the records can be entered into the table. A
data dictionary includes the field name and the field type (or data type). In the diagram of the data
dictionary, shown again below, the field PhoneNumber has a data type of Text. You might expect
the PhoneNumber field to have the Number data type, but it does not because a phone number is not
composed entirely of digits, for example, 604-939-6633.
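As a rough sketch of the same idea outside MS Access, the Python code below defines a Student table first and then reads its structure back. The field names Surname, Givennames, PhoneNumber, GPA, and StudentID come from these notes; the SQLite type names, the GPA value, and the choice of an integer StudentID are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The structure must be defined before any records can be entered.
conn.execute("""
    CREATE TABLE Student (
        StudentID   INTEGER PRIMARY KEY,  -- assumed to be numeric here
        Surname     TEXT,
        Givennames  TEXT,
        PhoneNumber TEXT,                 -- Text, because of the dashes
        GPA         REAL
    )
""")

# PRAGMA table_info returns one row per field: (cid, name, type, notnull, default, pk).
data_dictionary = {
    row[1]: row[2] for row in conn.execute("PRAGMA table_info(Student)")
}

# Illustrative record; the GPA value is made up.
conn.execute("INSERT INTO Student VALUES (1, 'Lam', 'John', '604-939-6633', 3.2)")
phone = conn.execute("SELECT PhoneNumber FROM Student").fetchone()[0]

print(data_dictionary)
print(phone)
```

Because PhoneNumber is stored as text, the dashes survive exactly as entered.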
Instructor’s Comment: The definition of data dictionary will most likely be tested. You should also be able to identify
which diagram represents the data dictionary.
Primary Key
A primary key is a field whose value uniquely identifies each record. For example, there may be two students called
John Lam but they each will have a unique student number. When a primary key of one table is
used in another table it is called a foreign key. For example, the StudentID is the primary key of the
Student table but will be used as a foreign key of the Registration table. The foreign key does not
have a key placed beside it because it is not a primary key. The connection between the primary key
of one table and the foreign key of another table helps build relationships between tables.
Instructor’s Comment: Understanding the importance of the primary key and the foreign key to the relational database is
essential. In the next slide you will see a diagram of the three tables of the student databases connected in a
relationship between the primary key and the foreign key. This is a very good question for the written section of the
midterm exam.
In the diagram below, the registration table and course table have been added to the database. In
the relationship view you can see how the tables are connected. There is one primary key for each
table. StudentID is the primary key for the student table, CourseID is the primary key for the course
table, and RegistrationID is the primary key for the registration table. In addition, the registration
table has two foreign keys, StudentID and CourseID. StudentID, the foreign key in the registration
table, is attached to StudentID, the primary key in the student table. Correspondingly, CourseID, the
foreign key in the registration table, is attached to CourseID, the primary key in the course table. The
registration table connects all three tables together by creating a relationship between primary key
and foreign key.
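The three-table relationship can be sketched in Python with sqlite3. The table and key names follow the notes; the sample records and the integer key values are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (
        StudentID INTEGER PRIMARY KEY,
        Surname TEXT,
        Givennames TEXT
    );
    CREATE TABLE Course (
        CourseID INTEGER PRIMARY KEY,
        CourseName TEXT
    );
    -- Registration holds its own primary key plus two foreign keys.
    CREATE TABLE Registration (
        RegistrationID INTEGER PRIMARY KEY,
        StudentID INTEGER REFERENCES Student(StudentID),
        CourseID INTEGER REFERENCES Course(CourseID)
    );
    INSERT INTO Student VALUES (1, 'Lam', 'John');
    INSERT INTO Course VALUES (10, 'Busi 237');
    INSERT INTO Registration VALUES (100, 1, 10);
""")

# Follow each foreign key back to the primary key it matches.
rows = conn.execute("""
    SELECT s.Surname, c.CourseName
    FROM Registration AS r
    JOIN Student AS s ON s.StudentID = r.StudentID
    JOIN Course AS c ON c.CourseID = r.CourseID
""").fetchall()
print(rows)
```

The registration record stores only the two key values; the student's name and the course name are found by matching those values against the primary keys of the other tables.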
Integrity Constraints
Integrity constraints are a set of rules used to maintain the quality of information. Integrity
constraints ensure that the adding, updating, and deleting of data does not affect data integrity. For
example, integrity constraints will ensure that no two students will have the same StudentID and that
a StudentID in the registration table must have a matching StudentID in the student table.
Integrity Constraints
https://www.javatpoint.com/dbms-integrity-constraints
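The two rules in the example above can be tried out with a short Python sketch. It uses SQLite rather than MS Access, so details such as the PRAGMA line are specific to SQLite; the helper function is_rejected and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Surname TEXT);
    CREATE TABLE Registration (
        RegistrationID INTEGER PRIMARY KEY,
        StudentID INTEGER NOT NULL REFERENCES Student(StudentID)
    );
    INSERT INTO Student VALUES (1, 'Lam');
""")

def is_rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A second student with StudentID 1 violates the primary key rule.
duplicate_rejected = is_rejected("INSERT INTO Student VALUES (1, 'Lam')")
# StudentID 99 has no matching record in the Student table.
orphan_rejected = is_rejected("INSERT INTO Registration VALUES (100, 99)")
# StudentID 1 exists, so this registration is accepted.
valid_rejected = is_rejected("INSERT INTO Registration VALUES (101, 1)")

print(duplicate_rejected, orphan_rejected, valid_rejected)
```

The DBMS, not the user, checks each change, which is how integrity constraints protect the quality of the data.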
Instructor’s Comment: By defining the logical structure of information in a relational database you are also developing
integrity constraints. Be sure you can explain how integrity constraints can help ensure the quality of data. As shown on
the next slide, you can also set the referential integrity to ensure that the validity of the relationship remains intact.
In the diagram below you can see how MS Access allows you to edit the relationships between
tables and enforce referential integrity. This ensures that the validity of the relationship between the
tables remains intact by prohibiting changes to the primary key that would adversely affect this
relationship. The symbols 1 and ∞ will appear in the relationship representing the one-to-many
relationship of the information. This means that one student may appear many times in the
registration table, depending on how many courses they register for, but they can appear only one
time in the student table.
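A minimal sketch of the one-to-many relationship and of enforced referential integrity, written in Python with SQLite instead of MS Access. The sample records are assumptions, and SQLite's default behaviour (refusing the change) stands in for the Access "enforce referential integrity" option.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce the relationship
conn.executescript("""
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Surname TEXT);
    CREATE TABLE Registration (
        RegistrationID INTEGER PRIMARY KEY,
        StudentID INTEGER NOT NULL REFERENCES Student(StudentID),
        CourseID INTEGER
    );
    INSERT INTO Student VALUES (1, 'Lam');
    INSERT INTO Registration VALUES (100, 1, 10);
    INSERT INTO Registration VALUES (101, 1, 20);
""")

# One student (the "1" side) appears many times in Registration (the "many" side).
registrations_for_student = conn.execute(
    "SELECT COUNT(*) FROM Registration WHERE StudentID = 1"
).fetchone()[0]

# Changing the primary key would orphan the two registrations, so it is prohibited.
try:
    with conn:
        conn.execute("UPDATE Student SET StudentID = 2 WHERE StudentID = 1")
    key_change_blocked = False
except sqlite3.IntegrityError:
    key_change_blocked = True

print(registrations_for_student, key_change_blocked)
```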
A database management system (DBMS) helps you to specify the logical organization for a
database and to access and use the information within a database. Microsoft Access is one
example of a DBMS. With Microsoft Access you can create the structure of the table (data
dictionary), you can enter, edit, and delete the records, and you can create forms, reports, and
queries. A DBMS contains five important software components.
1. DBMS Engine
2. Data definition subsystem
3. Data manipulation subsystem
4. Application generation subsystem
5. Data administration subsystem
Instructor’s Comment: I often ask questions about DBMS Tools on the midterm exam. You could expect to be tested on
both multiple choice questions and written questions. On the next few slides we will look at each of the five components
of the DBMS. Note that four of the components are also called subsystems due to their dependence on the DBMS
engine to access the database.
DBMS engine
The DBMS engine is perhaps the most important component of a DBMS. The DBMS engine
accepts logical requests from the four DBMS subsystems and converts them into their physical
equivalent to access the database as it exists on the hard disk. The physical view of information in a
database is how the information is physically arranged on the hard disk. The logical view of
information in a database, on the other hand, focuses on how you access the information from the
database.
Instructor’s Comment: The DBMS engine accepts logical requests from the other four DBMS components or
subsystems and converts them into their physical equivalent. What is a logical request? The logical view of the
database is how we see the database. We see a relational database consisting of two-dimensional tables that are
connected by a primary key and a foreign key. Everything we create using a DBMS uses a logical view including forms,
queries, and reports. However, when we save the database to the storage device, it is not stored in a logical form but
rather it is stored in a physical form. The disk drive for example uses tracks, sectors, and clusters to store the database.
You will almost definitely be tested on the role of the DBMS engine.
The data definition subsystem of a DBMS helps you create and maintain the data dictionary and
define the structure of the files (tables) in a database. When you create a database, you must first
create the data dictionary and define the structure of the files (tables).
The data manipulation subsystem of a DBMS helps you add, change, and delete information in a
database and query it for valuable information. It is the data manipulation tools within a DBMS that
allow you to specify the logical information requirements in a database. In most databases you will
find a variety of data manipulation tools, including views, report generators, query-by-example tools,
and structured query language.
Instructor’s Comment: Most of the tools that we use to create a database are included in the data manipulation
subsystem. As a result, there could be several questions about this subsystem on the exam. The data manipulation
tools are discussed on the next few slides.
Views
A view (form) allows you to enter information into a record, add new records, and edit or delete a
record. The image below shows the form for the Student table. Forms allow you to view a single
record at a time. Using a form to view a record is preferable when the record has many fields.
Query-by-example tools
Query-by-example (QBE) tools allow you to graphically design a query. When you create a query
you must choose one or more tables, select the fields from each table, and enter criteria. You can
see in the query example below that all three tables have been selected and appear at the top of the
query. Several fields have been selected from the tables, such as the student’s Surname and
Givennames, as well as the Room, Day, and Time. Finally, you can see the criteria, “Busi 237”, has
been entered under the field CourseName to select only records matching students who registered
for that course. In addition, the records have been sorted by both the Surname field and
Givennames field in Ascending order.
Report Generators
Report generators help you to define the format of a report. You can create a report from a table or
from a query. A report allows you to group the information from a query and may include subtotals
for each numeric group.
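The grouping and subtotal idea behind a report can be sketched in plain Python. The query result below is hypothetical (the names Singh and Econ 101 are invented), and the layout is only one possible report format, not the one MS Access produces.

```python
from itertools import groupby

# Hypothetical query result: (CourseName, Surname) pairs, already sorted by course.
query_rows = [
    ("Busi 237", "Lam"),
    ("Busi 237", "Singh"),
    ("Econ 101", "Lam"),
]

report_lines = []
# groupby collects consecutive rows that share the same course name.
for course, group in groupby(query_rows, key=lambda row: row[0]):
    students = [surname for _, surname in group]
    report_lines.append(course)                            # group header
    report_lines.extend(f"  {surname}" for surname in students)
    report_lines.append(f"  Students registered: {len(students)}")  # subtotal

print("\n".join(report_lines))
```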
Structured query language (SQL) is found in most DBMSs. SQL can perform the same query as in the example above except
that you perform the query by creating a statement instead of pointing, clicking, and dragging. Below
is a simple example of SQL.
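As an illustration, the query-by-example query described earlier (three tables, the criteria "Busi 237" under CourseName, sorted by Surname and Givennames) might be written as an SQL statement along the following lines. The statement is run here through Python's sqlite3 module so it can be tested; the sample records, key values, and field types are assumptions, and the SQL that MS Access generates for the same query will differ in syntax details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Surname TEXT, Givennames TEXT);
    CREATE TABLE Course (CourseID INTEGER PRIMARY KEY, CourseName TEXT,
                         Room TEXT, "Day" TEXT, "Time" TEXT);
    CREATE TABLE Registration (RegistrationID INTEGER PRIMARY KEY,
                               StudentID INTEGER, CourseID INTEGER);
    -- Illustrative records only.
    INSERT INTO Student VALUES (1, 'Lam', 'John'), (2, 'Chan', 'Mary');
    INSERT INTO Course VALUES (10, 'Busi 237', 'A101', 'Mon', '09:00'),
                              (20, 'Econ 101', 'B202', 'Tue', '13:00');
    INSERT INTO Registration VALUES (100, 1, 10), (101, 2, 10), (102, 1, 20);
""")

# Tables, fields, criteria, and sort order are all stated in one SQL statement.
query = """
    SELECT Student.Surname, Student.Givennames,
           Course.Room, Course."Day", Course."Time"
    FROM Student
    JOIN Registration ON Registration.StudentID = Student.StudentID
    JOIN Course ON Course.CourseID = Registration.CourseID
    WHERE Course.CourseName = 'Busi 237'
    ORDER BY Student.Surname ASC, Student.Givennames ASC
"""
results = conn.execute(query).fetchall()
for record in results:
    print(record)
```

The FROM and JOIN lines correspond to choosing the tables, the SELECT line to choosing the fields, the WHERE line to the criteria, and the ORDER BY line to the ascending sort.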
Instructor’s Comment: You are likely to have multiple choice or written questions about the data manipulation tools,
especially query-by-example and report generators. In fact, you may be asked some detailed questions about the
query-by-example that you did on your computer assignment #1. Be sure you understand what criteria are. There may be a
multiple choice question about SQL.
The application generation subsystem of a DBMS allows you to build a user interface between the
end-user and the DBMS. For example, Visual Basic for Applications (VBA) allows you to create a
user interface between the end-user and MS Access to allow a much more user-friendly view of the
database. In the images below you can see an example of VBA code in MS Access on the left. This
code creates a simple age calculator program in a MS Access form.
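The logic of an age calculator can be sketched as follows. This is a Python analogue written for these notes, not the VBA code from the course example; the function name calculate_age and the sample dates are assumptions.

```python
from datetime import date

def calculate_age(birth_date, today):
    """Return the age in whole years on the given day."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

# Illustrative dates: the day before and the day of a 24th birthday.
age_before_birthday = calculate_age(date(2000, 6, 15), date(2024, 6, 14))
age_on_birthday = calculate_age(date(2000, 6, 15), date(2024, 6, 15))
print(age_before_birthday, age_on_birthday)
```

In an application built with the application generation subsystem, a calculation like this would sit behind a form so that the end-user only types a birth date and reads the result.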
The data administration subsystem of a DBMS allows the database administrator to back up and
restore the database, help identify who has access to what information, perform security,
optimization, and validation duties, and assess the impact of structural changes to a database.
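One of these duties, backing up and restoring a database, can be sketched with Python's sqlite3 module, whose Connection.backup method copies one database into another. The in-memory databases and the sample record are assumptions for illustration; a real administrator would back up to a file or another server.

```python
import sqlite3

# The "live" database with one illustrative record.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Surname TEXT)")
source.execute("INSERT INTO Student VALUES (1, 'Lam')")
source.commit()

# Back up: copy the whole database into a second connection.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# Simulate data loss in the live database.
source.execute("DELETE FROM Student")
source.commit()
lost = source.execute("SELECT COUNT(*) FROM Student").fetchone()[0]

# Restore: copy the backup into a fresh database.
restored_db = sqlite3.connect(":memory:")
backup_copy.backup(restored_db)
restored = restored_db.execute("SELECT Surname FROM Student").fetchall()
print(lost, restored)
```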
Instructor’s Comment: You may be asked for an example of the tools available to the database administrator, such as
backing up the database and performing security duties.
Data Warehouses
A data warehouse is a logical collection of information, gathered from many operational databases,
that is used to create business intelligence that supports business decision making. A data
warehouse is a fundamentally different way of thinking about data.
Instructor’s Comment: Data warehouses are very popular with organizations because they are such a powerful source
of business intelligence. Data warehouses represent a fundamentally different way of thinking about organizing and
managing information in an organization. Data mining tools, discussed at the end of this chapter, are used to search the
vast amounts of information from within a data warehouse. Combined together, data warehouse and data mining would
be a very good written question for the midterm exam.
Data warehouses are multidimensional. The layers in a data warehouse represent information from
different perspectives. The multidimensional information in a data warehouse is referred to as a
hypercube. You can see from the example below that data warehouses contain vast amounts of
information, such as the comparison of product sales by territory based on advertising.
Instructor’s Comment: You need to know that data warehouses are multidimensional and contain information from many
operational databases as well as other information that will help in building business intelligence.
Databases are transaction-oriented and support OLTP and, therefore, are operational databases.
Data warehouses are not transaction-oriented. They exist to support decision-making tasks in your
organization. Therefore, data warehouses support OLAP, the manipulation of information to support
decision-making.
Instructor’s Comment: You might be asked to compare a database and a data warehouse.
Data Mining
Data mining tools are used to query information in a data warehouse. The four data mining tools on
the following pages support the concept of OLAP.
Query-and-reporting tools
Query-and-reporting tools are similar to QBE tools, SQL, and report generators in a typical database
environment.
Artificial Intelligence
Intelligent agents utilize various artificial intelligence tools such as neural networks and fuzzy logic
to form the basis of "information discovery" and building business intelligence in OLAP.
Multidimensional analysis tools are slice-and-dice techniques that allow you to view
multidimensional information from different perspectives, essentially "turning the cube".
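Slicing and turning the cube can be sketched in Python using the product, territory, and advertising dimensions from the earlier data warehouse example. The sales figures and the helper functions roll_up and slice_cube are hypothetical; real multidimensional analysis tools work on far larger data and offer many more operations.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, territory, advertising, sales).
facts = [
    ("Product A", "East", "TV", 100),
    ("Product A", "West", "Radio", 80),
    ("Product B", "East", "TV", 50),
    ("Product B", "West", "TV", 70),
]

def roll_up(facts, dimension):
    """Total sales along one dimension (0=product, 1=territory, 2=advertising)."""
    totals = defaultdict(int)
    for fact in facts:
        totals[fact[dimension]] += fact[3]
    return dict(totals)

def slice_cube(facts, dimension, value):
    """Keep only the layer of the cube where one dimension equals a value."""
    return [fact for fact in facts if fact[dimension] == value]

# "Turning the cube": the same facts viewed from two perspectives.
by_territory = roll_up(facts, 1)
by_product = roll_up(facts, 0)

# Slice out the TV advertising layer, then view it by territory.
tv_by_territory = roll_up(slice_cube(facts, 2, "TV"), 1)

print(by_territory, by_product, tv_by_territory)
```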
Statistical tools
Statistical tools help you apply various mathematical models to data warehouse information. For
example, you can perform a time-series analysis to project future trends.
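As a simple illustration of projecting a trend, the sketch below fits a straight line to four periods of sales by least squares and extends it one period ahead. The sales figures are made up, and a straight-line trend is only one of many possible models; it is an assumption chosen for simplicity, not the method a particular statistical tool uses.

```python
# Hypothetical sales for four consecutive periods.
sales = [10.0, 12.0, 14.0, 16.0]
periods = list(range(1, len(sales) + 1))

n = len(sales)
mean_period = sum(periods) / n
mean_sales = sum(sales) / n

# Least-squares slope and intercept of the trend line: sales = intercept + slope * period.
slope = sum(
    (t - mean_period) * (y - mean_sales) for t, y in zip(periods, sales)
) / sum((t - mean_period) ** 2 for t in periods)
intercept = mean_sales - slope * mean_period

# Project the trend one period into the future.
next_period = n + 1
forecast = intercept + slope * next_period
print(slope, intercept, forecast)
```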
Instructor’s Comment: You will most likely need to discuss data mining and data warehouse together in a single test
question.