Seminar 7 Introduction To Databases
Systems BISM1201
2
Introduction to databases
• The amount of business data that needs to be stored and processed has consistently increased each year.
• Includes multiple types of data – sales, purchases, salary activities, client inquiries, and much more. Data is increasing
exponentially with the popularity of digital, online business, and digital traces (e.g. from social media)
• This business data needs to be stored, and for the past several decades, this storage has been the database.
• The database has been spectacularly successful and is now considered very mature technology. However, we must
recognise that this trend has been, and still is, centred upon the storage of transaction data. This is the fundamental
purpose of a database: to store transaction data of multiple types.
• Definition: A collection of data organised to service many applications at the same time by storing and
managing data so that they appear to be in one location (Laudon and Laudon 2020)
• Definition: A collection of related information. The information held in the database is stored in an organised
way so that specific items can be selected and retrieved quickly (Bocij et al., 2019)
• Businesses have realized that, within this huge volume of database-stored transaction data, there is an additional,
highly valuable commodity. This commodity is called “business intelligence” and it is becoming critically important to
business. Business intelligence is obtained by analysing transaction data, that is, by extracting trends or patterns or
characteristics that are not explicitly recorded in the database; they are ‘discovered’ via analysis of the stored
transactions. An example would be: “who are our ten best customers” or “what is our best selling product”.
• Business intelligence is extremely valuable to businesses because it allows a business to better know its customers.
A business would say: “it allows us to get closer to customers”, that is, to become better aware of how its customers make
decisions. Consequently, business intelligence is a strategic capacity for the modern business that wants to stay
current with new trends and stay agile in its response to customer expectations.
3
Exponential growth of data
• Google CEO Eric Schmidt said in 2010 that we now create in two days as much information as humanity
did from the beginning of recorded history until 2003 (measured in bits).
• People and businesses can create data far faster than any business process can assemble, digest, or
act on it.
• Moore’s law means more processing, which creates more data.
• The number of devices connected to IP networks will be more than three times the global population by
2023. There will be 3.6 networked devices per capita by 2023, up from 2.4 networked devices per capita
in 2018. There will be 29.3 billion networked devices by 2023, up from 18.4 billion in 2018.
• M2M connections will be half of the global connected devices and connections by 2023
• Mobile (cellular) speeds will more than triple by 2023.
Cisco Annual Internet Report (2018–2023)
4
Opportunities and problems
Opportunities
• Centralisation of data points – GPS, social media, likes, sites visited, movement etc.
• Visualization is improving, making data more valuable and easier to act on.
• Getting easier: Google Analytics brings marketing tools into the hands of the small business owner. Anyone can
slice and dice ad and revenue data from dozens of angles.
Problems
• Despite all the money spent on ERP, data warehousing, “real-time” systems, most managers cannot trust their data.
• Multiple spreadsheets, databases capturing similar information across a business, data quality, timeliness
• For all the hype, data, computing power and tools, getting data to tell a story is still hard.
• In business, entities collect data for their own purposes, label and format it in often-nonstandard ways, and hold it
locally, usually in Excel but also in emails, PDFs, or production systems.
• Data synchronization efforts can be among the most difficult of a chief information officer’s tasks, with uncertain payback.
• Getting the right data can be a challenge.
5
The value of database
The database is the foundation of the paradigm of business intelligence.
• Business analysts have two critical reasons to be familiar with the logical design of an efficient
database.
• Firstly, the database is critical to the storage of transaction data that is vital to business operation.
• Secondly, the database now can be used to provide this huge volume of accumulated transaction data
for further analysis – which in turn produces this new, highly valued commodity of business
intelligence .
• All information systems, whether an enterprise resource planning system or Facebook, rely on a
core repository to keep the data.
• Definition of database: Software for storage and retrieval of information. All organizations require the
means to store, organise, and retrieve information.
6
What do databases do beyond storage?
• Search, add, amend, delete records
• Multi-user access, distributed access, speed, data quality, security, space
efficiency
• Relational databases enable data to be stored within a number of different
tables. Separate record designs can be used to store data dealing with different
subjects. For example, a database used for stock control might use separate
record designs to store information concerning items stocked, reorder levels
and supplier details.
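The stock control example can be sketched as two linked tables. This is only an illustration using Python's built-in sqlite3 module: the table names, column names, and sample rows are assumptions made for the example, and the reorder level is kept as a column on the item table rather than in a table of its own.

```python
import sqlite3

# Hypothetical stock-control schema; all names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (
        supplier_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        phone       TEXT
    );
    CREATE TABLE item (
        item_id       INTEGER PRIMARY KEY,
        description   TEXT NOT NULL,
        qty_in_stock  INTEGER NOT NULL,
        reorder_level INTEGER NOT NULL,
        supplier_id   INTEGER REFERENCES supplier(supplier_id)
    );
    INSERT INTO supplier VALUES (1, 'Acme Paper', '555-0100');
    INSERT INTO item VALUES (10, 'A4 paper (ream)', 4, 10, 1);
    INSERT INTO item VALUES (11, 'Stapler', 25, 5, 1);
""")

# Items at or below their reorder level, together with the supplier to contact.
low_stock = conn.execute("""
    SELECT item.description, supplier.name
    FROM item JOIN supplier ON item.supplier_id = supplier.supplier_id
    WHERE item.qty_in_stock <= item.reorder_level
""").fetchall()
print(low_stock)
```

Supplier details are stored once and referenced from each item, so a supplier's phone number only ever needs changing in one place.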
7
Example of what a database can do
• Facebook is a giant database at heart, with a front-end program that takes data from that database and
arranges it on a webpage. Think about the information it has:
• Your profile information, such as your name, DoB, username, password, gender, where you live(d) etc.
• Events in your timeline
• Photos which can be linked to events or people
• Friend connections, A is a friend of member B, and so on
• What users are following, e.g. persons, groups, etc
• Likes, where one member likes a comment, photo, or other content.
• These are stored in many database tables.
Imagine the power of a query like: show me people aged under 25 who live in Australia and like
running, so that we can target running shoe advertising at them.
Mallach 2016
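A query of this kind might look like the sketch below. It says nothing about how Facebook actually stores its data: the two tables, their columns, and the three sample members are invented for illustration, and age is stored directly rather than derived from a date of birth to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; a real social network would have many more.
    CREATE TABLE member (
        member_id INTEGER PRIMARY KEY,
        name      TEXT,
        age       INTEGER,
        country   TEXT
    );
    CREATE TABLE member_like (
        member_id INTEGER REFERENCES member(member_id),
        topic     TEXT
    );
    INSERT INTO member VALUES (1, 'Ana', 22, 'Australia');
    INSERT INTO member VALUES (2, 'Ben', 31, 'Australia');
    INSERT INTO member VALUES (3, 'Cy', 19, 'Canada');
    INSERT INTO member_like VALUES (1, 'running');
    INSERT INTO member_like VALUES (2, 'running');
    INSERT INTO member_like VALUES (3, 'running');
""")

# Under 25, lives in Australia, likes running.
target_audience = conn.execute("""
    SELECT member.name
    FROM member JOIN member_like ON member.member_id = member_like.member_id
    WHERE member.age < 25
      AND member.country = 'Australia'
      AND member_like.topic = 'running'
""").fetchall()
print(target_audience)
```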
8
“Old” and “new” business data architecture
‘Old’ Business Data Architecture ‘New’ Business Data Architecture
9
‘Old’ Business Data Architecture
• Business has consistently organized along functional, or program, lines, i.e. a business would have an
accounts department, a sales department, an HRM department, and so on.
• Each of these functional or program areas would be supported by its own software system, and this system
would store the relevant transaction records within a specialized file that had been tailor-made for the
specific program area.
• Whilst this system worked well in support of the specific program area, it also posed very significant
problems for the business.
• The old solution produces a high level of redundancy, that is, the same information stored in multiple
locations. For example, a client’s details may be stored in both the accounts file and the sales file. This
redundancy in turn makes it difficult to keep information consistent and accurate. Our example client
may advise our accounts department of a change of address, but this information is not communicated to
the sales department. Data records quickly become fragmented and out of date, which produces errors and
increases costs.
• The old solution does not easily facilitate the integration of the different systems. That is,
there is little or no interconnectivity. The business should be a single entity, but it is very difficult to get a
single view of the operations of the business. This is a strategic disadvantage for any business.
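The change-of-address problem can be shown in a few lines. This is a toy sketch, not a model of any real system: the two dictionaries stand in for the separate accounts and sales files, and the client ID and addresses are made up.

```python
# Two departmental "files", each holding its own copy of the client's details.
accounts_file = {"C001": {"name": "J. Smith", "address": "1 Old Street"}}
sales_file = {"C001": {"name": "J. Smith", "address": "1 Old Street"}}

# The client tells only the accounts department about the move.
accounts_file["C001"]["address"] = "9 New Avenue"

# The two copies now disagree, and nothing in either system detects it.
addresses_match = accounts_file["C001"]["address"] == sales_file["C001"]["address"]
print(addresses_match)
```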
10
‘New’ Business Data Architecture
• All program areas store data in a single, consolidated database. This eliminates redundancy and
reduces inaccuracy. This solution provides integration and therefore enables a single view of business
– this in turn significantly raises the quality of management analysis and decision making.
• The modern database is now considered one of the most successful business software solutions of the
past 30 years or more.
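With a single consolidated database, a change is recorded once and every department reads the same row. The sketch below assumes a hypothetical client table in an in-memory SQLite database; the names and values are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE client (client_id TEXT PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("INSERT INTO client VALUES ('C001', 'J. Smith', '1 Old Street')")

def current_address(client_id):
    """Look up the one stored address; every department would call the same query."""
    row = conn.execute(
        "SELECT address FROM client WHERE client_id = ?", (client_id,)
    ).fetchone()
    return row[0]

# Accounts records the change once...
conn.execute("UPDATE client SET address = '9 New Avenue' WHERE client_id = 'C001'")

# ...and both accounts and sales now read the same, updated value.
accounts_view = current_address("C001")
sales_view = current_address("C001")
print(accounts_view, sales_view)
```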
11
The database – a success story
• This image shows the logical software architecture for just about all online businesses. It shows a
web browser interacting with a corporate web server.
• This is what happens when we use Wikipedia or Google. However, online businesses now combine
a web server with a database. The database is really the mission-critical component at the
business end. The web server is important, but the database is really the ‘brains’ of the process.
• What is also shown in the architecture is that the corporate database, and indeed all databases,
comprise two equally important components: the database management system and the data.
The database management system, or DBMS, is all software. The DBMS exclusively manages the
other component, that is the data. All access to the data must come through the DBMS. This
offers huge advantages.
• This architecture of the DBMS managing the data minimises many information challenges. Data
redundancy – that is data stored in multiple places, is reduced massively. Data isolation – that is,
data that cannot be accessed by certain relevant software applications, is entirely eliminated. And
finally, data inconsistency and inaccuracy – the inevitable result of data redundancy, is
substantially reduced.
• The DBMS and data also provide fundamental improvements in overall data quality. Security of
data is increased with the application of a single, corporate-wide information security program.
Data integrity, or data correctness is therefore increased. And finally data independence, that is
the creation of data in a form that is independent of any single software application, is created.
This directly enables software applications to be developed in a format that deals with a single
form of standardised database interaction.
12
The DBMS
Figure: User, Database Management System (DBMS), Database
1. User makes request for information (often using applications)
2. DBMS searches the database
3. DBMS retrieves the information
4. DBMS returns information to the user
13
DBMS
• DBMS is software that manages a database. Vendors include Oracle, IBM, Microsoft, SAP, Teradata
• Microsoft Access is a popular DBMS for personal computers
• Provides general purpose tools and utilities for producing and extracting data.
• Enables non-technical users to access data.
• Applications access a database through a DBMS
• An application that needs data sends a message to the DBMS saying what it requires, often using
Structured Query Language (SQL)
• For instance, the query: SELECT NAME, EMAIL FROM STUDENT WHERE COURSE = 'BISM1201'
AND GRADE IN ('A', 'B')
• Would list the name and email of all the students taking BISM1201 with a grade of B or higher
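The query on this slide can be tried against a small in-memory table. This is a minimal sketch using Python's built-in sqlite3 module. The sample rows, email addresses, and the second course code are invented, and it assumes letter grades in which "B or higher" means A or B (a plain comparison on letter grades would sort alphabetically and return the wrong rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (name TEXT, email TEXT, course TEXT, grade TEXT);
    INSERT INTO student VALUES ('Andrea', 'andrea@example.edu', 'BISM1201', 'A');
    INSERT INTO student VALUES ('Swee',   'swee@example.edu',   'BISM1201', 'B');
    INSERT INTO student VALUES ('Stuart', 'stuart@example.edu', 'BISM1201', 'C');
    INSERT INTO student VALUES ('Mayan',  'mayan@example.edu',  'OTHER1000', 'A');
""")

# The application states WHAT it wants; the DBMS works out how to find it.
rows = conn.execute("""
    SELECT name, email FROM student
    WHERE course = ? AND grade IN ('A', 'B')
    ORDER BY name
""", ("BISM1201",)).fetchall()
print(rows)
```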
14
Databases – an operational example
• A university information system that, whilst very much
simplified, still describes the central themes.
• The database, comprising the DBMS and data, is indicated.
The data covers a diverse range, e.g. course details,
enrolment records, through to union information, all of which is
managed by the DBMS (note the arrows). There is no access to the
data unless this is done via the DBMS.
• The application software involves three applications, with
each used by a different set of users. This demonstrates data
independence – presumably all three applications have been
created in different projects and even by different vendors.
But this is not a problem, because the DBMS provides a
single view of the data, and therefore standardized data
access.
• Finally consider the users (employees), e.g. in the registrar’s
office, the accounting group, the student union office. The
DBMS easily accommodates the geographical distribution of
end-users. The number of users may also grow, for
example when new organization units are created. Again the
DBMS accommodates this growth easily; we say that the
solution scales well.
Figure labels:
• Database – DBMS and data
• Application data – providing data independence
• End-users, geographically distributed
15
The relational database – a definition
• We have introduced the ‘generic’ database concept.
There are many different database types and many
different database vendors. However, in business,
there is clearly one dominant type, and this is the
relational model, or relational database. Others include
object-oriented databases and, increasingly,
blockchains (which strictly speaking aren’t databases).
• The definition of the relational model is very simple.
• A relational database presents its data to the user as
one or more two-dimensional tables. This simply
means that each table is shown to a user, human or
software, as rows and columns.
• For a relational database to work efficiently, there will
always be multiple tables. The idea of one huge table
within a relational database is simply not workable.
• We will explore this concept when we focus on the
logical design of the database, or data modeling.
Figure caption: Data represented as one or more two-dimensional tables with columns and rows
16
Some definitions
• Database management system: The data held in a database is accessed via a database management
system. A DBMS is one or more computer programs that allow users to enter, store, organise,
manipulate and retrieve data from a database. (Some people use the words DBMS and database
interchangeably.) (Bocij et al. 2019)
• Database management system: Special software to create and maintain a database and enable
individual business applications to extract the data they need without having to create separate files or
data definitions in their computer programs (Laudon and Laudon 2019)
• Relational database: a type of logical database model that treats data as if they were stored in two-
dimensional tables. It can relate data stored in one table to data in another, as long as the tables
share a common data element. (Laudon and Laudon 2019)
• Relational database: data stored within a number of different tables, each dealing with a different
subject, that are related (linked) using key fields (Bocij et al., 2019)
17
Why do we need to know about databases?
• The business analyst needs to fully appreciate that a business relational database can only operate
efficiently if its logical design is complete and correct.
• This introduces the business activity of data modelling, which is very similar in many ways to the task of
process modelling.
• For our data modelling to be correct, we must understand the data that our business needs to capture,
store, and process.
• We must appreciate the specifications of this data, and also the interconnections – we shall say
relationships – that exist within the overall data sets.
• Decisions on business rules for a database are not made by technicians but by analysts. For instance,
should an owner of a video on a video streaming service be able to rate/review their own video?
18
Relational database – model characteristics
• One or more two-dimensional (rows and columns) tables
• Tables contain records or rows
• Rows contain 1 or more fields, characteristics or attributes
• A design hierarchy: tables->rows->fields
• More powerful than a spreadsheet
• Very versatile
• More efficient in data processing
• Can this arrangement tell us who sent Email Num1?
• Can this arrangement tell us which students visited on 17
Feb?
19
Characteristics of a relational database
• The relational database comprises one or more two-dimensional tables, which means that each table displays as a series of rows and columns. In
our graphic, we have a “student” database. As it stands now, this database offers limited use; however, it does illustrate some very important
characteristics.
• Firstly, the displayed database satisfies the basic definition of the relational model. It has three tables “Email”, “Student”, and “Office_Visit”. A table in
this definition is frequently referred to as a “file”. The relational database nearly always has more than one table. It is usually not feasible to correctly
make a database that contains only one table. The efficiency of the relational database model relies upon multiple tables – not a single table.
• Next, note that each table contains one or more records – these are the rows of each table. The email table has three rows, the student table has
eight rows or records, and finally the office visit table has three records.
• From here, note that each record or row contains one or more fields, or characteristics or attributes. All these terms mean the same thing in the
context of a relational database. A field describes some detail or characteristic of a record.
• This organisation of a relational database clearly forms a design hierarchy: the overall database comprises tables, which in turn comprise records,
and finally the records are made up of fields, characteristics or attributes.
• In a very clear sense, appreciate that a relational database is far more powerful than a spreadsheet. A good business analyst uses both to
complement each other within the overall data processing challenge.
• A relational database is also very versatile. It can store a huge volume of data, and also accommodate a huge variety of data types. It is fair to say
that a database can store any digital data.
• A database is far more efficient in processing data in comparison to a spreadsheet. A database is record centred, and this allows processing to occur
at the record level. A spreadsheet, however, will recompute the value of all cells whenever even a single cell is updated. This is fine for desktop
processing, however it is simply not workable for large-scale organizational information processing.
20
Characteristics of a relational database
• The relational database always has a database management system, or DBMS, and this exclusively supervises all
access to the data. This of course means that all users of the database, remembering that users may be people or
software applications, must ‘talk’, that is communicate, with the database. That is, the users must request service from
the database.
• The total range of database services provided to all users may be considered as comprising four categories. The
categories are: create new records in the database, update existing records in the database, read and display existing
records in the database, and finally, delete an existing record or records within the database.
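These four categories map directly onto four SQL statements: INSERT, SELECT, UPDATE and DELETE. The sketch below runs each one against a hypothetical student table in an in-memory SQLite database. The student numbers and names are borrowed from the example tables in these slides, and the changed name is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")

# Create: add new records.
conn.execute("INSERT INTO student VALUES (1325, 'BAKER, ANDREA')")
conn.execute("INSERT INTO student VALUES (1644, 'LAU, SWEE')")

# Read: retrieve existing records.
after_create = conn.execute("SELECT name FROM student ORDER BY student_id").fetchall()

# Update: change an existing record (the new name is a made-up value).
conn.execute("UPDATE student SET name = 'LAU, SWEE LIN' WHERE student_id = 1644")
after_update = conn.execute("SELECT name FROM student WHERE student_id = 1644").fetchone()

# Delete: remove an existing record.
conn.execute("DELETE FROM student WHERE student_id = 1325")
after_delete = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(after_create, after_update, after_delete)
```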
• For a user – which is either a human user or a software user, to communicate or talk to the database, both the database
and the user must speak the same language. To make this exchange practical in a world where there are thousands of
software vendors, and also a large number of relational database builders, the overall industry has agreed on a
standardised database language. This means that we have a standardised language that all databases should
implement, and that all software applications should use. This language is called Structured Query Language, or SQL.
This language is also the reason for data independence, that is, we can design software applications without concerning
ourselves as to the internal design details of a relational database.
• There is also another important benefit provided by SQL. This is that SQL is a very simple language and therefore
provides for the simplification of very complicated requests to the database. Again, this is very important for overall
efficiency and correctness.
21
Logical modelling of the relational database
• Let’s now discuss the logical modelling of the relational database. Remember, we must firstly logically
model our concepts before we can actually implement or build the concept as a working model. The
logical modelling is the responsibility of the business analyst.
• In logically modelling a relational database, we must firstly identify, for each table, the relevant primary
and foreign keys. Let’s firstly consider the definition of each of these key types.
22
Relational database keys – Primary keys
• A field (e.g. Student_id) that uniquely identifies each record in a table.
• A column or group of columns that identifies a unique row in a table.
• Must be unique: there can be no duplicate values in the column chosen as the primary key for
the table
• Usually the primary key is the name of the table or an identifier e.g. ID, reference, code etc.
• There is usually one primary key field per table; when several columns/fields are used together, this is called a
compound primary key.
• Each table should have a primary key. Many DBMS products will allow a table to be created without one,
but the relational design is not sound without it.
• The database overall is not properly operational without a primary key for each table.
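The uniqueness rule and the compound primary key can both be demonstrated with SQLite. In this sketch the enrolment table and the second course code are hypothetical additions for illustration. The DBMS itself rejects any insert that would duplicate a key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT
    );
    -- Compound primary key: one row per student per course.
    CREATE TABLE enrolment (
        student_id  INTEGER,
        course_code TEXT,
        PRIMARY KEY (student_id, course_code)
    );
    INSERT INTO student VALUES (1325, 'BAKER, ANDREA');
    INSERT INTO enrolment VALUES (1325, 'BISM1201');
""")

def try_insert(sql):
    """Return True if the insert succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_student = try_insert("INSERT INTO student VALUES (1325, 'SOMEONE ELSE')")
duplicate_enrolment = try_insert("INSERT INTO enrolment VALUES (1325, 'BISM1201')")
second_course = try_insert("INSERT INTO enrolment VALUES (1325, 'OTHER1000')")
print(duplicate_student, duplicate_enrolment, second_course)
```

The same student can appear in the enrolment table many times, but never twice for the same course, because it is the combination of the two columns that must be unique.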
23
Relational database keys – Foreign keys
• They are needed to link, or form relationships between, the tables within the database
• The foreign key always links to a primary key in another table
• There may be 0, 1 or several foreign keys in a table
• A field that is used to link tables, by linking to a primary key in another table
• The foreign key is used to link tables together by referring to the primary key in another database table.
• Primary keys are reasonably simple. A primary key must be unique: there can be no duplicate
values in the column chosen as the primary key for the table. Foreign keys are a little more obscure
as a concept.
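One way to make the foreign key idea concrete is to let the DBMS enforce it. In the sketch below, an email row may only refer to a student number that exists in the student table. The column names are assumptions for illustration, and student number 9999 is deliberately one that does not exist. Note that SQLite, used here, only enforces foreign keys after the PRAGMA shown; other products enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked to
conn.executescript("""
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        student_name   TEXT
    );
    CREATE TABLE email (
        email_num      INTEGER PRIMARY KEY,
        message        TEXT,
        student_number INTEGER REFERENCES student(student_number)
    );
    INSERT INTO student VALUES (1325, 'BAKER, ANDREA');
""")

# Accepted: student 1325 exists in the student table.
conn.execute("INSERT INTO email VALUES (1, 'Could you please assign me to a group?', 1325)")

# Rejected: there is no student 9999 for this email to link to.
try:
    conn.execute("INSERT INTO email VALUES (2, 'Hello', 9999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```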
24
Primary and foreign keys – an example
Email Table
EmailNum | Date | Message
1 | 2/01/2017 | For homework 1, do you want us to provide notes on our references?
2 | 3/10/2017 | My group consists of Swee Lau and Stuart Nelson
3 | 3/10/2017 | Could you please assign me to a group?
Student Table
Student Number | Student Name | HW1 | HW2 | MidTerm
1325 | BAKER, ANDREA | 88 | 100 | 78
1644 | LAU, SWEE | 75 | 90 | 90
2881 | NELSON, STUART | 100 | 90 | 98
3007 | FISCHER, MAYAN | 95 | 100 | 74
3559 | TAM, JEFFREY | 88 | 100 | 88
4867 | VERBERRA, ADAM | 70 | 90 | 92
5265 | VALDEX, MARIE | 80 | 90 | 85
8009 | ROGERS, SHELLY | 95 | 100 | 98
Office_Visit Table
VisitID | Date | Notes
2 | 2/02/2017 | Andrea had questions about using IS for raising barriers to entry
3 | 2/02/2017 | Jeffrey is considering his IS major. Wanted to talk about career opportunities
4 | 2/11/2017 | Will miss class Friday due to job conflict
• Primary key – for each table. What would be the foreign key?
• Why do we also need foreign keys? Query: “Identify all emails sent by Andrea Baker”. Cannot
be answered until foreign keys are identified!
• Add a foreign key to each table “Email” and “Office_Visit”. Query now answered!
• A foreign key is a primary key in another table that enables linking across multiple tables.
25
Primary and foreign keys – an example
Email Table
EmailNum | Date | Message | Student Number
1 | 2/01/2008 | For homework 1, do you want us to provide notes on our references? | 1325
2 | 3/15/2008 | My group consists of Swee Lau and Stuart Nelson | 1325
3 | 3/15/2008 | Could you please assign me to a group? | 1644
Student Table
Student Number | Student Name | HW1 | HW2 | MidTerm
1325 | BAKER, ANDREA | 88 | 100 | 78
1644 | LAU, SWEE | 75 | 90 | 90
2881 | NELSON, STUART | 100 | 90 | 98
3007 | FISCHER, MAYAN | 95 | 100 | 74
3559 | TAM, JEFFREY | 88 | 100 | 88
4867 | VERBERRA, ADAM | 70 | 90 | 92
5265 | VALDEX, MARIE | 80 | 90 | 85
8009 | ROGERS, SHELLY | 95 | 100 | 98
Office_Visit Table
VisitID | Date | Notes | Student Number
2 | 2/13/2008 | Andrea had questions about using IS for raising barriers to entry | 1325
3 | 2/17/2008 | Jeffrey is considering his IS major. Wanted to talk about career opportunities | 3559
4 | 2/17/2008 | Will miss class Friday due to job conflict | 4867
Primary key: Student Number in the Student table. Foreign key: Student Number in the Email and Office_Visit tables.
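With the Student Number foreign key in place, the query “Identify all emails sent by Andrea Baker” becomes a join between the Email and Student tables. The sketch below loads the email rows from this slide and two of the student rows into an in-memory SQLite database. The lower-case column names are an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_number INTEGER PRIMARY KEY, student_name TEXT);
    CREATE TABLE email (
        email_num      INTEGER PRIMARY KEY,
        date           TEXT,
        message        TEXT,
        student_number INTEGER REFERENCES student(student_number)
    );
    INSERT INTO student VALUES (1325, 'BAKER, ANDREA');
    INSERT INTO student VALUES (1644, 'LAU, SWEE');
    INSERT INTO email VALUES (1, '2/01/2008', 'For homework 1, do you want us to provide notes on our references?', 1325);
    INSERT INTO email VALUES (2, '3/15/2008', 'My group consists of Swee Lau and Stuart Nelson', 1325);
    INSERT INTO email VALUES (3, '3/15/2008', 'Could you please assign me to a group?', 1644);
""")

# Follow the foreign key from each email to the student who sent it.
andrea_emails = conn.execute("""
    SELECT email.email_num
    FROM email JOIN student ON email.student_number = student.student_number
    WHERE student.student_name = 'BAKER, ANDREA'
    ORDER BY email.email_num
""").fetchall()
print(andrea_emails)
```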
26
Data model and entity relationship diagram
• Data model: A diagram that represents the main items of interest to us (we call these the entities in
the database) and the relationships that connect the entities
27
Why data modelling and ERD?
• Business analysts review ERDs to make sure they reflect business needs. For example,
should comments on videos be by one member only? That would work, but suppose an
instructor used the same approach to record students visiting her office and the topics
they discussed. Is it reasonable for each office visit to be by only one student? What if
students working in a group came together? A many-to-many relationship between students
and office visits might be better.
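In a relational database, a many-to-many relationship is usually implemented with a third, linking table. The sketch below is one possible design, not the only one: the visit_attendee table, its column names, and the visit notes are hypothetical, and the student rows are borrowed from the earlier example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_number INTEGER PRIMARY KEY, student_name TEXT);
    CREATE TABLE office_visit (visit_id INTEGER PRIMARY KEY, notes TEXT);
    -- Linking table: one row per (visit, student) pair.
    CREATE TABLE visit_attendee (
        visit_id       INTEGER REFERENCES office_visit(visit_id),
        student_number INTEGER REFERENCES student(student_number),
        PRIMARY KEY (visit_id, student_number)
    );
    INSERT INTO student VALUES (1644, 'LAU, SWEE');
    INSERT INTO student VALUES (2881, 'NELSON, STUART');
    INSERT INTO office_visit VALUES (1, 'Group project questions');
    INSERT INTO office_visit VALUES (2, 'Follow-up');
    INSERT INTO visit_attendee VALUES (1, 1644);
    INSERT INTO visit_attendee VALUES (1, 2881);
    INSERT INTO visit_attendee VALUES (2, 1644);
""")

# One visit can have many students...
attendees_visit_1 = conn.execute("""
    SELECT student.student_name
    FROM visit_attendee JOIN student
      ON visit_attendee.student_number = student.student_number
    WHERE visit_attendee.visit_id = 1
    ORDER BY student.student_name
""").fetchall()

# ...and one student can make many visits.
visits_by_swee = conn.execute(
    "SELECT COUNT(*) FROM visit_attendee WHERE student_number = 1644"
).fetchone()[0]
print(attendees_visit_1, visits_by_swee)
```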
• Understanding the structure of a database is a good basis for developing it.
• Business analysts and database designers use ERDs to communicate their ideas when they
design databases for business needs.
• You will use databases during your career. You will use a DBMS and work with database
administrators, and therefore you need to know how they fit into the business and information
systems picture.
28
Communicating ideas through an ERD
• Analysts use ERDs to communicate their
ideas when they are designing a
database. An ERD helps to visualise and show
other stakeholders the rules inscribed in
the database. Changing ideas on paper/in
a model is much easier than fixing a
flaw after the database has been built.
The more feedback, the better the refined
ERD will be.
29
ERD for English Premier League football players
What rules are codified here?
• Each player may only be contracted to one club at a time
• A club may contract a number of players
• Players can play for their national teams, but may only play for one national team
• Players may have a number of sponsors
• Each sponsor may sponsor a number of players
• Each club may only play in one (national) league
Entities shown in the diagram: League, Club, Sponsor, National team, Player, Sponsorship agreement
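The rules in this ERD can be translated into table definitions. The sketch below is one possible translation, with hypothetical table and column names. A "one" side becomes a single foreign key column, and the many-to-many link between players and sponsors becomes its own table, matching the Sponsorship agreement entity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE league        (league_id  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE national_team (team_id    INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sponsor       (sponsor_id INTEGER PRIMARY KEY, name TEXT);
    -- A single league_id column: each club plays in one league.
    CREATE TABLE club (
        club_id   INTEGER PRIMARY KEY,
        name      TEXT,
        league_id INTEGER REFERENCES league(league_id)
    );
    -- A single club_id and team_id: one club at a time, at most one national team.
    CREATE TABLE player (
        player_id INTEGER PRIMARY KEY,
        name      TEXT,
        club_id   INTEGER REFERENCES club(club_id),
        team_id   INTEGER REFERENCES national_team(team_id)
    );
    -- Many-to-many: a player has many sponsors, a sponsor has many players.
    CREATE TABLE sponsorship_agreement (
        player_id  INTEGER REFERENCES player(player_id),
        sponsor_id INTEGER REFERENCES sponsor(sponsor_id),
        PRIMARY KEY (player_id, sponsor_id)
    );
""")

# List the tables the DBMS now holds, one per entity in the diagram.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]
print(tables)
```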
30
ERD definitions
An ERD data model is a relatively simple construction and has a working set of four fundamental constructs:
1. Entity: anything of interest to the users of the database. An entity is a person, place, thing or event to be recorded
in the database. Entities will almost always occur as the nouns, or naming words, in the user’s description of their
work, e.g. in a university environment, a user may speak of students, courses, lecturers, and degrees. These would
all be entities in a data model. An entity is a type of thing, not an instance (e.g. an individual person). Each entity in the data model
will probably be represented in our production database as a table. Entities are drawn as rectangles.
2. Attribute: a specific characteristic or quality of a particular entity, e.g. in the earlier student example, characteristics
would include student_name and student_address. To distinguish one instance of a student from another, we give
them a different name, address, and so on. A unique attribute, or unique combination of two or more attributes, will form
the primary key of our entity, and later of the table in the database.
3. Relationship: connects two entities. Relationships are almost always defined by the business rules of the
organization. The line signifies that the entities may be used together in the operational database. Relationships are
drawn as lines that connect two entities.
4. Cardinality: Notations are used to describe the minimum and maximum numbers involved in the relationship. This
‘number’ is referred to as the cardinality of the relationship.
31
Seminar Question 1
Question 1: The graphic below shows the ‘old’ and the ‘new’ logical design of the corporate database.
What are the advantages of moving from the ‘old’ to the ‘new’? Your answer should be centred upon the
functional design of business (what does this mean), the issue of information redundancy (what is it),
and the problems this design causes.
‘Old’ Business Data Architecture ‘New’ Business Data Architecture
32
Seminar Question 1 Answer
• Business organization: functional, or program lines - an accounts department, a sales
department, a HRM department, and so on.
• Each supported by its own software system – storing the relevant transaction records - within a
specialized file.
• Good support for the specific program area - posed very significant problems for the business -
information redundancy
• Redundancy: difficulties in keeping information consistent and accurate - data records quickly
become fragmented and out of date – this produces errors and increases costs.
• Old solution: problematic integration of the different systems – no single view of the business
operation – strategic disadvantage.
• The ‘new’ approach - a profound change - a single, consolidated database - eliminates
redundancy - reduces information inaccuracy - provides integration - enables a single view of
business data – raises the quality of management analysis/decision making.
33
Seminar Question 2 Answer
Question 2: The graphic below (again from the video) shows an operational example of the corporate
database (a university information system that shows three central themes within the overall database
design). Explain the three themes (one in each oval) and describe the major functions and advantages
within each theme
Answer:
1st oval (theme 1) – users: the employees in the registrar’s office, the accounting group, the
student union office. Same building or different locations. This indicates the geographical
distribution of end-users. The solution is scalable.
2nd oval (theme 2) – software applications: each used by a different set of users; demonstrates data
independence, integrating different solutions; standardized data access.
3rd oval (theme 3) – the database, comprising the DBMS and data: diverse range of data; all of this
data is exclusively managed by the DBMS (note the arrows). There is no access to the data unless this is
done via the DBMS.
Figure labels: Users, Software apps, Database
34
Seminar Question 2 Answer (Cont)
• Huge advantages: reduced data redundancy, high integration and availability, improvements in overall data
quality.
• Security of data is increased with the application of a single, corporate‐wide information security
program.
• Data integrity, or data correctness is therefore increased.
• Data independence, that is the creation of data in a form that is independent of any single software
application, is created.
35
Seminar Question 3
Question 3: The generic corporate database is probably the biggest IS success story in business over the past
several decades.
(1) Discuss the reasons for this success – including why the database and not the spreadsheet (that appeared just
after the database) proved so vital to corporate business?
• First databases – 1980s – “mission critical”; Levels of data increasing each year; Multiple types of data
(2) Is this success (i.e. the need for the corporate database) increasing – or not? If increasing, why?
• Corporate database ‐ universal solution; Strategic edge; Operational disadvantage; Storage of transaction data;
“Business intelligence”; Organizations can better know customers; Business intelligence is a strategic capacity
(3) Within the generic corporate database paradigm, we have the relational database. How does this relational
database fit within the generic database concept? How do we define the relational database?
• Relational model is defined very simply; presents data to the user as one or more two-dimensional tables; multiple
tables, data modeling
36
Seminar Question 4
Make a list of databases in which data about you is captured. How is the data captured? How could this
data be used to generate insights (business intelligence) about you?
37
Seminar Question 5
A relational database is made up of
a. Worksheets
b. Documents
c. Tables
d. Files
38
Seminar Question 6
A _____ is a person, place or thing about whom or about which a firm wants to store information
A. Entity
B. Foreign key
C. Primary key
D. Business intelligence
39
Seminar Question 7
________ dictate the type of relationships between entities
A. Customers
B. Entities
C. Business rules
D. All of the above
40
Recap definitions
Entity: a group of related data, such as a customer entity, implemented as a table (e.g. Students)
Attribute: A property or characteristic of an entity, implemented as a field (e.g. Student Name)
Relationship: Describes how different tables are linked
Primary key: A field (e.g. Student_id) that uniquely identifies each record in a table.
Foreign key: a field that is used to link tables, by linking to a primary key in another table
Compound key: a combination of fields to form a key.
Database management system: one or more computer programs that allow users to enter, store, organise,
manipulate and retrieve data from a database.
Database: a collection of related information stored in an organised way so that specific items can be selected and
retrieved quickly.
Business Intelligence: the process of gathering enough of the right information in a timely manner and usable
form and analyzing it to have a positive impact on business strategy, tactics, decision making or operations.
Record: a physical data element composed of fields e.g. ‘student name’, ‘address’, ‘email’ gives us the record for a
student.
41