Foundations of Databases

The document provides an overview of advanced database management systems, covering the evolution of databases from file systems, the importance of proper database design, and the characteristics of data. It discusses various types of data, the roles of database management systems (DBMS), and the significance of data integrity and management practices. Additionally, it outlines different database models and their relationships, emphasizing the need for effective data management in organizational decision-making.

Uploaded by

Nii Adjetey
Copyright
© All Rights Reserved
MICT713

Advanced Database
Management Systems
SNO Devine
Department of ICT & Mathematics
Presbyterian University, Ghana
Foundations of Databases
Concepts, File system, Database Models
and Types
Outline
• What a database is, what it does, and why database
design is important
• How modern databases evolved from files and file
systems
• About flaws in file system data management
• What a DBMS is, what it does, and how it fits into the
database system
• Database terminologies and the Database Users
• Types of database systems and database models

Databases
• With rapid growth in computerization and digitalization,
accurate record keeping practices are needed to effectively
manage the daily assets of any organization.
• The amount of data generated and collected today is
growing exponentially.
• Managerial decision making relies on accurate data. Hence,
for organizations to benefit from their operational data,
they must avoid redundancies, inconsistencies and data loss.
• The need for a proper database infrastructure in any
organization, and its implications for the organization’s
operations, cannot be overemphasized.
• First, it helps to understand some principles of databases
and what defines their design; this leads us to what data
and information are.
DATA
• Data is the “New Gold”.
• It has become a “critical raw material for producing
digital products and services.”
• Data is the “New Oil”.
• Generally credited to mathematician Clive Humby:
• “Data is the new oil. Like oil, data is valuable, but if
unrefined it cannot really be used. It has to be changed
into gas, plastic, chemicals, etc. to create a valuable
entity that drives profitable activity. So must data be
broken down, analyzed for it to have value.” (Humby,
2006).
DATA
• It is estimated that the internet contains a massive
amount of data, on the order of 5 million terabytes
(Forbes, 2023).
• It is estimated that about 402.74 million terabytes of
data are created each day (Statista, 2023).
• “Created” includes data that is newly generated,
captured, copied, or consumed.
DATA: What is it?
• Data is raw fact.
• Data are known facts that can be recorded
and that have an implicit meaning.
• Data is any piece of fact that requires further
transformation in order to derive its true
value and meaning for decision making.
• Data is the foundation or basis for deriving
information.

Types of Data
• Data is presented in two main forms, each
having two categories:
• Qualitative
  • Ordinal
  • Nominal
• Quantitative
  • Discrete
  • Continuous
Types of Data: QUALITATIVE
• Qualitative Data
• also termed Categorical
• is a type of data that cannot be measured or counted in
the form of numbers.
• is sorted by category, not by number.
• Nominal Data
• is used to label variables without any order or
quantitative value.
• Colour of hair (Black, Brown, Blonde, etc.)
• Marital status (Single, Widowed, Married)

• Ordinal Data
• has a natural ordering: values are ranked by their
position on a scale.
• Feedback, experience, or satisfaction ratings that companies collect on a Likert-type scale (e.g. 1 to 10)
• Letter grades in the exam (A, B, C, D, etc.)
Types of Data: QUANTITATIVE
• Quantitative Data
• also known as Numerical data.
• is a type of data that can be expressed in numerical values,
making it countable and suitable for statistical analysis.
• often used for statistical manipulation and represented on a wide
variety of graphs and charts.
• Discrete Data
• discrete means distinct or separate
• contains values that are integers or whole numbers.
• cannot be broken into decimal or fraction values.
• Total number of students present in a class
• Number of employees in a company
• Continuous Data
• is data that can take fractional (decimal) values
• Height of a person
• Market share price/value
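The four categories above differ in which operations make sense on them. The following is a minimal Python sketch, assuming small made-up samples (all values and variable names are illustrative, not taken from the slides): nominal data can only be counted, ordinal data can also be ranked, discrete data can be summed as whole numbers, and continuous data can be averaged to fractional values.

```python
from collections import Counter
from statistics import mean

# Illustrative samples (assumed values).
hair_colour = ["Black", "Brown", "Black", "Blonde"]   # nominal: labels, no order
grades = ["B", "A", "C", "A"]                         # ordinal: labels with an order
class_sizes = [30, 28, 35]                            # discrete: whole numbers
heights_m = [1.62, 1.75, 1.80]                        # continuous: fractional values

# Nominal data supports counting only.
most_common_colour = Counter(hair_colour).most_common(1)[0][0]

# Ordinal data supports ranking once the order is made explicit.
grade_rank = {"A": 0, "B": 1, "C": 2, "D": 3}
ranked_grades = sorted(grades, key=lambda g: grade_rank[g])

# Discrete data supports integer arithmetic.
total_students = sum(class_sizes)

# Continuous data supports averages that fall between observed values.
average_height = mean(heights_m)
```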
Data: Forms and Format
• Every organization (and even individuals)
may create, generate or use data, which comes in
different types/forms.
• Data can come in various forms (numeric and text) and
formats: Document (CSV, XLS, JSON), Audio (MP3, AAC,
etc.), Image (JPG) and Video (MP4, AVI)
• Big Data is a collection of data that comes from various
sources and is characterized by its
• Volume – size/amount of data,
• Velocity – speed/rate at which the data is created and how
fast it moves,
• Value – relevance/value the data provides,
• Variety – forms/diversity that exists in the types of data,
• Veracity – quality and accuracy of the data.
Essence of Data
• Every organization needs (quality) data to survive,
innovate and grow.
• Data is necessary for
• deriving insights in operations of an organization.
• obtaining a clear picture about the past and present.
• projecting or predicting future events or occurrences.
• surety of evidence and avoidance of guess work in decision
making.
• targeted growth and development in a specific or integrated
view.
• identification of problems and designing viable solutions.
• advocacy, as a key element of fact to back up an argument.
• strategic alignment and increased efficiency.
• assessing resource allocation, and driving improvement and
transformation.
• effective monitoring and addressing challenges promptly.
Introducing the Database
• Data versus Information
• Data: raw facts
• Stored and retrieved
• Not yet processed to reveal their meaning to the user
• For example:
• The Robcor company has two divisions, which have
1,380,456 and 1,453,907 invoices, respectively.
• Each invoice has an invoice number, date, and amount
• The period is from the first quarter of 1997 to the first quarter of
2002.
• In total, 2,834,363 records
Data (sample invoice records):

Invoice Nbr    Invoice Date    Sales Amount
…              …               …
3000124        12-Jan-2002     $121.98
…              …               …
Introducing the Database
• Data versus Information
• Data constitute building blocks of information
• Information produced by processing data
• Information reveals meaning of data
• Good, timely, relevant information is key to good
decision making
• Good decision making is key to organizational survival
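The step from data to information can be sketched in a few lines of Python. This is a hedged illustration only: apart from invoice 3000124, the records below are invented, and `sales_by_quarter` is a hypothetical helper name. It shows raw invoice records (data) being processed into sales totals per quarter (information).

```python
from collections import defaultdict

# Raw facts: (invoice number, year, quarter, sales amount).
# Only the first record mirrors the sample invoice; the rest are assumed.
invoices = [
    (3000124, 2002, 1, 121.98),
    (3000125, 2002, 1, 78.02),
    (2990001, 2001, 4, 200.00),
]

def sales_by_quarter(records):
    """Process raw invoice data into total sales per (year, quarter)."""
    totals = defaultdict(float)
    for _invoice_nbr, year, quarter, amount in records:
        totals[(year, quarter)] += amount
    return dict(totals)

# Information: a summary a manager can act on.
summary = sales_by_quarter(invoices)
```

The individual invoice rows say little on their own; the per-quarter totals reveal their meaning.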

Introducing the Database
• Qualities/Characteristics of Information
• Accurate
• Complete
• Consistent
• Concise/Summarized
• Relevant
• Reliable
• Scoped
• Time-bound
• Etc.

Database Management
• Data Management
• is the set of practices
• that focuses on the proper generation,
collection, organization, storage, protection,
retrieval, transformation and distribution of
data
• usually by an organization
• towards analysis for business decisions

Database
• Database
• is an organised collection of interrelated data
• is a group of related records stored for a specific
purpose
• A database can be
• manually or electronically managed
• presented as a shared, integrated computer
structure housing related data:
• End user data (raw data)
• Metadata (data about data, it contains data
characteristics and relationships)
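The difference between end user data and metadata can be seen with Python's built-in `sqlite3` module. This is a small sketch under assumptions: the `invoice` table and its column names are invented for illustration. The `SELECT` returns the end user data, while SQLite's `PRAGMA table_info` returns metadata (column names and declared types).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice ("
    "invoice_nbr INTEGER PRIMARY KEY, invoice_date TEXT, amount REAL)"
)
conn.execute("INSERT INTO invoice VALUES (3000124, '2002-01-12', 121.98)")

# End user data: the raw facts stored in the table.
end_user_data = conn.execute("SELECT * FROM invoice").fetchall()

# Metadata: data about the data. Each PRAGMA table_info row is
# (cid, name, type, notnull, dflt_value, pk); keep name and type.
metadata = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(invoice)")]
```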
Database Management
• Database Management System (DBMS):
a software system (collection of software)
that helps to manage the data contents
• Manages the database structure
• Controls access to data
• Contains a query language
Application software <-> DBMS <-> Database
• Examples: Oracle, MySQL, MSSQL, MS Access,
PostgreSQL, DB2, Firebase, MongoDB, Cassandra, Redis,
Google Bigtable, etc.
Importance of DBMS
• Makes data management more efficient and
effective
• Query language allows quick answers to ad hoc
queries
• Provides better access to more and better-managed
data
• Promotes integrated view of organization’s
operations
• Reduces the probability of inconsistent data

DBMS Manages Interaction
[Figure not reproduced]
Historical Roots of Database: Files and File
Systems
• Traditionally, a file system was composed of a collection of
file folders, each properly tagged and kept in
filing cabinets.
• First applications focused on clerical tasks
• Requests for information quickly followed
• File systems developed to address needs
• Data organized according to expected use
• Data Processing (DP) specialists computerized manual
file systems
File Terminology
• Data
• Raw Facts
• Field
• Group of characters with specific meaning
• Record
• Logically connected fields that describe a
person, place, or thing
• File and file folder
• Collection of related records
[Figure not reproduced: a file with a record and a field labelled]
File System Critique
• Sample COBOL Data Entry Interfaces/Screens

File System Critique
• File System Data Management
• Requires extensive programming in a third-generation
language (3GL) such as COBOL, BASIC, or
Fortran (the programmer must specify what must be
done and how it is to be done)
• Time consuming
• Depends on the physical storage of the data
• Makes ad hoc queries impossible
• Makes the file system difficult to modify (each file
has its own system)
• Leads to islands of information
File System Critique (cont’d.)
• Data Dependence
• Change in a file’s data characteristics requires
modification of data access programs (e.g. changing
a field from integer to decimal)
• Makes file systems cumbersome from programming
and data management views
• Structural Dependence
• Change in file structure requires modification of related
programs (e.g. adding or deleting a field)
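Structural dependence can be illustrated with a fixed-width file reader. The sketch below is hypothetical (the record layout, field widths and function names are assumptions): the record layout is hard-coded in the program, so adding one field to the file breaks the existing reader until the program itself is rewritten.

```python
# The record layout is hard-coded in the program:
# name (10 characters) followed by amount (8 characters).
def parse_v1(line):
    return {"name": line[0:10].strip(), "amount": float(line[10:18])}

old_record = "Smith".ljust(10) + "121.98".rjust(8)

# The file structure changes: a 4-character region code is added after the name.
new_record = "Smith".ljust(10) + "EAST" + "121.98".rjust(8)

# The old program now reads the wrong bytes for the amount and fails.
try:
    parse_v1(new_record)
    v1_still_works = True
except ValueError:
    v1_still_works = False

# Every program that reads the file must be modified to match the new structure.
def parse_v2(line):
    return {
        "name": line[0:10].strip(),
        "region": line[10:14],
        "amount": float(line[14:22]),
    }
```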

File System Critique (cont’d.)
• Field Definitions and Naming Conventions
• Flexible record definition anticipates reporting
requirements
• Selection of proper field names important
• Attention to length of field names
• Use of unique record identifiers (record ids)

• Data Redundancy
• Different and conflicting versions of same data (data
pooling is difficult)
• Results of uncontrolled data redundancy
• Data anomalies (abnormalities)
• Modification
• Insertion
• Deletion
• Data inconsistency
• Lack of data integrity
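A modification anomaly caused by uncontrolled redundancy can be sketched as follows. The two "files", the customer name and the phone numbers are all invented for illustration: the same phone number is stored in two places, only one copy is updated, and the data become inconsistent.

```python
# The same customer's phone number is stored redundantly in two separate files.
sales_file = [{"customer": "Ama", "phone": "555-0100", "invoice": 3000124}]
support_file = [{"customer": "Ama", "phone": "555-0100", "ticket": 17}]

def update_phone(file_rows, customer, new_phone):
    """Change the phone number of one customer within a single file."""
    for row in file_rows:
        if row["customer"] == customer:
            row["phone"] = new_phone

# Modification anomaly: the change is applied to one file only.
update_phone(sales_file, "Ama", "555-0199")

def inconsistent_customers(*files):
    """Return customers whose phone number differs between files."""
    seen = {}
    conflicts = set()
    for rows in files:
        for row in rows:
            first_phone = seen.setdefault(row["customer"], row["phone"])
            if first_phone != row["phone"]:
                conflicts.add(row["customer"])
    return conflicts

conflicts = inconsistent_customers(sales_file, support_file)
```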
Database Systems
• A database system is the DBMS software together with the
data itself. Sometimes, the applications are also included.
• Provides advantages over the file system management
approach
• Eliminates most of the file system’s data inconsistency (lack
of data integrity), data anomaly, data dependency, and
structural dependency problems
• Stores data structures, relationships, and access paths

Database vs. File Systems

Database System Environment

• Hardware
• System’s Physical devices
• Computers
• Peripherals
• Network

Database System Environment

• Software
• Operating system: manages hardware
components
• DBMS: manages database
• MS Access, SQL Server, Oracle, DB2
• Application and utility software: support
data access and manipulation
• Generate information for decision making
• Help to manage database system

Database System Environment
• Database Users may be divided into
• Those who actually use and control the
database content, and those who design,
develop and maintain database
applications (called “Actors on the
Scene”), and
• Those who design and develop the DBMS
software and related tools, and the
computer systems operators (called
“Workers Behind the Scene”).

Database System Environment
• People (five types of users) – “Actors on the Scene”
• System administrator: hardware system
support
• Database administrator: manage DBMS use
• Database designer: design database structure
• System analysts and programmers:
implement application programs
• End users: employees and management

Database System Environment
• Procedures
• Instructions and rules that govern the design
and use of the database system
• Data

Database System Types
• By number of users: single-user vs. multiuser database
• Desktop database
• Workgroup database
• Enterprise database
• By location: centralized vs. distributed
• By use
• Production or transactional
• Decision support or data warehouse
(to obtain information)
DBMS Functions
• Objective: guarantee the integrity and
consistency of data. A DBMS has several functions:
• Data dictionary management: the definitions
of the data elements and their relationships
are stored in a data dictionary. This removes
data and structural dependencies.
• Data storage management: structures
required for data storage
• Data transformation and presentation:
relieving us from the distinction between
logical data format and physical data format
• Security management

DBMS Functions cont’d..
• Backup and recovery management
• Multiuser access control (concurrency)
• Data integrity management
• Database access language and application
programming interfaces
• Query language (DDL and DML)
• Database communication interfaces
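Several of these functions can be seen together in a short sketch using Python's `sqlite3` module as the application programming interface. The `customer` table and its contents are assumptions made for illustration. The `CREATE TABLE` statement is DDL, the `INSERT`, `UPDATE` and `SELECT` statements are DML, and the rejected insert shows data integrity management at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL (data definition language): define the structure and its constraints.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# DML (data manipulation language): add, change and query the contents.
conn.execute("INSERT INTO customer (id, name) VALUES (?, ?)", (1, "Ama"))
conn.execute("UPDATE customer SET name = ? WHERE id = ?", ("Ama Mensah", 1))
rows = conn.execute("SELECT id, name FROM customer").fetchall()

# Data integrity management: the DBMS rejects a row that breaks NOT NULL.
try:
    conn.execute("INSERT INTO customer (id, name) VALUES (2, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```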

Database Models
• Definition: a collection of logical constructs
used to represent data structure and
relationships within the database
• Conceptual models: focus on the logical nature of data
representation; they emphasize what is
represented and are used as a blueprint for database
design
• Implementation models: emphasize how the
data are represented in the database

Database Models
• Conceptual models include
• Entity-relationship database model (ERDBM)
• Object-oriented model (OODBM)
• Implementation models include
• Hierarchical database model (HDBM)
• Network database model (NDBM)
• Relational database model (RDBM)
• Object-oriented database model (OODBM)

Database Models (cont’d.)
• Relationships in Conceptual Models
• One-to-one (1:1)
• One-to-many (1:M)
• Many-to-many (M:N)
• Implementation Database Models
• Hierarchical
• Network
• Relational
• Object-Oriented
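The three relationship types can be sketched as a relational schema using Python's `sqlite3` module. All table and column names below are hypothetical: a unique foreign key gives 1:1, a plain foreign key gives 1:M, and a bridge table with a composite key gives M:N.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# REFERENCES documents each link; SQLite enforces foreign keys only when
# PRAGMA foreign_keys is ON, which this sketch does not rely on.
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
-- 1:1: a UNIQUE foreign key allows at most one locker per student.
CREATE TABLE locker (
    id INTEGER PRIMARY KEY,
    student_id INTEGER UNIQUE REFERENCES student(id)
);
-- 1:M: one department offers many courses.
CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course (
    id INTEGER PRIMARY KEY,
    title TEXT,
    department_id INTEGER REFERENCES department(id)
);
-- M:N: a bridge table pairs students with courses.
CREATE TABLE enrolment (
    student_id INTEGER REFERENCES student(id),
    course_id INTEGER REFERENCES course(id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Ama'), (2, 'Kofi');
INSERT INTO locker VALUES (100, 1);
INSERT INTO department VALUES (1, 'ICT');
INSERT INTO course VALUES (10, 'Databases', 1), (11, 'Networks', 1);
INSERT INTO enrolment VALUES (1, 10), (1, 11), (2, 10);
""")

courses_in_ict = conn.execute(
    "SELECT COUNT(*) FROM course WHERE department_id = 1"
).fetchone()[0]

students_in_databases = [
    name
    for (name,) in conn.execute(
        "SELECT s.name FROM student s "
        "JOIN enrolment e ON e.student_id = s.id "
        "WHERE e.course_id = 10 ORDER BY s.name"
    )
]

# A second locker for the same student violates the 1:1 rule.
try:
    conn.execute("INSERT INTO locker VALUES (101, 1)")
    second_locker_allowed = True
except sqlite3.IntegrityError:
    second_locker_allowed = False
```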
Hierarchical Database Model
• Logically represented by an upside-down tree
• Each parent can have many children (segment linkage)
• Each child has only one parent
• 1:M relationship
Hierarchical Database Model
• Hierarchical path (beginning from the left on disk)
• Left-list hierarchical path, or preorder traversal, or
hierarchical sequence:
Final assembly -> Component A -> Assembly A -> Part A ->
Part B -> Component B -> Component C -> Assembly B ->
Part C -> Part D
• Re-list the sequence if a segment is frequently accessed
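The hierarchical sequence is a preorder traversal: visit a segment, then each of its subtrees from left to right. The Python sketch below assumes a tree shape inferred from the listed sequence (the slide's figure is not available, so the parent/child assignments are an assumption) and reproduces the path.

```python
# Assumed tree, inferred from the hierarchical sequence in the text:
# each segment maps to its children, listed from left to right.
tree = {
    "Final assembly": ["Component A", "Component B", "Component C"],
    "Component A": ["Assembly A"],
    "Assembly A": ["Part A", "Part B"],
    "Component C": ["Assembly B"],
    "Assembly B": ["Part C", "Part D"],
}

def preorder(segment, children=tree):
    """Visit a segment first, then each of its subtrees from left to right."""
    path = [segment]
    for child in children.get(segment, []):
        path.extend(preorder(child, children))
    return path

hierarchical_path = preorder("Final assembly")
```

Because access always starts at the root and proceeds from the left, a segment on the far right (here Part D) is reached only after every segment before it in the path.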


• Bank systems commonly use the HDBM
• A customer account can be subject to many transactions (1:M
relationship)
• The relationship is fixed (debiting and crediting)
• Large numbers of transactions are accessed frequently
Hierarchical Database Model
• Advantages
• Conceptual simplicity: relationship between layers is
logically simple; design process is simple
• Database security: enforced uniformly through the
system
• Data integrity
• Data independence: automatic cascading of data type
changes in database
• Efficiency in 1:M relationships and when users require
large numbers of transactions
• Dominant in the 1970s, when mainframe systems
with large databases were used
Hierarchical Database Model
• Disadvantages
• Complex implementation: physical data storage
characteristics; database design is complicated
• Difficult to manage
• Lack of standards
• Lacks structural independence: navigational system
• Applications programming and use complexity (pointer
based)
• Implementation limitations: in particular, it
handles only 1:M relationships
Network Database Model (NDBM)
• Each record can have multiple parents
• The Database Task Group (DBTG) was created to define
standards
• Three crucial database components
• Network schema: conceptual organization of the entire
database
• Database name, record types and the components of each record
• Subschema: the portion of the database seen by
application programs
• Database management language: defines data
characteristics and data structure
• Schema data definition language (DDL): defines schema
components
• Subschema data definition language
• Data manipulation language (DML): manipulates data content
Network Database Model
• Each record can have multiple parents
• Introduces the set to describe a relationship
• Each set has an owner record and a member record,
parallel to parent and child in the HDBM
• A member may have several owners
• Within a set, each member has one owner
Network Database Model
• Advantages
• Conceptual simplicity, just like the HDBM
• Handles more relationship types (but all are 1:M
relationships)
• Data access flexibility
• Promotes database integrity
• Data independence
• Conformance to standards
• Disadvantages
• System complexity
• Lack of structural independence

Relational Database Model (RDBM)

• Lets the user or database designer operate in a human,
logical environment
• Perceived by the user as a collection of tables for data
storage, while the RDBMS handles the physical
details.
• Tables are a series of row/column intersections
• Tables are related by sharing common entity
characteristics
• It allows 1:1, 1:M, M:N relationships
Relational Database Model

Relational Database Model
• Advantages
• Structural independence: the data access path is
irrelevant to database design; changes in structure do
not affect access to the data
• Improved conceptual simplicity
• Easier database design, implementation,
management, and use
• Ad hoc query capability with SQL (4GL is added)
• Powerful database management system
• Disadvantages
• Substantial hardware and system software overhead
• Poor design and implementation is made easy
• May promote “islands of information” problems
Entity Relationship Database Model
(ERDBM)
• Complements the relational data model concepts
• ERDBM introduces a graphic representation of entities and their relationships
• ERDBM is based on several components
• Entity: corresponds to a table in the relational model (RDBM)
• Entity set: a collection of like entities
• Each entity has attributes that describe it, which
are similar to fields in a table
• Relationship and connection
• Represented in an entity relationship diagram (ERD):
Chen’s ERD model and Crow’s Foot ERD
• Based on entities, attributes, and relationships
Entity Relationship Database Model
[Figure not reproduced: ER diagram with an entity, a relationship and a connection labelled]
Entity Relationship Database Model
• Advantages
• Exceptional conceptual simplicity
• Visual representation
• Effective communication tool
• Integrated with the relational database model
• Disadvantages
• Limited constraint representation
• Limited relationship representation (internal
relationships cannot be depicted; multiple
relationships)
• No data manipulation language
• Loss of information content

Object-Oriented Database Model (OODBM)

• Semantic data model (SDM) -> Object-oriented data
model (OODM)
• Object-oriented concepts:
• Objects, or abstractions of real-world entities, are
stored
• Attributes describe properties
• A collection of similar objects is a class, similar to an entity set
but also containing procedures (methods)
• Methods represent real-world actions of classes
• Classes are organized in a class hierarchy
• Inheritance is the ability of an object to inherit the attributes and
methods of the classes above it
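These concepts map directly onto classes in an object-oriented language. The Python sketch below uses invented class names (`Person`, `Student`) purely for illustration and shows attributes, methods, a two-level class hierarchy and inheritance. It illustrates the object concepts only, not how an OODBMS stores objects.

```python
class Person:
    """A class: a collection of similar objects with attributes and methods."""

    def __init__(self, name):
        self.name = name  # attribute: describes a property of the object

    def describe(self):  # method: an action available to objects of the class
        return f"{self.name} (person)"


class Student(Person):
    """A subclass lower in the class hierarchy; it inherits from Person."""

    def __init__(self, name, programme):
        super().__init__(name)
        self.programme = programme

    def enrol(self, course):
        return f"{self.name} enrolled in {course}"


# An object is one instance of a class.
student = Student("Ama", "MSc ICT")
inherited_description = student.describe()  # method inherited from Person
```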
Object-Oriented Database Model (OODBM)

• Contains implementation and procedure (operation)
information for more complicated data such as
graphics, video, and other metadata
• Supports transactions and information
• Reusability
• Portable to powerful computing systems
Object-Oriented Database Model

OO Database Model
• Advantages
• Adds semantic content (gives data greater meaning)
• Visual presentation includes semantic content
• Database integrity
• Both structural and data independence
• Disadvantages
• Lack of OODM standards
• No generalized data manipulation language or access
method
• Complex navigational data access
• Steep learning curve
• Challenging to design and implement properly
• High system overhead slows transactions
Types of Database Systems
• Many types of database systems exist, developed
based on the database models discussed
• However, other types of databases exist that are
not based on these models.
• With the explosion of different types of data and data
needs, newer database systems have been developed
• Others exist based on the data architecture
Types of Database Systems
• NoSQL
• means Non-SQL/Not Only SQL
• is a type of database that is used for storing a
wide range of data sets.
• is not a relational database as it stores data not
only in tabular form but in several different
ways.
• came into existence when the demand for
building modern applications increased.
• We can further divide a NoSQL database into the
following four types:

Types of Database Systems
• NoSQL has four types
• Key-value storage
• It is the simplest type of database storage: it
stores every single item as a key (or attribute name)
together with its value.
• Examples are Redis, Riak, Oracle NoSQL, and
Amazon SimpleDB.
• Document-oriented Database
• A type of database used to store data as JSON-like
documents.
• It helps developers store data by using the
same document-model format as used in the
application code.
• Examples are MongoDB, Azure Cosmos DB, Amazon
DynamoDB, and Amazon DocumentDB
Types of Database Systems
• NoSQL has four types
• Graph Databases
• It is used for storing vast amounts of data in a
graph-like structure.
• Most commonly, social networking websites use
graph databases.
• Examples are Neo4j, OrientDB, ArangoDB, and
AllegroGraph
• Wide-column stores
• It is similar to the data represented in relational
databases.
• Here, data is stored together in large columns,
instead of in rows.
• Examples are Apache Cassandra, ScyllaDB, Apache
HBase and Google Bigtable.
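The shape of the data in each of the four NoSQL types can be approximated with plain Python structures. This is only an in-memory sketch with invented keys and values; it does not use or imitate the API of any of the products named above.

```python
import json

# Key-value: every item is a key stored together with its value.
key_value_store = {"session:42": "user=ama;theme=dark"}

# Document: a JSON-like document, nested the same way as in application code.
document = json.loads(
    '{"_id": 1, "name": "Ama", "courses": [{"code": "MICT713", "grade": "A"}]}'
)

# Graph: nodes plus edges, here "who follows whom" in a social network.
follows = {"ama": {"kofi", "esi"}, "kofi": {"esi"}, "esi": set()}

def followers_of(user):
    """Return, in alphabetical order, everyone with an edge pointing at user."""
    return sorted(u for u, followed in follows.items() if user in followed)

# Wide-column: row key -> column family -> column -> value.
# Different rows may hold different columns.
wide_column = {
    "user#1": {"profile": {"name": "Ama"}, "activity": {"last_login": "2024-01-01"}},
    "user#2": {"profile": {"name": "Kofi", "city": "Accra"}},
}
```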
Types of Database Systems
• Cloud Database
• is a type of database where data is stored in a virtual
environment and runs on a cloud
computing platform.
• is a database built to run in a public or hybrid cloud
environment to help organize, store, and manage
data within an organization.
• provides users with various cloud computing
services (SaaS, PaaS, IaaS, etc.) for accessing the
database (DBaaS).
• There are numerous cloud platforms; commonly cited
options include:
• Amazon Web Services (AWS), Microsoft Azure, Kamatera,
phoenixNAP, ScienceSoft, Google Cloud SQL, etc.
Types of Database Systems
• Cloud Database
• SaaS – Software as a Service
• allows users to connect to and use cloud-based
applications, such as a DBMS, over the Internet.
• PaaS – Platform as a Service
• is a complete development and deployment
environment in the cloud, with resources for
developing apps from simple cloud-based apps to
sophisticated, enterprise applications.
• IaaS – Infrastructure as a Service
• is a cloud computing model that provides on-demand
access to computing resources such as
servers, storage, networking, and virtualization.
Types of Database Systems
• Characteristics of “Internet age” databases
• Flexible, efficient, and secure Internet access
• Easily used, developed, and supported
• Supports agility, scalability and reduced cost
• Supports complex data types and relationships
• Seamless interfaces with multiple data sources and
structures
• Simplicity of conceptual database model
• Many database design, implementation, and
application development tools
• Powerful DBMS GUIs make the DBA’s job easier
