Chapter 9
Data Design
Learning outcomes
Explain basic data design concepts, including data structures, DBMS, and the relational database model
Understand normalisation
Identify the various codes to simplify output, input and data formats
Leaving out data storage tools and techniques, data coding and data control measures (detail is in your DBA216D subject)
Data Design Concepts
◦ A systems analyst must understand basic data design concepts (data structure and the relational DB model)
◦ A Data structure is a framework for organizing, storing and managing data.
◦ These are tables / files that contain the data
◦ There is a 2-system storage / file-oriented system / file processing approach, where data is stored in two separate systems that are not
connected. This causes issues like inconsistency and data redundancy that threaten the quality and integrity of the data.
◦ On the other hand, there is a DBMS (Database Management System) that links these files/tables through common fields, which avoids
duplicated data. A database organised this way is known as a relational database or relational model.
◦ A DBMS is a collection of tools, features and interfaces that enables addition, update, management, access and analysis of data.
◦ The main ADVANTAGE is that it offers timely, interactive, flexible data access. Other advantages:
◦ Scalability – system can expand, modify, downsize easily to meet the needs.
◦ Economy of scale – better utilisation of hardware (servers and network). When an organisation can efficiently handle a high volume of processing,
it is referred to as economy of scale.
◦ Stronger standards – it ensures data name, format and documentation are uniform.
◦ Better security – the DBA defines the authorisation, meaning different users have different levels of access
◦ Data Independence – DBA can change the structure of data without modifying the IS that use the data.
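The idea of linking tables through a common field can be sketched with Python's built-in sqlite3 module. The customer and orders tables and their values are made up for illustration; customer_id is the common field.

```python
import sqlite3

# Hypothetical example data: table and field names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        order_date  TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Thandi', 'Pretoria');
    INSERT INTO orders VALUES (100, 1, '2024-03-01');
    INSERT INTO orders VALUES (101, 1, '2024-03-09');
""")

# The customer's city is stored once, so one update is seen by every order.
conn.execute("UPDATE customer SET city = 'Johannesburg' WHERE customer_id = 1")

# The common field customer_id joins the two tables.
rows = conn.execute("""
    SELECT o.order_id, c.name, c.city
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```

In a file-oriented system the city would be copied into each order file, and an update made in one copy but not the other would leave the data inconsistent.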
DBMS components
1. Interfaces for user, DB admin and related systems
Data manipulation language (DML) controls database operations like storing, retrieving, updating and deleting data (Oracle and DB2 use DML), whereas MS
Access uses a graphical environment with menu-driven commands.
USERS – query language – specifies a task but not how it will be done. It is written in statements that resemble English.
Query by example (QBE) – the user provides an example of the answer wanted.
Structured Query Language (SQL) is the query language that a client PC uses to communicate with the server.
DB ADMIN is concerned with data security, integrity, preventing unauthorised access, providing backup, recovery and audit trails, maintaining the DB and supporting the users.
The DBMS supports the DBA to create and update data structures, and to collect and report patterns and irregularities.
RELATED IS – a DBMS can support several related IS that provide input to and require specific data from the DBMS. No human intervention is required.
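The four DML operations named above (storing, retrieving, updating and deleting) can be sketched with Python's built-in sqlite3 module. This is only an illustration: the product table and its values are made up, and SQLite stands in for a DBMS such as Oracle or DB2.

```python
import sqlite3

# Hypothetical table and data, used only to show the four DML operations.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (code TEXT PRIMARY KEY, description TEXT, price REAL)"
)

# Store: INSERT adds new records.
conn.execute("INSERT INTO product VALUES (?, ?, ?)", ("HW01", "Hammer", 89.5))
conn.execute("INSERT INTO product VALUES (?, ?, ?)", ("HW02", "Spanner", 120.0))

# Update: change a field of an existing record.
conn.execute("UPDATE product SET price = ? WHERE code = ?", (95.0, "HW01"))

# Delete: remove a record.
conn.execute("DELETE FROM product WHERE code = ?", ("HW02",))

# Retrieve: SELECT says what is wanted, not how to find it.
remaining = conn.execute(
    "SELECT code, description, price FROM product"
).fetchall()
print(remaining)
```

Note that the SELECT statement only describes the result; the DBMS decides how to fetch it, which is the point made about query languages.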
2. Schema
Schema – the complete definition of a database, including all fields, tables and relationships
Subschema – a view of the database used by one or more systems/users. It only defines those portions of the database that a particular system/user has access to. It is used to
restrict access.
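One way to picture a subschema is as a database view that exposes only some fields. The sketch below uses Python's sqlite3 module with a made-up employee table; SQLite has no user permissions, so the view here only illustrates the idea of a restricted portion, not real access control.

```python
import sqlite3

# Hypothetical schema: the full employee table includes a sensitive salary field.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        department  TEXT,
        salary      REAL
    );
    INSERT INTO employee VALUES (1, 'Sipho', 'Sales', 25000.0);
    INSERT INTO employee VALUES (2, 'Anna', 'IT', 32000.0);

    -- Subschema-style view: same data, but the salary field is left out.
    CREATE VIEW employee_directory AS
        SELECT employee_id, name, department FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY employee_id")
visible_fields = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()
print(visible_fields)
print(directory_rows)
```

A system or user that is only given the employee_directory view never sees the salary field, even though it exists in the full schema.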
My notes:
This is an entity: single noun
This is a relationship: verb
You get:
1–1
1–M
M–N - here you will need a 3rd table, called an associative entity, to break up the many-to-many relationship
Many includes all numbers from zero to multiple.
Cardinality describes the numeric relationship between 2 entities and shows how instances of 1 entity relate to instances of another. It is commonly drawn in
crow's foot notation.
Crow's foot symbols shown: 1 or Many; Zero or 1
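The note about breaking up an M–N relationship with a 3rd table can be sketched as follows, again with sqlite3. The student, course and enrolment tables, and all values in them, are hypothetical; enrolment plays the role of the associative entity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_code TEXT PRIMARY KEY, title TEXT);

    -- Associative entity: one row per (student, course) pair.
    -- Its PK combines the PKs of the two entities it links.
    CREATE TABLE enrolment (
        student_id  INTEGER REFERENCES student(student_id),
        course_code TEXT    REFERENCES course(course_code),
        PRIMARY KEY (student_id, course_code)
    );

    INSERT INTO student VALUES (1, 'Lerato');
    INSERT INTO student VALUES (2, 'Johan');
    INSERT INTO course VALUES ('CRS101', 'First course');
    INSERT INTO course VALUES ('CRS102', 'Second course');
    INSERT INTO enrolment VALUES (1, 'CRS101');
    INSERT INTO enrolment VALUES (1, 'CRS102');
    INSERT INTO enrolment VALUES (2, 'CRS101');
""")

# One student can take many courses ...
courses_per_student = conn.execute("""
    SELECT s.name, COUNT(e.course_code)
    FROM student AS s
    LEFT JOIN enrolment AS e ON e.student_id = s.student_id
    GROUP BY s.student_id
    ORDER BY s.student_id
""").fetchall()

# ... and one course can have many students.
students_per_course = conn.execute("""
    SELECT c.course_code, COUNT(e.student_id)
    FROM course AS c
    LEFT JOIN enrolment AS e ON e.course_code = c.course_code
    GROUP BY c.course_code
    ORDER BY c.course_code
""").fetchall()
print(courses_per_student, students_per_course)
```

The M–N relationship between student and course has become two 1–M relationships: student to enrolment, and course to enrolment.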
Normalisation
Standard Notation – table structure, field and PK
Repeating groups - 1 or more fields that occur several times in a single record, each occurrence has a different value.
1st Normal form (NF) – no repeating groups, expand the PK to include PK of the repeating group (combination PK)
2nd Normal Form (NF) – 1st NF AND all fields that are NOT part of the PK are functionally dependent on the whole (combined) PK.
The fields in the table must be functionally dependent on the PK, for example date and order. Without an order there cannot be an order date.
Break the combined table into 2 or more tables, each with its respective PK. (FYI: … means there are more fields to come)
Now place the other fields with their respective PK. When this is done, remove any table that did not get additional fields.
3rd Normal Form (NF) – 1st and 2nd NF AND every nonkey field depends on the key only, not on another nonkey field. Remove the fields that are not dependent on the PK and put them into another table.
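The three steps can be traced on a small example. The sketch below uses plain Python dictionaries rather than a database, and the order, product and customer data are invented for illustration; the dictionary keys stand in for the PKs.

```python
# Hypothetical unnormalised records: each order carries a repeating group of products.
unnormalised = [
    {"order_id": 1, "order_date": "2024-03-01",
     "customer_id": 7, "customer_name": "Thandi",
     "products": [
         {"product_id": "P1", "description": "Hammer", "qty": 2},
         {"product_id": "P2", "description": "Spanner", "qty": 1},
     ]},
    {"order_id": 2, "order_date": "2024-03-05",
     "customer_id": 7, "customer_name": "Thandi",
     "products": [
         {"product_id": "P1", "description": "Hammer", "qty": 5},
     ]},
]

# 1NF: remove the repeating group; the PK becomes (order_id, product_id).
first_nf = [
    {**{k: v for k, v in order.items() if k != "products"}, **item}
    for order in unnormalised
    for item in order["products"]
]

# 2NF: fields that depend on only part of the combined PK get their own table.
orders_2nf = {
    r["order_id"]: {"order_date": r["order_date"],
                    "customer_id": r["customer_id"],
                    "customer_name": r["customer_name"]}
    for r in first_nf
}
products = {r["product_id"]: {"description": r["description"]} for r in first_nf}
order_lines = {(r["order_id"], r["product_id"]): {"qty": r["qty"]} for r in first_nf}

# 3NF: customer_name depends on customer_id (a nonkey field), so it moves out.
customers = {
    o["customer_id"]: {"customer_name": o["customer_name"]}
    for o in orders_2nf.values()
}
orders = {
    order_id: {"order_date": o["order_date"], "customer_id": o["customer_id"]}
    for order_id, o in orders_2nf.items()
}
print(len(first_nf), products, order_lines, customers, orders)
```

After the last step each fact is stored once: the description of product P1 and the name of customer 7 appear in a single place, even though both occur in more than one order.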
Codes
Code is a set of letters or numbers that represents a data item. Used to simplify output, input and data formats.
Codes are shorter than the data they represent, so they save storage space and costs, reduce data transmission time and decrease data entry time. They can also
conceal or reveal information.
Types of code:
Sequence – letters or numbers assigned in a specific order. Indicates order of entry. Example: McDonald's order number
Block sequence codes – use blocks of numbers for different classifications. Example: SYA216D (216)
Alphabetic code – uses alphabet to distinguish items based on category, abbreviation, or easy to remember/ mnemonic code.
Category – Two alphabet letters to identify group: example: HW for Hardware
Abbreviated – NY for New York
Mnemonic – easy to remember, example: PTA –Pretoria, JHB - Johannesburg
Significant digit code – uses subgroups of digits to distinguish items. Example: the branch code of your bank account
Derivation code – combines data from different item attributes or characteristics.
Cipher code – uses a keyword to encode a number, so the keyword leads to a number.
Action code – indicates what action is to be taken with an associated item. Example: Lockdown level 5
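Three of the code types above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the four-digit sequence format, the rule used to build the derivation code, and the keyword CAMPGROUND, whose ten different letters stand for the digits 1 to 9 and then 0.

```python
from itertools import count


def sequence_codes(start=1):
    """Yield sequence codes in order of entry: 0001, 0002, ..."""
    for n in count(start):
        yield f"{n:04d}"


def derivation_code(surname, postal_code, house_number):
    """Combine parts of several attributes into one code (hypothetical rule)."""
    return f"{postal_code}{surname[:3].upper()}{house_number}"


def cipher_encode(number, keyword="CAMPGROUND"):
    """Encode a non-negative integer: keyword letters stand for 1..9, then 0."""
    if len(keyword) != 10 or len(set(keyword)) != 10:
        raise ValueError("keyword must have 10 distinct letters")
    digit_to_letter = {
        str((i + 1) % 10): letter for i, letter in enumerate(keyword)
    }
    return "".join(digit_to_letter[d] for d in str(number))


order_numbers = sequence_codes()
print(next(order_numbers), next(order_numbers))
print(derivation_code("Mokoena", "0083", 12))
print(cipher_encode(562))
```

With this keyword, 5 maps to G, 6 to R and 2 to A, so 562 is written as GRA. Anyone who knows the keyword can read the number; anyone else sees only letters, which is how a code can conceal information.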
Designing a code
◦ Keep code concise – must not be too long
◦ Allow for expansion – allow for codes to grow
◦ Keep codes stable – constant changes can cause inconsistency and require updates
◦ Make codes unique – HW is it houseware or hardware?
◦ Use sortable codes
◦ Use a simple structure – don't mix different letter and number patterns in the same set of codes.
◦ Avoid confusion – O can be confused for zero 0
◦ Make codes meaningful – easy to remember, user-friendly, convenient, easy to interpret.
◦ Use code for a single purpose
◦ Keep codes consistent
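Some of these guidelines can be checked automatically. The function below is a rough sketch, not a standard tool: the name check_codes, the length limit of 8 and the choice of O and I as confusable letters are all assumptions. It tests for uniqueness, length, confusable characters and a consistent letter/digit structure.

```python
CONFUSABLE = set("OI")  # assumption: letters easily mistaken for the digits 0 and 1


def code_pattern(code):
    """Reduce a code to its structure, e.g. HW01 -> LLDD (L = letter, D = digit)."""
    return "".join(
        "L" if ch.isalpha() else "D" if ch.isdigit() else ch for ch in code
    )


def check_codes(codes, max_length=8):
    """Return a list of problems found in a set of codes (empty list = none found)."""
    problems = []
    seen = set()
    patterns = set()
    for code in codes:
        if code in seen:
            problems.append(f"{code}: not unique")
        seen.add(code)
        if len(code) > max_length:
            problems.append(f"{code}: longer than {max_length} characters")
        if CONFUSABLE & set(code.upper()):
            problems.append(f"{code}: contains a confusable character")
        patterns.add(code_pattern(code))
    if len(patterns) > 1:
        problems.append("inconsistent structure: " + ", ".join(sorted(patterns)))
    return problems


print(check_codes(["HW01", "HW02"]))
print(check_codes(["HW01", "1-HW"]))
```

Guidelines such as "make codes meaningful" or "use a code for a single purpose" depend on human judgement and cannot be checked this way.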
Normalisation is discussed
The various codes to simplify output, input and data formats are identified
Data storage tools and techniques, data coding and data control measures are left for the subject DBA216D as it is
dedicated to this information.