Topic 4. Database Concepts

A database is an organized collection of related data, typically managed by a database management system (DBMS), allowing for efficient access and updates. Unlike traditional file systems, databases minimize data redundancy and support concurrent access, ensuring data integrity and security. Key concepts include tables, relationships, and keys, with primary keys uniquely identifying records and foreign keys linking tables together.

Uploaded by

hafidhadam2002
Database Concepts

What is a Database?
A Database is a collection of logically related data organized in a way
that data can be easily accessed, managed and updated.
OR
A database is an organized collection of structured information, or data,
typically stored electronically in a computer system.
A database is usually controlled by a database management system
(DBMS). Together, the data and the DBMS, along with the applications
that are associated with them, are referred to as a database system,
often shortened to just database.
Data within the most common types of databases in
operation today is typically modeled in rows and columns
in a series of tables to make processing and data
querying efficient. The data can then be easily accessed,
managed, modified, updated, controlled, and organized.
Most databases use structured query language (SQL) for
writing and querying data.
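As a rough illustration of writing and querying data with SQL, here is a minimal sketch using Python's built-in sqlite3 module. The "people" table, its columns and the sample names are assumptions made up for this example; they do not come from the slides.

```python
import sqlite3

# In-memory database; the "people" table and its rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")

# Writing data: each tuple becomes one row of the table.
conn.executemany(
    "INSERT INTO people (name, age) VALUES (?, ?)",
    [("Asha", 31), ("Juma", 24), ("Neema", 45)],
)

# Querying data: names of everyone older than 30, in alphabetical order.
older_than_30 = [
    row[0]
    for row in conn.execute("SELECT name FROM people WHERE age > 30 ORDER BY name")
]
print(older_than_30)
```

Other database systems accept very similar SQL statements, although data types and connection details differ from product to product.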
What is Data?
Data is a collection of distinct units of information. It can take a
variety of forms, such as text, numbers and media. In computing terms,
data is information that has been translated into a form suitable for
efficient movement and processing.
Example: Name, age, weight, height, etc.
Traditional File System
File processing systems were an early attempt to computerize the manual filing
system that we are all familiar with. A file system is a method for storing and
organizing computer files and the data they contain to make it easy to find and
access them. File systems may use a storage device such as a hard disk or CD-
ROM and involve maintaining the physical location of the files.
The manual filing system works well when the number of items to be stored is
small. It even works quite adequately when there are large numbers of items
and we have only to store and retrieve them. However, the manual filing system
breaks down when we have to cross-reference or process the information in
the files. For example, a typical real estate agent’s office might have a separate
file for each property for sale or rent, each potential buyer and renter, and each
member of staff.
Characteristics of File Processing System

I. It is a group of files storing data of an organization.
II. Each file is independent from one another.
III. Each file is called a flat file.
IV. Each file contained and processed information for one specific
function, such as accounting or inventory.
V. As systems became more complex, file processing systems offered
little flexibility, presented many limitations, and were difficult to
maintain.
Limitations of the File Processing System / File-Based
Approach
I. Data redundancy
Often, within an organization, files and applications are created by
different programmers from various departments over long periods of
time. This can lead to data redundancy, a situation in which the same
field is stored, and therefore has to be updated, in more than one file.
This practice can lead to several problems such as:
• Inconsistency in data format
• The same information being kept in several different places (files)
• Data inconsistency, a situation where various copies of the same data
conflict; the duplication also wastes storage space and duplicates effort
II. Data isolation
Data isolation means that data is scattered across various files, and
those files may be in different formats. This is a problem because:
• It is difficult for new applications to retrieve the appropriate data,
which might be stored in various files.
III. Integrity problems
Problems with data integrity are another disadvantage of using a
file-based system. Data integrity refers to the maintenance and assurance
that the data in a database are correct and consistent. Factors to
consider when addressing this issue are:
• Data values must satisfy certain consistency constraints that are
specified in the application programs.
• It is difficult to make changes to the application programs in order to
enforce new constraints.
IV. Concurrent access
Concurrency is the ability of the database to allow multiple users access
to the same record without adversely affecting transaction processing.
A file-based system must manage, or prevent, concurrency by the
application programs. Typically, in a file-based system, when an
application opens a file, that file is locked. This means that no one else
has access to the file at the same time.
In database systems, concurrency is managed thus allowing multiple
users access to the same record. This is an important difference
between database and file-based systems.
V. Security problems
Security can be a problem with a file-based approach because:
• There are constraints regarding accessing privileges.
• Application requirements are added to the system in an ad-hoc manner so
it is difficult to enforce constraints.
• For example, consider a banking system. The Customer Transaction file
has details about the total available balance of all customers. A customer
wants information about his account balance. In a file system it is difficult
to give the customer access to only his data in the file. Thus enforcing
security constraints for the entire file or for certain data items is difficult.
Database Approach
To remove the limitations of the file-based approach, a new and more
effective approach was required, known as the database approach.
The database is a shared collection of logically related data, designed to meet
the information needs of an organization. A database is a computer-based
record-keeping system whose overall purpose is to record and maintain information.
The database is a single, large repository of data, which can be used
simultaneously by many departments and users. Instead of disconnected files
with redundant data, all data items are integrated with a minimum amount of
duplication.
The database is no longer owned by one department but is a shared corporate
resource. The database holds not only the organization’s operational data but
also a description of this data. For this reason, a database is also defined as a self-
describing collection of integrated records. The description of the data is known
as the Data Dictionary or Meta Data (the ‘data about data’). It is the self-
describing nature of a database that provides program-data independence.
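To make the idea of a self-describing database a little more concrete, the sketch below asks a SQLite database for the description of its own contents. It assumes Python's built-in sqlite3 module and a hypothetical "staff" table; sqlite_master and PRAGMA table_info are SQLite's own catalog facilities, and other DBMSs expose their data dictionary in different ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE staff (staff_id INTEGER PRIMARY KEY, name TEXT, salary REAL)")

# sqlite_master is SQLite's built-in catalog: it stores the definition of every table.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(staff)")]

print(table_names)
print(columns)
```

Because the structure is stored alongside the data, a program can discover table and column names at run time instead of hard-coding a file layout, which is one way to read "program-data independence".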
The intent of a database is that a collection of data should serve as many
applications as possible. Therefore, a database is often thought of as a
repository of information needed to run certain functions in a corporation or
organization. It would permit not only the retrieval of data but also the
continuous modification of data needed for the control of operations. It may
be possible to search the database to obtain answers to questions or
information for planning purposes.
In a typical file-processing system, permanent records are stored in different
files. Many different application programs are written to extract the records
and add the records to the appropriate files. But this scheme has several
major limitations and disadvantages, such as data redundancy (duplication of
data), data inconsistency, maladaptive data, non-standard data, insecure
data, incorrect data, etc. A database management system is an answer to all
these problems as it provides centralized control of the data.
Database Abstraction
A major purpose of a database is to provide users with only as much
information as they require. This means that the system does
not disclose all the details of the data; rather, it hides some details of
how the data is stored and maintained. The complexity of the database is
hidden from users through multiple levels of abstraction, which
simplifies their interaction with the system.
The different levels of the database are implemented through three
layers:
1. Internal Level (Physical Level): The lowest level of abstraction, the
internal level, is closest to physical storage. It describes how the data is
stored concretely on the storage medium.
2. Conceptual Level: This level of abstraction describes what data is
concretely stored in the database. It also describes the relationships
that exist between the data. At this level, databases are described
logically in terms of simple data structures. Users at this level are not
concerned with how these logical data structures will be implemented
at the physical level.
3. External Level (View Level): It is the level closest to users and is related
to the way the data is viewed by individual users.
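One possible way to picture the three levels is sketched below with Python's built-in sqlite3 module. The "staff" table and the "staff_directory" view are hypothetical names chosen for this example. The table definition plays the role of the conceptual level, the view plays the role of an external level, and the internal level (how rows are laid out in storage) is handled entirely by SQLite and never appears in the code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full logical structure of the data.
conn.execute("CREATE TABLE staff (staff_id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(1, "Asha", 2500.0), (2, "Juma", 1800.0)],
)

# External (view) level: a directory for ordinary users that hides the salary column.
conn.execute("CREATE VIEW staff_directory AS SELECT staff_id, name FROM staff")

cursor = conn.execute("SELECT * FROM staff_directory ORDER BY staff_id")
view_columns = [description[0] for description in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)
print(view_rows)
```

A user who only queries the view sees names and identifiers, never salaries and never the storage format.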
What is organized data?
Data organization is the practice of categorizing and classifying data to
make it more usable. Similar to a file folder, where we keep important
documents, you'll need to arrange your data in the most logical and
orderly fashion, so you and anyone else who accesses it can easily find
what they're looking for.
How is organized data stored efficiently?
A common and efficient way to store organized data is in a relational
database.
A relational database is a collection of data items with pre-defined
relationships between them. These items are organized as a set of tables
with columns and rows. Tables are used to hold information about the
objects to be represented in the database.
A relational database consists of 3 high-level components:
I. Tables
II. Relationships
III. Keys
TABLES.
A table is the database equivalent of a single Microsoft Excel spreadsheet. Tables can
also be thought of as standalone datasets. Tables are used to organize the
most closely related data together. A very basic example of a table could be a
dataset about people that contains a bunch of people’s names, job titles,
manager numbers, hiring dates, salaries, and commissions.
This information would be stored in a column and row format. Rows and
columns also happen to be the very foundation of a table.
Columns store different pieces of information about one person, while
rows store information about different people. Paired
together, they form a table full of information.
Columns.
Columns are used to differentiate the information we have on a single observable entity. In
a Table that contains information about people, the columns would be used to hold
different information. If a Table, as mentioned above, contains people’s names, job titles,
manager numbers, hiring dates, salaries, and commissions, then that table will have 6
columns plus a Primary Key column that we will discuss in later sections.
Each column can be set up to allow only a specific type of information to be entered into it.
This allows for much-needed data integrity. For example, a column about salary
should only contain numbers, right? While that is true, the people operating the databases
are human and can therefore accidentally enter something else in it. To prevent this from
happening, columns can be designed to accept only a specific type of information.
The same goes for an email column. Anything that does not end in the expected domain, such as ‘@abc.com’,
should not be allowed inside that column.
The customization that goes into a column is pretty much endless. There are many presets
available and custom options too.
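The salary and email rules described above could be expressed roughly as follows. This is only a sketch using Python's built-in sqlite3 module and SQL CHECK constraints; the table layout and the helper function try_insert are hypothetical, and other DBMSs offer stricter column types that make some of these checks unnecessary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE people (
        name   TEXT NOT NULL,
        salary REAL CHECK (typeof(salary) IN ('integer', 'real')),
        email  TEXT CHECK (email LIKE '%@abc.com')
    )
    """
)


def try_insert(name, salary, email):
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO people VALUES (?, ?, ?)", (name, salary, email))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("Asha", 2500, "asha@abc.com")            # numeric salary, valid email
rejected_salary = try_insert("Juma", "lots", "juma@abc.com")   # salary is not a number
rejected_email = try_insert("Neema", 1800, "neema@example.org")  # wrong email domain

print(accepted, rejected_salary, rejected_email)
```

The point of putting the rules in the column definitions is that every application writing to the table is held to the same checks.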
Rows.
Each row of a table represents one of the observable entities we are
looking at. To put it simply, if the people table has 3 rows, it means it
has the data of 3 different people. Each row represents an individual
person and the columns will display their respective information.
Rows allow us to see individual entries in the table. Each row also
contains a Primary Key that allows us to search for individual entries
with ease.
Keys
Keys allow unique identification for all rows in the table. Without keys there
would be no way to differentiate between entries that have identical
information in their columns. Two people in a table can have the same names
and birthdays and without a unique key, it will be hard to differentiate
between them and can lead to unnecessary confusion.
Suppose you’re an HR person who has to send a termination letter to a guy
named John Doe and a promotion letter to another person with the same
name. Imagine if that gets mixed up, both receive the termination or
promotion letter. Talk about a corporate nightmare, right?
There are two types of keys you should know: a primary key and a foreign key.
PRIMARY KEY
A primary key, also called a primary keyword, is a key in a relational database that
is unique for each record. It is a unique identifier, such as a driver license number,
telephone number (including area code), or vehicle identification number (VIN).
A table in a relational database should have one and only one primary key. Primary
keys typically appear as columns in relational database tables.
The choice of a primary key in a relational database often depends on the
preference of the administrator. It is possible to change the primary key for a
given table when the specific needs of the users change. For example, the
people in a town might be uniquely identified according to their driver license
numbers in one application, but in another situation it might be more convenient
to identify them according to their telephone numbers.
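The uniqueness rule for primary keys can be tried out with a small sketch. It assumes Python's built-in sqlite3 module and a hypothetical "people" table whose person_id column is the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (person_id INTEGER PRIMARY KEY, name TEXT, birthday TEXT)")

# Two different people with identical names and birthdays,
# told apart only by their primary key values.
conn.execute("INSERT INTO people VALUES (1, 'John Doe', '1990-01-01')")
conn.execute("INSERT INTO people VALUES (2, 'John Doe', '1990-01-01')")

# Reusing an existing primary key value is rejected by the database.
try:
    conn.execute("INSERT INTO people VALUES (1, 'Jane Roe', '1985-05-05')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(duplicate_rejected, row_count)
```

Both John Does are stored without confusion, while the third insert fails because key value 1 is already taken.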
FOREIGN KEY
A foreign key is a column or columns of data in one table that refers
to the primary key data in another table (the parent table).
To ensure the links between foreign key and primary key tables aren't
broken, foreign key constraints can be created to prevent actions that
would damage the links between tables and prevent erroneous data
from being added to the foreign key column.
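A foreign key constraint of the kind described here might look like the sketch below, written with Python's built-in sqlite3 module. The simplified "customers" and "orders" tables are assumptions for the example. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection; many other DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement for this connection
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE orders (
        order_num   INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers (customer_id)
    )
    """
)

conn.execute("INSERT INTO customers VALUES (20151, 'Engel''s Books')")
conn.execute("INSERT INTO orders VALUES (76654, 20151)")  # refers to an existing customer

# An order for a customer that does not exist would break the link, so it is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (99999, 12345)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```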
Difference between Primary Key and Foreign Key
I. Basics of Primary Key vs Foreign Key
A primary key is a special key in a relational database that acts as a
unique identifier for each record meaning it uniquely identifies each
row/record in a table and its value should be unique for each row of
the table. A foreign key, on the other hand, is a field in one table that
links two tables together. It refers to a column or a group of columns
that uniquely identifies a row of another table or of the same table.
II. Relation of Primary Key vs Foreign Key
A primary key uniquely identifies a record in the relational database
table, whereas a foreign key refers to the field in a table which is the
primary key of another table. A primary key must be unique, and only
one primary key, which must be defined, is allowed in a table, whereas
more than one foreign key is allowed in a table.
III. Duplicate Values of Primary Key vs Foreign Key
A primary key is a combination of UNIQUE and NOT NULL constraints, so
no duplicate values are allowed in a primary key field in a
relational database table. No two rows are allowed to carry duplicate
values for a primary key attribute. Unlike a primary key, a foreign key can
contain duplicate values, and a table in a relational database can contain
more than one foreign key.
IV. NULL of Primary Key vs Foreign Key
One of the main differences between the two is that unlike primary
keys, foreign keys can also contain NULL values. A table in a relational
database can have only one primary key which does not allow NULL
values.

V. Temporary Table of Primary Key vs Foreign Key
A primary key constraint can be defined on temporary tables
and table variables, whereas in some systems (Microsoft SQL Server, for
example) a foreign key constraint is not enforced on local or global
temporary tables.
VI. Deletion of Primary Key vs Foreign Key
A primary key value cannot be deleted from the parent table while it is
still referenced by a foreign key in the child table. You have to delete the
referencing rows in the child table first before removing the row from the
parent table. In contrast, a foreign key value can be deleted from the child
table even if the value refers to a primary key in the parent table.
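The deletion rule can be checked with a short sketch. It assumes Python's built-in sqlite3 module, hypothetical "parent" and "child" tables, and a foreign key declared without any ON DELETE action, so the default behaviour (reject the delete) applies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement for this connection
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER REFERENCES parent (id))"
)
conn.execute("INSERT INTO parent VALUES (1)")
conn.execute("INSERT INTO child VALUES (10, 1)")

# Deleting the parent row while a child row still refers to it is rejected.
try:
    conn.execute("DELETE FROM parent WHERE id = 1")
    parent_delete_blocked = False
except sqlite3.IntegrityError:
    parent_delete_blocked = True

# Deleting the child row first is allowed, and then the parent row can go too.
conn.execute("DELETE FROM child WHERE id = 10")
conn.execute("DELETE FROM parent WHERE id = 1")
parents_left = conn.execute("SELECT COUNT(*) FROM parent").fetchone()[0]

print(parent_delete_blocked, parents_left)
```

Declaring the foreign key with an action such as ON DELETE CASCADE would change this behaviour; the sketch deliberately uses the default.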
NB:
Keys play a crucial role in the existence of database schema to establish links
between tables and within a table. Keys establish relationships and enforce
different types of integrity, especially table-level and relationship-level integrity.
For one, they make sure each table contains unique records and that the fields you use to
establish a relationship between tables contain matching values. Primary
key and foreign key are the two most important and common types of keys used
in relational databases. A primary key is a special key used to uniquely identify
records in a table, whereas a foreign key is used to establish relationship
between two tables. Both are identical in structure but play different roles in
relational database schema.
Why use Primary Key?
Here are the benefits of using a primary key.
I. The main aim of the primary key is to identify each and every record in the
database table.
II. You can use a primary key when you do not want to allow null values to be entered.
III. If you delete or update records, the action you specified will be undertaken to
ensure data integrity.
IV. A restrict operation can reject a delete or update operation on the parent
table.
V. In some DBMSs, data is physically organized in the sequence of the clustered
index, which is often built on the primary key.
Why use Foreign Key?
Here are the important reasons for using a foreign key.
I. Foreign keys help you to migrate entities using a primary key from the
parent table.
II. A foreign key enables you to link two or more tables together.
III. It makes your database data consistent.
IV. A foreign key can be used to match a column or combination of
columns with the primary key in a parent table.
V. A SQL foreign key constraint is used to ensure referential
integrity: values in the child table must match values in the parent table.
RELATIONSHIP
Relationships are meaningful associations between tables that contain
related information; they are what make databases useful. Without some
connection between tables in a database, you may as well be working
with disparate spreadsheet files rather than a database system.
Every table contains a field known as an entity (or primary) key, which
identifies the rows within that table. By telling your database that the
key values in one table correspond to key values in another, you create
a relationship between those tables; these relationships make it
possible to run powerful queries across different tables in your
database. When one table’s entity key gets linked to a second table, it’s
known as a foreign key in that second table.
Identifying the connections you’ll need between tables is part of the
data modeling and schema design process; that is, the process of
figuring out how your data fits together, and how exactly you should
configure your tables and their fields. This process often involves
creating a visual representation of tables and their relationships, known
as an entity relationship diagram (ERD), with different notations specifying
the kinds of relationships.
Types of Database Relationships.
I. One to One
This type of relationship allows only one record on each side of the
relationship. The primary key relates to only one record (or none) in
another table. A one-to-one (1:1) relationship means that each record
in Table A relates to one, and only one, record in Table B, and each
record in Table B relates to one, and only one, record in Table A. Look at
the following example of tables from a company's Employees database:
PERSONAL
EmployeeID  FirstName   LastName  Address              City             State  Zip
EN1-10      Carol       Schaaf    2306 Palisade Ave.   Union City       NJ     07087
EN1-12      Gayle       Murray    1855 Broadway        New York         NY     12390
EN1-15      Steve       Baranco   742 Forrest St.      Kearny           NJ     07032
EN1-16      Kristine    Racich    416 Bloomfield St.   Hoboken          NJ     07030
EN1-19      Barbara     Zumbo     24 Central Ave.      Ritchfield Park  NJ     07660
EN1-20      Daniel      Gordon    2 Angelique St.      Weehawken        NJ     07087
EN1-22      Jacqueline  Rivet     3600 Bergeline Ave.  Union City       NJ     07087
EN1-23      Betsy       Rosyln    1800 Boulevard East  Weehawken        NJ     07086
EN1-25      Will        Strick    2100 91st St.        North Bergen     NJ     07047
EN1-26      Susan       Shipe     240 Fifth Ave.       New York         NY     10018
PAYROLL
EmployeeID  PayRate
EN1-10      $25.00
EN1-12      $27.50
EN1-15      $20.00
EN1-16      $19.00
EN1-19      $22.75
EN1-20      $23.00
EN1-22      $22.50
EN1-23      $19.50
EN1-25      $12.50
EN1-26      $14.00
Above, tables with a one-to-one relationship from a database of
information about employees
Each record in the Personal table is about one employee. That record
relates to one, and only one, record in the Payroll table. Each record in
the Payroll table relates to one, and only one, record in the Personal
table. (This is what looking at it from both directions means.)
In a one-to-one relationship, either table can be considered to be the
primary or parent table.
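One way a one-to-one relationship like Personal and Payroll could be enforced is sketched below with Python's built-in sqlite3 module. Only two employees from the sample data are loaded, the snake_case column names are an assumption, and making employee_id both the primary key and a foreign key of the payroll table is just one possible design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement for this connection
conn.execute(
    "CREATE TABLE personal (employee_id TEXT PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
# employee_id is the primary key (at most one payroll row per employee)
# and a foreign key (every payroll row must belong to an existing employee).
conn.execute(
    """
    CREATE TABLE payroll (
        employee_id TEXT PRIMARY KEY REFERENCES personal (employee_id),
        pay_rate    REAL
    )
    """
)
conn.executemany(
    "INSERT INTO personal VALUES (?, ?, ?)",
    [("EN1-10", "Carol", "Schaaf"), ("EN1-12", "Gayle", "Murray")],
)
conn.executemany(
    "INSERT INTO payroll VALUES (?, ?)",
    [("EN1-10", 25.00), ("EN1-12", 27.50)],
)

# Each personal record pairs with exactly one payroll record.
pairs = conn.execute(
    """
    SELECT p.first_name, y.pay_rate
    FROM personal AS p
    JOIN payroll AS y ON y.employee_id = p.employee_id
    ORDER BY p.employee_id
    """
).fetchall()

# A second payroll row for the same employee would break the 1:1 rule, so it is rejected.
try:
    conn.execute("INSERT INTO payroll VALUES ('EN1-10', 30.00)")
    second_row_rejected = False
except sqlite3.IntegrityError:
    second_row_rejected = True

print(pairs, second_row_rejected)
```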
II. ONE TO MANY
A one-to-many relationship allows a single record in one table
to be related to multiple records in another table. One-to-many
relationships are the most common type of relationship between tables
in a database. In a one-to-many (sometimes called many-to-one) relationship, a
record in one table corresponds to zero, one, or many records in another table.
For example, a record in Table A can relate to zero, one, or many records in
Table B. Many records in Table B can relate to one record in Table A.
The potential relationship is what's important; for a single record in Table A,
there might be no related records in Table B, or there might be only one related
record, but there could be many. Look at the following tables about a
company's Customers and Orders.
CUSTOMERS
CustomerID  CustomerName           Address              City             State  Zip
20151       Engel's Books          19 International Dr  Ryebrook         NY     10273-9764
20493       Jamison Books          396 Apache Ave       Fountain Valley  CA     92708-4982
20512       Gardening Galore       79 Gessner Pk        Houston          TX     77024-6261
20688       Books Abound           51 Ulster St         Denver           CO     80237-3386
20784       Book World             687 Mountain Rd      Stowe            VT     08276-3196
20926       The Corner Bookstore   36 N. Miller Ave     Syracuse         NY     13206-4976
20932       Allendale Books        512 Columbia Rd      Someville        NJ     08876-2987
21570       In Between the Covers  2008 Delta Ave       Cincinnati       OH     45208-4468
21587       Books and Beyond       51 Windsor St        Cambridge        MA     02139-2123
21965       Cover to Cover         12 Harbor St         Burlington       VT     04982-2977
ORDERS
OrderNum  CustomerID  OrderDate  ShipDate  Shipper
76654     20151       2/1/00     2/6/00    USPS
74432     20151       6/30/99    7/2/99    Federal Express
75987     20151       11/10/99   11/12/99  UPS
62922     20493       9/5/99     9/6/99    UPS
65745     20493       10/1/99    10/3/99   USPS
72212     20493       4/22/00    4/25/00   UPS
73547     20493       8/17/99    8/20/99   UPS
69211     21570       5/12/99    5/12/99   Federal Express
70343     21587       10/2/00    10/4/00   UPS
72833     21587       12/14/99   12/17/99  UPS
Above, tables with data about customers and orders that have a one-
to-many relationship
The Customers table holds a unique record for each customer. Each
customer can (and, we hope, does) place many orders. Many records in
the Orders table can relate to only one record in the Customers table.
This is a one-to-many relationship (1:N) between the Customers table
and the Orders table.
In a one-to-many relationship, the table on the one side of the
relationship is the primary table and the table on the many side is the
related table.
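The "zero, one, or many" nature of this relationship can be seen by counting orders per customer. The sketch below uses Python's built-in sqlite3 module and loads only a subset of the Customers and Orders rows shown above; the snake_case column names are an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT)")
conn.execute(
    """
    CREATE TABLE orders (
        order_num   INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers (customer_id),
        shipper     TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(20151, "Engel's Books"), (20493, "Jamison Books"), (20512, "Gardening Galore")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (76654, 20151, "USPS"),
        (74432, 20151, "Federal Express"),
        (75987, 20151, "UPS"),
        (62922, 20493, "UPS"),
        (65745, 20493, "USPS"),
        (72212, 20493, "UPS"),
        (73547, 20493, "UPS"),
    ],
)

# LEFT JOIN keeps customers that have no orders, so their count comes out as 0.
orders_per_customer = dict(
    conn.execute(
        """
        SELECT c.customer_name, COUNT(o.order_num)
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.customer_name
        """
    )
)
print(orders_per_customer)
```

Each order row carries exactly one CustomerID, while a customer may appear in the Orders table any number of times, including not at all.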
III. MANY TO MANY
This is a complex relationship in which many records in a table can link
to many records in another table. Each record of the first table
can relate to any number of records (or no records) in the second table. Similarly,
each record of the second table can relate to more than one
record of the first table. It is also represented as an N:N relationship.
EMPLOYEES
EmployeeID  LastName  FirstName  ProjectNum
EN1-26      O'Brien   Sean       30-452-T3
EN1-26      O'Brien   Sean       30-457-T3
EN1-26      O'Brien   Sean       31-124-T3
EN1-33      Guya      Amy        30-452-T3
EN1-33      Guya      Amy        30-482-TC
EN1-33      Guya      Amy        31-124-T3
EN1-35      Baranco   Steven     30-452-T3
EN1-35      Baranco   Steven     31-238-TC
EN1-36      Roslyn    Elizabeth  35-152-TC
EN1-38      Schaaf    Carol      36-272-TC
EN1-40      Wing      Alexandra  31-238-TC
EN1-40      Wing      Alexandra  31-241-TC
PROJECTS
ProjectNum  ProjectTitle                             EmployeeID
30-452-T3   Woodworking Around The House             EN1-26
30-452-T3   Woodworking Around The House             EN1-33
30-452-T3   Woodworking Around The House             EN1-35
30-457-T3   Basic Home Electronics                   EN1-26
30-482-TC   The Complete American Auto Repair Guide  EN1-33
31-124-T3   The Sport Of Hang Gliding                EN1-26
31-124-T3   The Sport Of Hang Gliding                EN1-33
31-238-TC   The Complete Baseball Reference          EN1-35
31-238-TC   The Complete Baseball Reference          EN1-40
31-241-TC   Improving Your Tennis Game               EN1-40
35-152-TC   Managing Your Personal Finances          EN1-36
36-272-TC   Using Electronic Mail Effectively        EN1-38
Examine the sample data above. These tables hold data about
employees and the projects to which they are assigned. Each project
can involve more than one employee and each employee can be
working on more than one project (the "do more with less" thing). This
constitutes a many-to-many (N:N) relationship.
Above, tables with a many-to-many relationship.
Most RDBMSs do not support many-to-many relationships directly.
The usual solution is to break this relationship down into two one-to-
many relationships by creating an intersection or junction table. This
table would hold the primary key field from each of the tables in the
many-to-many relationship. In the new table, those fields together
would be a multi-field primary key, resulting in the following
relationships.
This forms two one-to-many relationships: each employee can work on
many projects and many employees can work on a single project.
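A junction table of this kind might be sketched as follows with Python's built-in sqlite3 module. The table name "assignments" and the snake_case column names are hypothetical; only a few employees and projects from the sample data are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement for this connection
conn.execute(
    "CREATE TABLE employees (employee_id TEXT PRIMARY KEY, last_name TEXT, first_name TEXT)"
)
conn.execute("CREATE TABLE projects (project_num TEXT PRIMARY KEY, project_title TEXT)")

# Junction table: one row per (employee, project) pair.
# The two foreign keys together form the multi-field primary key.
conn.execute(
    """
    CREATE TABLE assignments (
        employee_id TEXT REFERENCES employees (employee_id),
        project_num TEXT REFERENCES projects (project_num),
        PRIMARY KEY (employee_id, project_num)
    )
    """
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("EN1-26", "O'Brien", "Sean"), ("EN1-33", "Guya", "Amy")],
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?)",
    [("30-452-T3", "Woodworking Around The House"), ("30-457-T3", "Basic Home Electronics")],
)
conn.executemany(
    "INSERT INTO assignments VALUES (?, ?)",
    [("EN1-26", "30-452-T3"), ("EN1-26", "30-457-T3"), ("EN1-33", "30-452-T3")],
)

# One employee, many projects.
projects_for_sean = [
    row[0]
    for row in conn.execute(
        """
        SELECT p.project_title
        FROM assignments AS a
        JOIN projects AS p ON p.project_num = a.project_num
        WHERE a.employee_id = ?
        ORDER BY p.project_num
        """,
        ("EN1-26",),
    )
]

# One project, many employees.
staff_on_woodworking = [
    row[0]
    for row in conn.execute(
        """
        SELECT e.last_name
        FROM assignments AS a
        JOIN employees AS e ON e.employee_id = a.employee_id
        WHERE a.project_num = ?
        ORDER BY e.employee_id
        """,
        ("30-452-T3",),
    )
]
print(projects_for_sean, staff_on_woodworking)
```

With this layout each employee and each project is stored once, and the repeated names and titles seen in the two sample tables above disappear.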
