Lecture 1 IS099
Lecture 1: Database Concepts
As the name of the course suggests, a database management system (DBMS) has two parts: the database and the
management system.
To understand what a database is, we have to start from data, which is the basic building block of any DBMS.
In a computerized information system, data is the basic resource of the organization. Proper organization and
management of data is therefore essential for the organization to run smoothly.
Data and Information
Data is the raw facts, figures, or statistics that can be recorded and that have implicit meaning. For example,
consider the names, telephone numbers, and addresses of the people you know.
Data can also be defined as the representation of facts, concepts, or instructions in a formal manner that is
suitable for understanding and processing.
Any organization requires accurate and reliable data for better decision making, for ensuring the privacy of data,
and for controlling data efficiently.
Data can be represented in letters (A-Z, a-z), in digits (0-9), and using special characters (+, -, ., #, $, etc.).
Example: 1, Ravi, 19.
What is information?
Information is processed data on which decisions and actions are based.
It can be defined as data that has been organized and classified to provide meaningful values. Example: “The age of
Ravi is 19.”
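The step from data to information can be sketched in a few lines of Python. This is only an illustration: the field names (roll_no, name, age) are assumptions, since the example above does not say what the value 1 stands for.

```python
# Raw data: isolated values with no stated context (taken from the example above).
record = (1, "Ravi", 19)

# Hypothetical field names that give each value its meaning.
fields = ("roll_no", "name", "age")


def to_information(record, fields):
    """Organize raw values under field names and return a meaningful statement."""
    labeled = dict(zip(fields, record))
    return f"The age of {labeled['name']} is {labeled['age']}"


print(to_information(record, fields))  # The age of Ravi is 19
```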
Data vs. Information
What is the difference?
Data: Data can be defined in many ways. Information science defines data as unprocessed information.
Information: Information is data that have been organized and communicated in a coherent and meaningful manner.
An alternative to manual data handling is a computerized way of dealing with the information. The computerized
approach can be either decentralized or centralized, based on where the data resides in the system.
2. File Processing System (FPS)
After computers were introduced to the business community for data processing, the need to use them for
data storage and processing increased.
File-based data handling approaches were an early attempt to computerize the manual filing system. There were,
and still are, several computer applications that use file-based processing for data handling.
• In FPS, groups of records are stored in separate files. Each file is independent of the others. Thus, this
approach is the decentralized computerized data handling method.
• Each file contains the information for one specific task and is processed for that task.
• Files are designed by using programming languages.
• Programs are dependent on the files and vice versa; that is, when the physical format of a file is changed,
the program also has to be changed.
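The dependence between program and file format can be illustrated with a small Python sketch. The fixed-width record layout (10 characters for the name, 5 for the balance) and the function name are hypothetical, chosen only to show the effect.

```python
# Hypothetical fixed-width layout, hard-coded in the program:
# columns 0-9 hold the name, columns 10-14 hold the balance.
def read_balance_v1(line):
    name = line[0:10].strip()
    balance = int(line[10:15])
    return name, balance


old_format_line = "Ravi".ljust(10) + "00500"
print(read_balance_v1(old_format_line))  # ('Ravi', 500)

# The physical format of the file changes: the name field grows to 15 characters.
new_format_line = "Ravi".ljust(15) + "00500"

try:
    read_balance_v1(new_format_line)
except ValueError as exc:
    # The unchanged program now slices into the padding of the name field.
    print("old program breaks:", exc)
```

Nothing about the data itself changed, only its layout, yet the program has to be rewritten to read it.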
Example of FPS Scenario
Consider part of a savings-bank enterprise that keeps information about all customers and savings accounts.
To allow users to manipulate the information, the system has a number of application programs that manipulate the
files. System programmers wrote these application programs to meet the needs of the bank. New application
programs are added to the system as the need arises.
For example, suppose that the savings bank decides to offer checking accounts. As a result, the bank creates new
permanent files that contain information about all the checking accounts maintained in the bank, and it may have to
write new application programs to deal with situations that do not arise in savings accounts.
Thus, as time goes by, the system acquires more files and more application programs.
DISADVANTAGES OF FPS
Data redundancy and inconsistency: Data redundancy means duplication of data values, i.e., the same
information may be written (duplicated) in several files. For example, the address and telephone number of a
particular customer may appear in a file that consists of savings-account records and in a file that consists of
checking-account records.
• This redundancy (duplication) wastes time, money, and storage space.
• It may lead to data inconsistency; that is, different copies of the same data no longer match.
- For example, a changed customer address may be reflected in savings-account records but not elsewhere in the
system.
- The phone number of a customer may be different in different files.
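A minimal Python sketch of how such an inconsistency arises is given here. The two files are modeled as in-memory CSV text, and the customer name, addresses, and column names are invented for illustration.

```python
import csv
import io

# Two independent "files", each with its own copy of the customer's address.
savings_file = io.StringIO(
    "customer,address,balance\n"
    "Ravi,12 Lake Road,500\n"
)
checking_file = io.StringIO(
    "customer,address,balance\n"
    "Ravi,12 Lake Road,200\n"
)


def load(f):
    """Read all records of a file into a list of dictionaries."""
    f.seek(0)
    return list(csv.DictReader(f))


savings = load(savings_file)
checking = load(checking_file)

# The customer moves; only the savings program updates its own copy.
savings[0]["address"] = "7 Hill Street"


def inconsistent_customers(a, b):
    """Return customers whose address differs between the two record sets."""
    address_in_b = {row["customer"]: row["address"] for row in b}
    return [
        row["customer"]
        for row in a
        if row["customer"] in address_in_b
        and address_in_b[row["customer"]] != row["address"]
    ]


print(inconsistent_customers(savings, checking))  # ['Ravi']
```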
Separation and isolation of data: Data are scattered across different files, and the files may be in different
formats. That means each program maintains its own set of data.
• Users of one program may be unaware of potentially useful data held by other programs. To make a decision, a
user might need data from two separate files.
• Writing a new application program to retrieve or access the data is difficult.
• Imagine the work involved if data from several files were needed.
Difficulty in accessing data: Suppose a bank officer needs to find out the names of all customers who live within a
particular postal-code area. The officer asks the data-processing department to generate such a list. Because the
designers of the original system did not anticipate this request, there is no application program on hand to meet it.
• There is, however, an application program to generate the list of all customers. The bank officer now has two
choices: either obtain the list of all customers and extract the needed information manually, or ask a system
programmer to write the necessary application program.
Thus, conventional file processing systems do not allow needed data to be retrieved in a convenient and
efficient manner.
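The "necessary application program" from the scenario above might look like the following Python sketch. The file layout, the sample rows, and the function name are assumptions; the point is that a separate program has to be written for each such request.

```python
import csv
import io

# Hypothetical customer file; the column names and rows are invented.
customer_file = io.StringIO(
    "name,postal_code\n"
    "Ravi,1000\n"
    "Sara,2000\n"
    "Abel,1000\n"
)


def customers_in_postal_code(f, postal_code):
    """A one-off program written only to answer this single request."""
    f.seek(0)
    return [
        row["name"]
        for row in csv.DictReader(f)
        if row["postal_code"] == postal_code
    ]


print(customers_in_postal_code(customer_file, "1000"))  # ['Ravi', 'Abel']
```

If the officer later asks for customers in a postal-code area who also hold more than a given balance, yet another program is needed.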
Atomicity: It is difficult to ensure atomicity in a file processing system. A computer system, like any other
mechanical or electrical device, is subject to failure. In many applications, it is crucial that, if a failure occurs, the
data be restored to the consistent state that existed prior to the failure. For example, transfer of funds from one
account to another should either complete or not happen at all. Otherwise, it will result in an inconsistent database
state.
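What "complete or not happen at all" means can be sketched with a transaction, here using Python's built-in sqlite3 module as a stand-in for a DBMS. The account table, the account names A and B, the balances, and the simulated failure are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 100)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move funds from src to dst as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated failure after the debit")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the partial debit has already been rolled back


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


transfer(conn, "A", "B", 50, fail_midway=True)
print(balances(conn))  # {'A': 500, 'B': 100}

transfer(conn, "A", "B", 50)
print(balances(conn))  # {'A': 450, 'B': 150}
```

A program working directly on plain files would have to implement this undo logic itself.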
Concurrent access anomalies: For the sake of overall system performance and faster response, many
systems allow multiple users to update the data simultaneously. A file processing system, however, cannot safely
coordinate access to the same file by several transactions at the same time. In such an environment, the interaction
of concurrent updates may result in inconsistent data. Consider bank account A, containing $500. If two customers
withdraw funds (say $50 and $100 respectively) from account A at about the same time, the result of the concurrent
executions may leave the account in an incorrect (or inconsistent) state: if both read the old balance of $500 before
either writes, the account ends with $450 or $400 rather than the correct $350.
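The anomaly can be reproduced without real concurrency by splitting each withdrawal into a read step and a write step and interleaving them by hand. The following Python sketch uses the amounts from the example above; the helper names are hypothetical.

```python
# Account A starts with $500.
account = {"A": 500}


def read_balance():
    return account["A"]


def write_balance(value):
    account["A"] = value


# Interleaved schedule: both withdrawals read before either one writes.
seen_by_first = read_balance()        # first customer sees 500
seen_by_second = read_balance()       # second customer also sees 500
write_balance(seen_by_first - 50)     # account now holds 450
write_balance(seen_by_second - 100)   # account now holds 400; the $50 withdrawal is lost

interleaved_result = account["A"]
correct_result = 500 - 50 - 100

print(interleaved_result, correct_result)  # 400 350
```

Had the two writes happened in the opposite order, the account would hold $450 instead; either way $350 is never reached.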
Security problems: A file processing system provides no built-in security to protect the data from unauthorized
user access. For example, in a university, payroll personnel need to see only that part of the database that has
financial information. They do not need access to information about academic records. But, since application
programs are added to the file-processing system in an ad hoc manner, enforcing such security constraints is difficult.
Integrity problems: The data values stored in the database must satisfy some integrity constraints. For example,
the balance of a bank account may never fall below a certain amount (say $25). Developers enforce these constraints
in the system by adding appropriate code in the various application programs. However, when new constraints are
added, it is difficult to change the programs to enforce them. The problem is compounded when constraints involve
several data items from different files.
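For contrast, a DBMS usually lets such a constraint be declared once with the data instead of being repeated in every program. The following is a minimal sketch using Python's built-in sqlite3 module; the account table, the starting balance, and the withdraw helper are assumptions for illustration, and the $25 minimum comes from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 25))"  # declared once, with the data
)
conn.execute("INSERT INTO account VALUES ('A', 500)")
conn.commit()


def withdraw(conn, name, amount):
    """Return True if the withdrawal was applied, False if the constraint rejected it."""
    try:
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(withdraw(conn, "A", 400))  # True: the balance becomes 100
print(withdraw(conn, "A", 90))   # False: the balance would become 10
```

Every program that updates the table is subject to the same rule, so no application code has to be changed to enforce it.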
3. Database Approach
To overcome the limitations of the file processing system, a new approach was required. Hence the database
approach emerged.
The initial attempts were to provide a centralized collection of data. A database has a self-describing nature.
What is a database?
A database is an organized collection of logically related data of an organization, stored in a formatted way
with an implicit meaning.
• In a relational database, the most common kind, data is organized (stored) in row and column (tabular) format.
A database can be of any size and of varying complexity.
Types of databases
1. Relational databases
A relational database works by linking information from multiple tables through the use of “keys.”
- A key is a unique identifier which can be assigned to a row of data contained within a table.
- This unique identifier, called a “primary key,” can then be included in a record located in another table when that
record has a relationship to the primary record in the main table.
- When this unique primary key is added to a record in another table, it is called a “foreign key” in the associated
table. The connection between the primary and foreign key then creates the “relationship” between records
contained across multiple tables.
A relational database typically stores information in tables containing specific pieces and types of data.
- For example, a shop could store details of its customers’ names and addresses in one table and details of their
orders in another.
This form of data storage is often called structured data.
In relational database design, the database usually contains tables consisting of columns and rows.
When new data is added, new records are inserted into existing tables or new tables are added. Relationships can
then be made between two or more tables.
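The shop example and the primary key / foreign key link described above can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# customer_id is the primary key of the customer table.
conn.execute(
    "CREATE TABLE customer ("
    " customer_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " address TEXT NOT NULL)"
)
# The same identifier appears in orders as a foreign key.
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id),"
    " item TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ravi', '12 Lake Road')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, "notebook"), (11, 1, "pen")],
)
conn.commit()

# The relationship lets records from both tables be combined.
rows = conn.execute(
    "SELECT c.name, o.item FROM customer AS c"
    " JOIN orders AS o ON o.customer_id = c.customer_id"
    " ORDER BY o.order_id"
).fetchall()
print(rows)  # [('Ravi', 'notebook'), ('Ravi', 'pen')]

# An order that points at a non-existent customer is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (12, 99, 'ink')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)  # True
```

The customer's address is stored once, in the customer table, and every order reaches it through the key.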
Types of databases
2. Non-relational databases
Non-relational databases are different from traditional relational databases in that they do not use the tabular
schema of rows and columns. Instead, non-relational databases use a storage model that is optimized for the
specific requirements of the type of data being stored.
We place information in field categories that we create, so that the information is available for sorting and
disseminating the way we need it.
In a simple non-relational (flat-file) database, however, the data is often limited to the program that created it
and cannot easily be extracted and applied to other software programs or other database files within a school or
administrative system.
Relational database basic terms
• A database can also be referred to as a collection of related relations (tables). The relations are related because
a selected pair of tables shares some common attributes. Because of these common attributes, the data of two or more
tables can be combined, for example to find out the complete details of a student.
• Questions like “Which hostel does the youngest student live in?” can then be answered, although the Age and Hostel
attributes are in different tables.
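The question about the youngest student can be answered with a single query that combines the two tables on their common attribute. The following is a minimal sketch using Python's built-in sqlite3 module; the table names (student, residence), the common attribute roll_no, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Age is stored in one table, Hostel in another; roll_no is the common attribute.
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.execute(
    "CREATE TABLE residence ("
    " roll_no INTEGER REFERENCES student(roll_no),"
    " hostel TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ravi", 19), (2, "Sara", 18), (3, "Abel", 21)],
)
conn.executemany(
    "INSERT INTO residence VALUES (?, ?)",
    [(1, "Hostel A"), (2, "Hostel B"), (3, "Hostel A")],
)
conn.commit()

# Join on the common attribute, then pick the row with the smallest age.
youngest = conn.execute(
    "SELECT s.name, r.hostel FROM student AS s"
    " JOIN residence AS r ON r.roll_no = s.roll_no"
    " ORDER BY s.age ASC LIMIT 1"
).fetchone()
print(youngest)  # ('Sara', 'Hostel B')
```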
Some examples of database
Database applications
• Banking: For customer information, accounts, loans, and banking transactions (deposit and/or withdrawal).
• Credit card transactions: For purchases on credit cards and generation of monthly statements.
• Finance: For storing information about holdings, sales, and purchases of financial instruments such as stocks and
bonds; also for storing real-time market data to enable online trading by customers and automated trading by the
firm.
• Airlines: For reservations and schedule information. Airlines were among the first to use databases in a
geographically distributed manner.
• Telecommunication: For keeping records of calls made, generating monthly bills, maintaining balances on prepaid
calling cards, and storing information about the communication networks.