Chapter 1 Introduction to Databases Final
Introduction to Databases
Instructor: Muhammad Younas
Data
• A collection of raw facts and figures is called data
• The word raw means that the facts have not yet been processed to reveal their exact meaning.
• Data is collected from different sources
• It is collected for different purposes
• Data may consist of:
• numbers,
• characters,
• symbols or pictures etc.
Examples of Data
• When students get admission in colleges or universities,
they have to fill out an admission form.
• The form consists of raw facts about the students
• These raw facts are:
• student's name, father's name, address, etc.
• During a census, the Government of Pakistan collects data about all citizens
• Government stores this data permanently to:
• use it for different purposes at different times e.g. future planning etc.
Types of Data
• Numeric Data
• Numeric data consists of numeric digits from 0 to 9 like 10, 245
or -5.
• The numeric type of data may either be positive or negative.
• Alphabetic Data
• Alphabetic data consists of alphabetic letters from A to Z, a to z
and blank spaces e.g.
• "IT Series", "Computer" and "Islam" etc.
• Alphanumeric Data
• Alphanumeric data consists of numeric digits (0 to 9), letters (A
to Z) and special characters such as *, % and @, for example "87%",
"$300" and "H#17".
• Image Data
• This type of data includes charts, graphs, pictures and
drawings.
• This form of data is more comprehensive. It can be
transmitted as a set of bits.
• Audio Data
• Sound is a representation of audio. Audio data includes
music, speech or any type of sound.
• Video Data
• Video is a set of full-motion images played at a high speed.
Video is used to display actions and movements.
Information
• Definition:
• The processed data is called information.
• Characteristics:
• Organized and processed form of data.
• More meaningful than raw data.
• Essential for decision-making.
• Data and Processing:
• Data serves as input, and information is the output of processing.
• Information can be reused as data in further processing
• Example:
• Student Marks:
• Marks in different subjects are used as data to calculate total marks (information). This
information can then be further processed to find the average marks.
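The student-marks example can be sketched in a few lines of Python. The subject names and marks below are made-up values used only for illustration.

```python
# Raw data: marks obtained in each subject (hypothetical values).
marks = {"English": 72, "Mathematics": 85, "Physics": 64}

# First processing step: the total is information derived from the raw marks.
total = sum(marks.values())

# Second processing step: the total is reused as data to derive the average.
average = total / len(marks)

print(f"Total marks: {total}")
print(f"Average marks: {average:.2f}")
```

The same value (`total`) is the output of one step and the input of the next, which is the sense in which information can be reused as data.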
Examples of Information
• Example 1: Student Data
• In colleges, admission forms store raw facts about students.
Processing the data can yield a list of students from a specific area
(e.g., Faisalabad). This list, a form of processed data, is information.
• Example 2: Census Data
• Stored census data can be processed to find statistics like the total
number of graduates or the literacy rate. Government uses this
processed information for important decisions, such as improving
literacy rates.
• Example 3: Product Survey Data
• Organizations can process survey data to understand customer
satisfaction levels. The information helps organizations improve
product quality based on customer feedback.
Difference Between Data and Information
Data | Information
1. Data consists of unprocessed raw facts. | 1. Information is the processed form of data.
2. Data is used as input in the computer. | 2. Information is the output of the computer.
3. Data is not meaningful. | 3. Information is meaningful.
4. Data is normally huge in volume. | 4. Information is normally short in volume.
5. Data is the asset of organizations and is not available to people for sale. | 5. Information is normally available to people for sale.
6. Data is difficult or even impossible to reproduce. For example, if census data is lost, it is almost impossible to reproduce. | 6. Information is easier to reproduce if lost. For example, if the list of illiterate citizens is lost, it can be reproduced from stored data.
7. Data is used rarely. | 7. Information is used frequently.
8. Data is an independent entity. | 8. Information depends on data.
9. Data is not used in decision-making. | 9. Information is very important for decision-making.
Metadata
• Definition:
• Metadata is data about data.
• It describes the properties and characteristics of other data.
• Purpose and Characteristics:
• Details size, format, and other attributes of the data.
• Includes rules and constraints related to data.
• Example:
• When creating a table, metadata includes data type, size, format,
and constraints for each field.
• Metadata defines properties of data stored in the table, ensuring
data integrity.
Field Definitions for Student Table
Field Name | Data Type | Length | Description | Constraint
Roll No | Integer | 3 | Roll No of the student | Value from 1 to 100
Name | Alphabetic | 50 | Name of the student |
Address | Alphanumeric | 100 | Address of the student |
Email | Alphanumeric | 25 | Email of the student | Must contain @ and .
Phone | Alphanumeric | 25 | Phone of the student |
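As a rough sketch, the field definitions above could be declared to a DBMS as follows. SQLite (through Python's built-in `sqlite3` module) is an assumption made for illustration, as are the table name `student`, the column names, the helper `try_insert`, and the sample rows. The "Alphabetic" restriction on Name is not enforced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# The metadata (type, length, constraint) is declared once, with the table.
conn.execute(
    """
    CREATE TABLE student (
        roll_no INTEGER CHECK (roll_no BETWEEN 1 AND 100),
        name    TEXT CHECK (length(name) <= 50),
        address TEXT CHECK (length(address) <= 100),
        email   TEXT CHECK (length(email) <= 25
                            AND email LIKE '%@%' AND email LIKE '%.%'),
        phone   TEXT CHECK (length(phone) <= 25)
    )
    """
)

def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO student VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert((1, "Ali", "Faisalabad", "ali@example.com", "0300-1234567"))
rejected_roll = try_insert((101, "Sara", "Lahore", "sara@example.com", "0300-7654321"))
rejected_email = try_insert((2, "Sara", "Lahore", "sara-at-example", "0300-7654321"))

# The DBMS can report the stored metadata back: (column name, declared type).
columns = [(r[1], r[2]) for r in conn.execute("PRAGMA table_info(student)")]
print(accepted, rejected_roll, rejected_email)
print(columns)
```

Because the rules live in the table definition, every program that writes to the table is checked against them, which is how metadata supports data integrity.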
File Processing System
• Definition and Purpose
• Traditional (or simple) file processing was the first computer-based
method for handling business applications.
• Historically, organizations stored data in files on tape or disk
and managed it using file-processing systems.
• Structure of File Processing System
• Each department in an organization maintains its own set of
files, designed specifically for its applications.
• Records in one file are independent of records in other files,
with no direct relationships between them.
[Figure: two users, each working through a separate processing system (Student Processing System and Employee Processing System)]
Disadvantages of File Processing System
• Security Problems
• File processing systems lack adequate security features to
control access.
• Different access rights (e.g., read-only, delete, modify) for
different users are not supported.
• Example:
• A data entry operator should only be able to enter data, while
higher authorities (e.g., the chairman) may require complete data
access. A file processing system cannot enforce this distinction.
• Program Maintenance
• File processing systems require extensive maintenance.
• Maintenance often consumes a large portion of the budget,
making it difficult to develop new applications.
Database
• Definition:
• A database is an organized collection of related data that is
stored efficiently and compactly.
• Organized: Data is stored in a way that allows easy access and use.
• Related: Data in a database is usually about a specific topic.
• Efficient: Data can be searched quickly.
• Compact: Data occupies minimal storage space.
• Examples:
• A database for students stores data such as roll number, name, and
address.
• A database for employees stores data like employee ID, grade, and salary.
• Data Storage:
• In a relational database, all data is arranged in tables.
Tables
• Definition:
• Tables are the fundamental objects in a database used to
store data.
• A table consists of rows and columns.
• It provides a convenient way to store and manipulate data.
• Rows / Record
• A row is the horizontal part of a table. It is a collection of
related fields.
• For example:
• the above table has three rows.
• Each row contains the record of a different person.
• Columns / Field
• Columns are the vertical part of the table.
• For example:
• all values in the above table under "Name" field make a column
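The table referred to on these slides is not reproduced in this text, so the sketch below uses a hypothetical three-row table (made-up names and cities) to show what "row" and "column" mean in practice.

```python
# A hypothetical three-row table, held as a list of records (rows).
fields = ("roll_no", "name", "address")
table = [
    (1, "Ali", "Faisalabad"),
    (2, "Sara", "Lahore"),
    (3, "Usman", "Karachi"),
]

# A row (record) is one tuple: all the fields describing one person.
second_row = table[1]

# A column (field) is the value at the same position taken from every row.
name_index = fields.index("name")
name_column = [row[name_index] for row in table]

print(second_row)
print(name_column)
```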
Examples of Databases
• Phone Directory
• A simple database storing the phone numbers of different people. Organized
in such a way that allows easy searching for phone numbers.
• Library
• A library database stores records of books, library members, book issuance,
and recovery. Helps in searching for books and supports research work.
• Accounts
• Used for managing the financial system of an organization. Keeps records of
all financial transactions and can calculate annual profit, trial balances, and
ledgers.
• College
• A database used to manage records of students, fee transactions,
examination details, and other college-related data. Can also store student
attendance records.
Database Management System (DBMS)
• A DBMS is a collection of programs designed to create
and maintain a database.
• It provides various facilities such as:
• Defining the structure of the database, including specifying
data types, formats, and constraints.
• Storing data on a controlled storage medium.
• Inserting, deleting, updating, and retrieving specific data to
generate reports, etc.
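The facilities listed above can be tried out with a small DBMS. The sketch below assumes SQLite via Python's built-in `sqlite3` module; the `employee` table, its columns, and the sample values are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define the structure: column names, data types and a constraint.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " grade INTEGER,"
    " salary REAL CHECK (salary >= 0))"
)

# Insert records.
conn.executemany(
    "INSERT INTO employee (emp_id, grade, salary) VALUES (?, ?, ?)",
    [(1, 17, 90000.0), (2, 16, 70000.0), (3, 17, 95000.0)],
)

# Update one record and delete another.
conn.execute("UPDATE employee SET salary = 75000.0 WHERE emp_id = 2")
conn.execute("DELETE FROM employee WHERE emp_id = 3")
conn.commit()

# Retrieve specific data, for example as the basis of a report.
report = conn.execute(
    "SELECT emp_id, grade, salary FROM employee ORDER BY emp_id"
).fetchall()
print(report)
```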
Components of Database Environment
• Repository
• A collection of data definitions, relationships, output styles,
and report formats. Contains metadata essential for
managing the database.
• Database Management System (DBMS)
• A software system used to create and maintain databases.
• Database
• An organized collection of related data, structured for easy
storage, manipulation, and retrieval.
• Typically created to store data about a particular topic.
• Application Program
• A program used to send commands to the DBMS for database
manipulation.
• Users interact with the application program, which then communicates
with the DBMS.
• User Interface
• A visual environment for the user to communicate with the computer.
• Components of the interface include:
• Forms: Used for entering, retrieving, and updating data in the database.
• Menus: Provide a list of commands for performing various operations.
• Reports: Output generated by the system, formatted with graphs, charts, tables,
etc.
• Data Administrators
• Responsible for overseeing the entire information system. They
manage database access and monitor its usage.
• System Analysts and Application Programmers
• System Analysts:
• Determine user requirements and develop transaction specifications.
• Application Programmers:
• Implement these specifications into programs.
• End User
• Interacts directly with the application.
• Responsible for inserting, deleting, updating data, and retrieving
information from the system.
Database Approach
• The database approach offers several advantages over the
traditional file processing system.
• It emphasizes organizing data in a way that improves
• efficiency,
• consistency, and
• security.
Advantages of Database Approach
• Redundancy Control
• In a well-designed database, each data item is ideally stored only
once and is not duplicated across multiple files.
• For example, in a college database, student data is stored in a single
table, and other tables, like the "Marks" table, reference this student
data by using the student’s Roll No. This eliminates the need to store
duplicate information, making data management more efficient.
• Data Consistency
• Redundancy control ensures data consistency.
• When a data item is stored in only one place, any update to that data will
automatically be reflected everywhere it is used, thus maintaining
consistency across the system.
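A minimal sketch of the college example, assuming SQLite via Python's `sqlite3` module. The table and column names and the sample data are illustrative. The student's name is stored once in `student`, and `marks` refers to it only through the roll number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE marks (
        roll_no INTEGER REFERENCES student(roll_no),
        subject TEXT,
        marks   INTEGER
    );
    INSERT INTO student VALUES (1, 'Ali');
    INSERT INTO marks VALUES (1, 'English', 72), (1, 'Physics', 64);
    """
)

# The name is corrected in exactly one place...
conn.execute("UPDATE student SET name = 'Ali Khan' WHERE roll_no = 1")
conn.commit()

# ...and every marks row sees the new value through the join.
rows = conn.execute(
    "SELECT s.name, m.subject, m.marks"
    " FROM marks AS m JOIN student AS s ON s.roll_no = m.roll_no"
    " ORDER BY m.subject"
).fetchall()
print(rows)
```

Had the name been copied into every marks row, the update would have needed to touch each copy, and missing one would leave the data inconsistent.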
• Consistency Constraints
• These are rules enforced on the database to ensure data integrity.
• For example, constraints can ensure that data falls within a specific range, or
that it adheres to a required format before being entered into the database.
• If data does not meet these constraints, it will not be accepted by the system.
• Data Atomicity
• Atomicity refers to the concept that transactions must be completed
fully or not at all.
• In the case of a money transfer between two accounts (Account A and Account
B), if the system fails after deducting the amount from Account A but before
adding it to Account B, the transaction would be rolled back to prevent
inconsistencies.
• This ensures that data remains correct and consistent, either completing the
entire process or leaving no trace of the transaction.
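The money-transfer case can be sketched as below, assuming SQLite via Python's `sqlite3` module. The `account` table, the balances, and the `fail_midway` flag that simulates a crash are all illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 500), ('B', 100)")
conn.commit()

def transfer(amount, fail_midway=False):
    """Move `amount` from Account A to Account B as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,)
            )
            if fail_midway:
                raise RuntimeError("simulated system failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,)
            )
    except RuntimeError:
        pass  # by this point the deduction from A has been rolled back

def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))

transfer(200, fail_midway=True)
after_failure = balances()   # unchanged: the half-finished transfer left no trace

transfer(200)
after_success = balances()   # both updates applied together

print(after_failure, after_success)
```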
• Data Security
• Data security protects the database from unauthorized
access.
• A DBMS provides various security measures, such as
password-based access control, ensuring that not all users
can access or modify all data.
• Different types of users (e.g., data entry operators, senior
officials) may be granted varying levels of access to ensure
sensitive information is protected.
• Reduced Development Time
• A database organizes data more efficiently than a file processing
system, making it easier and faster to develop applications that
use this data.
• Many DBMS provide tools and features that streamline the
development process, thus reducing the overall time required for
application development.
• Compact Storage
• Database management systems store data in a more compact
and efficient manner compared to file systems.
• It requires less storage space, thus saving system storage
resources and preventing unnecessary memory wastage.
• Easier Reporting
• Reports are essential for decision-making in organizations, and
databases make the process of creating reports much easier.
• The data in a database is well-organized and can be quickly
retrieved to generate different types of reports in the required
format.
• Data Sharing
• A developed database can be accessed and shared by multiple
users within an organization.
• Different applications can also share the same database,
avoiding the need to recreate data for each new application.
• Increased Concurrency
• DBMSs are designed to handle concurrent data access, allowing
multiple users to access the same data without interference.
• This prevents issues like data corruption or loss of integrity when
users simultaneously access or modify data.
• Improved Backup and Recovery
• Unlike file-based systems, DBMSs often come with built-in
facilities for data backup and recovery.
• In case of system failures, the DBMS minimizes the amount of
lost processing, and data can be quickly restored from backup.
• Data Independence
• The database approach provides data independence,
meaning data storage structures and operations can be
modified without affecting application programs.
• This flexibility ensures that changes to the data structure do
not require changes to the associated applications,
simplifying maintenance and updates.
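One common way to get this kind of independence is to let applications read through a view. The sketch below assumes SQLite via Python's `sqlite3` module; the names `student`, `student_v2`, `student_view` and `student_names` are hypothetical. The storage layout changes, yet the application function is not edited.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def student_names():
    """Application code: depends only on the columns exposed by the view."""
    return [r[0] for r in conn.execute(
        "SELECT name FROM student_view ORDER BY roll_no")]

# Original storage structure: one column holds the full name.
conn.executescript(
    """
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO student VALUES (1, 'Ali Khan'), (2, 'Sara Ahmed');
    CREATE VIEW student_view AS SELECT roll_no, name FROM student;
    """
)
before = student_names()

# The storage is reorganised: the name is now split across two columns.
conn.executescript(
    """
    DROP VIEW student_view;
    CREATE TABLE student_v2 (
        roll_no INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    INSERT INTO student_v2 VALUES (1, 'Ali', 'Khan'), (2, 'Sara', 'Ahmed');
    DROP TABLE student;
    CREATE VIEW student_view AS
        SELECT roll_no, first_name || ' ' || last_name AS name FROM student_v2;
    """
)
after = student_names()

print(before, after)
```

Only the view definition, which lives in the database, was changed. The function `student_names` ran unmodified against both layouts.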
Disadvantages of Database Approach
• High Cost of DBMS
• Database management systems (DBMS) are sophisticated and
large software packages, making them expensive to purchase
and implement.
• Organizations may incur high initial and ongoing costs for
licensing and support.
• Higher Hardware Cost
• A DBMS requires more powerful hardware due to its complexity
and resource demands.
• Additional memory, storage, and processing power may be
needed to run the system efficiently, increasing hardware costs.
• Higher Programming Cost
• DBMS software is complex, with many features that require
specialized knowledge to operate effectively.
• Organizations may need to hire skilled and experienced
database programmers, which can add extra costs due to their
expertise.
• High Conversion Cost
• Converting existing records from file-based systems to a
database system can be expensive and time-consuming.
• The conversion process may require significant restructuring of
data and changes in format to align with the database system.
• More Chance of Failure
• In a DBMS, resources and components are centralized, which
increases the risk of failure.
• If any critical component fails, it can halt the entire system,
potentially leading to downtime and data access issues.
• Complexity & Performance
• DBMSs are general-purpose software that must perform a variety
of tasks, adding complexity and making them harder to manage.
• In some specific applications, a DBMS may perform less efficiently
than a file processing system, especially when the complexity is
unnecessary for the task.
Difference Between File and Database
Approach
Application Program and DBMS
Relationship
• Application Program:
• An application program is software used to send commands
to a Database Management System (DBMS) for manipulating
the database.
• These commands are sent through a Graphical User Interface
(GUI) that allows users to interact with the DBMS.
• The application program acts as an intermediary between the
user and the database, ensuring the smooth execution of
database-related tasks.
• Examples of Application Programs:
• Developer 2000
• PowerBuilder
Relationship Between Application Program
and DBMS
• Front-End vs. Back-End:
• The application program is considered the front-end, while the database
managed by the DBMS is the back-end.
• The front-end (application program) provides a user-friendly interface, which
simplifies the process of interacting with the database for tasks like data
entry, updates, or retrieval.
• The back-end (DBMS) is responsible for data storage, retrieval, and
management.
• User Communication:
• The user interacts with the application program through its interface.
• The application program then communicates with the DBMS to retrieve or
modify data in the database.
• This layer of abstraction makes it easier for users to interact with complex
databases without needing to directly interact with the database's internal
structures.
• Reports Generation:
• Application programs also play a key role in generating reports
from the database.
• These reports are critical for decision-making in organizations, as
they present the data in a structured and understandable format.
• The DBMS helps manage the data efficiently, while the
application program formats it into meaningful reports.
• Summary of the Relationship:
• Application Program (Front-End) → User Interface (easy
interaction with data)
• DBMS (Back-End) → Data Storage and Management (data
handling)
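The front-end/back-end split can be sketched as follows, assuming SQLite via Python's `sqlite3` module as the back-end. The functions `add_student` and `city_report` are hypothetical stand-ins for what a form or a report menu would call. The user of these functions never writes SQL.

```python
import sqlite3

# Back-end: the database, managed by the DBMS (SQLite in this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Front-end: the application program translates user actions into commands.
def add_student(roll_no, name, city):
    """What a data-entry form would call when the user presses Save."""
    with conn:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", (roll_no, name, city))

def city_report(city):
    """Ask the DBMS for the data, then format it as report lines."""
    rows = conn.execute(
        "SELECT roll_no, name FROM student WHERE city = ? ORDER BY roll_no",
        (city,),
    )
    return [f"{roll_no:>3}  {name}" for roll_no, name in rows]

add_student(1, "Ali", "Faisalabad")
add_student(2, "Sara", "Lahore")
add_student(3, "Usman", "Faisalabad")

report = city_report("Faisalabad")
print("\n".join(report))
```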
Range of Database Applications
• Databases are essential for:
• storing, organizing, and retrieving data efficiently across different
levels of an organization.
• The range of database applications can be classified into
the following categories:
• Personal Computer Databases:
• Purpose:
• Designed for single-user environments, typically on a stand-alone
desktop computer or laptop.
• Examples:
• A medical representative using a laptop to store customer records.
• Key Decisions in Personal Computer Databases:
• Development or Purchase:
• Should the database be developed in-house or purchased?
• End-User or Professional Development:
• Should it be developed by the end-user or a professional in
information systems?
• Database Design:
• What data is required, and how should it be structured?
• Selection of DBMS:
• Which DBMS should be chosen for the application?
• Synchronization:
• How will the personal database be synchronized with other
databases?
• Accuracy:
• How will data accuracy be maintained?
• Challenges:
• Limited Sharing:
• It's difficult to share data quickly across different locations (e.g.,
different sales representatives' laptops) which can delay decision-
making processes.
• Workgroup Databases:
• Purpose: Designed to support small teams or workgroups
(typically fewer than 25 people) who collaborate on the
same project.
• Examples: A workgroup connected via a local area network (LAN),
where the database is stored on a central server.
• Key Decisions in Workgroup Databases:
• Database Optimization:
• How to optimize the database to meet the needs of different
members of the workgroup.
• Concurrency Control:
• Ensuring that multiple users can update the database concurrently
without issues.
• Location of Operations:
• Deciding whether operations should occur on the server or
workstation.
• Challenges:
• Security and Integrity:
• Ensuring that data remains secure and consistent, especially when
multiple users update data simultaneously.
• Department Databases:
• Purpose:
• Designed to support the functions of a specific department in an
organization (e.g., marketing, accounting, or production departments).
• Examples:
• A department managing its own database for records related to its daily
operations.
• Key Decisions in Department Databases:
• Database Design:
• How to design the database for efficient performance, especially to
handle a large number of users and transactions.
• Security:
• Ensuring proper security to protect data from unauthorized access.
• Tools for Complex Environments:
• Providing tools to manage the database in a complex department
environment.
• Data Redundancy and Consistency:
• Ensuring data consistency across multiple departments, avoiding
unnecessary redundancy.
• Distributed Database Needs:
• Determining if a distributed database is necessary, especially for
geographically dispersed users.
• Enterprise Databases:
• Purpose:
• Designed to support the operations of an entire organization or multiple
departments within the organization. These are more comprehensive and can
encompass data from various smaller databases like personal, workgroup, and
department databases.
• Example:
• A data warehouse is a prime example, which aggregates data from various
operational databases (personal, workgroup, and department databases).
• Key Decisions in Enterprise Databases:
• Data Distribution:
• How to distribute data across various locations within the organization.
• Standards Maintenance:
• Ensuring uniformity in data names, definitions, and formats across the
organization.
• Challenges:
• Data Integration:
• Combining data from multiple sources (personal computers,
workgroup databases, and department databases) into a unified
data warehouse that can support enterprise-wide reporting and
analysis.
• Maintenance and Updates:
• Periodically updating the data warehouse with data extracted from
other databases within the enterprise.
Types of Database Users
• In a database environment, different users play distinct roles based
on their interaction with the database system. These users include:
• Application Programmers:
• Role: Application programmers are professionals who write computer
programs in high-level languages.
• These programs are used to interact with databases.
• Responsibilities:
• Design and develop application programs according to user
requirements.
• Work with system analysts to understand specifications and create
database interfaces for users.
• Write code to access, manipulate, and present data from the database.
• End Users:
• End users interact with the database through application
programs or direct queries.
• They are typically divided into two categories:
• Naive Users:
• Characteristics:
• These users have no technical knowledge of the database or DBMS.
• Role:
• They interact with the database through simple user interfaces provided by
application programs.
• They perform tasks like data entry, update, and retrieval using predefined
commands or menus.
• Example: A data entry operator who enters records using forms or
buttons without understanding how the database operates behind the
scenes.
• Sophisticated Users:
• Characteristics:
• These users are familiar with the database structure and DBMS features.
• Role:
• They can use more advanced tools, such as query languages (e.g., SQL),
to interact with the database directly.
• Some sophisticated users may even write application programs to meet
specific needs.
• Example:
• A business analyst who uses SQL queries to retrieve and
analyze data from the database.
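As an illustration of the kind of query a sophisticated user might write directly, the sketch below summarises sales per region. SQLite via Python's `sqlite3` module, the `sale` table, and its figures are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?)",
    [("North", 100), ("South", 250), ("North", 300), ("South", 50)],
)
conn.commit()

# An ad hoc analytical query: number of sales and total amount per region.
summary = conn.execute(
    "SELECT region, COUNT(*), SUM(amount)"
    " FROM sale GROUP BY region ORDER BY region"
).fetchall()
print(summary)
```

A naive user would reach the same figures only if an application program offered a ready-made report for them.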
• Database Administrator (DBA):
• Role:
• The DBA is the most technical user in the database environment. They
are responsible for overseeing the entire database system, ensuring its
efficient operation, security, and integrity.
• Responsibilities of a Database Administrator:
• Installation:
• Installing and configuring the DBMS software.
• Monitoring:
• Continuously monitoring the performance of the database system to
ensure optimal operation.
• Problem Resolution:
• Addressing and solving any issues that arise in the database system.
• User Management:
• Assigning permissions and access rights to various users, ensuring
that only authorized individuals can perform certain operations.
• Backup and Recovery:
• Regularly backing up the database to prevent data loss and
restoring the system in case of failure or crash.
• Database Maintenance:
• Designing, creating, and maintaining the database schema and
ensuring its integrity through checks and constraints.
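The backup-and-recovery duty can be sketched with the `Connection.backup` method of Python's `sqlite3` module (available since Python 3.7). Using SQLite and in-memory databases is an assumption for illustration. A real DBA would write backups to separate storage on a schedule.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
live.execute("INSERT INTO student VALUES (1, 'Ali')")
live.commit()

# Backup: copy the whole live database into a second database.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulate data loss that happens after the backup was taken.
live.execute("DELETE FROM student")
live.commit()
lost = len(live.execute("SELECT roll_no FROM student").fetchall())

# Recovery: copy the backup over the damaged database.
backup.backup(live)
restored = live.execute("SELECT roll_no, name FROM student").fetchall()

print(lost, restored)
```

Anything written after the last backup would not come back this way, which is why backups are taken regularly.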
History of Database Systems
• The history of database systems spans several decades of
development, with key milestones shaping the modern database
landscape.
• 1960s: The Emergence of Database Models
• Cost-Effective Computing:
• In the 1960s, computers became more affordable, and their storage capabilities
increased, enabling businesses to store and manage data more effectively.
• Data Models Developed:
• Network Model (CODASYL): This model used pointers for data access and
required users to understand the physical structure of the database to query
information.
• Hierarchical Model (IMS): Similar to the network model, data was organized in a
tree structure, and access involved navigating through these hierarchical
relationships.
• Major Commercial Success:
• The SABRE airline reservation system (developed by IBM for American
Airlines) was a notable commercial success of this era.
• 1970-1972: The Relational Model
• E.F. Codd's Contribution:
• In 1970, E.F. Codd proposed the relational model in his seminal
paper, which disconnected the schema (logical organization) from
physical storage methods. This approach has since become the
standard for modern databases.
• 1970s: Prototypes for Relational Systems
• Relational Database Prototypes Developed (1974-77):
• Ingres (UCB): Developed at the University of California, Berkeley, Ingres
led to systems like Sybase, MS SQL Server, and others. It used the query
language QUEL.
• System R (IBM): Developed by IBM at its San Jose facility, System R
eventually led to SQL/DS, DB2, and Oracle, and used the query language
SEQUEL.
• 1976: The Entity-Relationship (ER) Model
• P. Chen's ER Model:
• In 1976, P. Chen introduced the Entity-Relationship (ER) model, which
helped designers focus on conceptual data modeling instead of the
logical table structure.
• Early 1980s: Commercialization of Relational Databases
• Relational Systems Take Off:
• The 1980s saw the commercialization of relational database systems,
marking a significant shift in the database landscape.
• Mid 1980s: Standardization of SQL
• SQL Becomes a Standard:
• By the mid-1980s, SQL (Structured Query Language) became the
standard for relational databases. IBM DB2 was launched, and the
importance of network and hierarchical models diminished.
• Emerging DB Products:
• Products such as RIM, RBASE 5000, and Paradox emerged, offering
database solutions.
• Early 1990s: Client-Server Architecture and ODBMS
• Client-Server Model:
• The 1990s saw the rise of the client-server model, where
applications and databases were separated. This became the norm
for many business applications.
• Object Database Management Systems (ODBMS):
• Work began on prototypes for ODBMS, which integrated object-oriented
programming with database management.
• Mid 1990s: The Internet and Web-DB Integration
• Internet Access to Databases:
• With the spread of the World Wide Web in the mid-1990s,
remote database access became feasible. The concept of Web/DB
integration began to grow.
• Web/Internet-Database Connectors:
• Tools like Active Server Pages, Java Servlets, JDBC, and ColdFusion
facilitated the connection of web applications to databases.
• Late 1990s: The Rise of OLTP and Open Source
• OLTP and OLAP Growth:
• Online Transaction Processing (OLTP) and Online Analytical
Processing (OLAP) matured, with many businesses adopting point-
of-sale (POS) systems.
• Open-Source Solutions:
• Open-source database systems like MySQL and PostgreSQL gained
popularity, contributing to the democratization of database
technologies.
Early 21st Century
• Interactive Applications and Mobile Databases
• Growth of Interactive Applications:
• In the early 2000s, interactive applications grew in importance,
especially with the proliferation of PDAs and other mobile devices.
• Dominance of Major DB Companies:
• IBM, Microsoft, and Oracle continued to dominate the large database market.
Future Trends
• Big Data:
• With the emergence of terabyte-scale systems, managing and analyzing large
volumes of data has become more complex. Fields like geology, national
security, and space exploration are developing large-scale databases.
• Data Mining and Warehousing:
• Techniques like data mining, data warehousing, and data marts are commonly
used for handling large data sets.
• XML and Java Integration:
• Technologies such as XML and Java are increasingly being integrated into
databases for handling structured and unstructured data.
• Mobile Databases:
• Mobile database applications are becoming more common, and distributed
transaction processing is gaining popularity for business planning.