Chapter 1 - Introduction To Database Notes
UNIT-I
The Database Environment: Basic Concepts and Definitions: Data, Information, Metadata,
Database, DBMS. Traditional File Processing Systems, the Database Approach, Advantages
of Database Management System, Components of Database Environment, Types of
Databases, Risks and Costs of Database.
Data:
• Data is defined as a collection of raw facts about a place, person, thing or object
involved in the transactions of an organization.
• Data can be represented in various forms such as text, numbers, images, audio, video,
graphs and document files.
• Data constitutes the building blocks of information.
• Data is one of the important assets of a modern business.
• Data becomes relevant based on its context.
Information:
• Information can be defined as processed data that increases the knowledge of the end user.
• Information is used to reveal the meaning of data.
• Good, accurate and timely information is used in decision making.
• The quality of the data influences the quality of the information.
• Information can be presented in tabular form, as a bar graph or as an image.
Metadata:
• Metadata is special data that describes the characteristics or properties of other
data.
• Metadata consists of items such as name, data type, length, minimum and maximum
values, description and special constraints.
• Metadata allows database designers and users to understand what data exists and
what the data means.
• Metadata is generally stored in a repository.
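As a minimal sketch of the idea, the Python snippet below uses the standard sqlite3 module to create a table and then read back the metadata that the DBMS keeps about it. The customer table and its column names are hypothetical and serve only as an illustration.

```python
import sqlite3

# In-memory database; the customer table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        credit_limit  REAL CHECK (credit_limit >= 0)
    )
    """
)

# PRAGMA table_info returns one row of metadata per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(customer)").fetchall()
for cid, name, col_type, notnull, default, pk in columns:
    print(f"{name}: type={col_type}, required={bool(notnull)}, key={bool(pk)}")
```

The rows printed here describe the data (names, types, constraints) rather than holding any customer facts, which is the distinction between metadata and data.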
DBMS:
• A database management system can be defined as an organized collection of logically
related data and a set of programs used for creating, storing, updating and retrieving data
from the database.
• The DBMS acts as a mediator between the end user and the database.
• A database management system (DBMS) can also be defined as a collection of programs
that manages the database structure and controls access to the data.
• The DBMS enables data to be shared.
• The DBMS integrates many users’ views of the data.
Data warehouse: An organisation often needs to build a separate database that contains
historical and summarized information. Such a database is usually called a data warehouse,
or in some cases a data mart.
Analysts need specialised decision support tools to query and analyse the database. One class
of tools used for this purpose is called online analytical processing (OLAP) tools.
Historical Roots: Files and File Systems
• File systems were typically composed of a collection of file folders, each tagged and kept in
a cabinet.
• The contents of each file folder were logically related.
• Computerized file systems are software that manages the data of the organization.
• Data processing (DP) specialists developed computerized file systems.
• Each file used its own application program to store, retrieve and modify data.
• Each file was owned by the individual or department that commissioned its creation.
b. Duplication of data:
Because applications are often developed independently in file processing systems,
unplanned duplicate data files are the rule rather than the exception. This duplication is
wasteful because it requires additional storage space and increased effort to keep all files up
to date. Duplicate data files often result in loss of data integrity because the data formats may
be inconsistent or the data values may not agree. For example, the same data item may have
different names in different files.
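The problem can be sketched in a few lines of Python. The two "files" below are hypothetical departmental records of the same customer; the field names, the customer and the cities are invented for illustration. The same item has different names in each file, and the stored values no longer agree.

```python
# Two hypothetical departmental files holding the same customer.
# Note the different field names for the same data items.
sales_file = [{"cust_no": 101, "cust_name": "Acme Ltd", "city": "Pune"}]
accounting_file = [{"customer_id": 101, "name": "Acme Ltd", "city": "Mumbai"}]


def find_conflicts(sales, accounting):
    """Return (customer id, sales city, accounting city) where the files disagree."""
    acct_by_id = {row["customer_id"]: row for row in accounting}
    conflicts = []
    for row in sales:
        match = acct_by_id.get(row["cust_no"])
        if match is not None and match["city"] != row["city"]:
            conflicts.append((row["cust_no"], row["city"], match["city"]))
    return conflicts


conflicts = find_conflicts(sales_file, accounting_file)
print(conflicts)
```

Every program that combines the two files has to know both naming schemes and decide which value to trust, which is the extra effort the paragraph above describes.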
c. Limited data sharing:
With the traditional file processing approach, each application has its own private files and
users have little opportunity to share data outside their own applications. It is often
frustrating to managers to find that a requested report will require a major programming
effort to obtain data from several incompatible files in separate systems. Data are scattered
in various files, and the files may be in different formats. Writing a new application program
to retrieve the data is difficult.
Database Systems
• A database system consists of logically related data stored in a single logical data
repository.
• The database may be physically distributed among multiple storage facilities.
• The DBMS eliminates most of the file system’s problems.
• The current generation of DBMS software stores data structures, the relationships between
structures, and access paths. It also defines, stores and manages all access paths and components.
[Figure: the database approach, showing the Sales, Accounting and Personnel departments, applications, the DBMS, and a database holding metadata]
Data previously stored in separate files have been integrated into a single database structure.
The DBMS provides the interface between the various database applications for the
organizational users and the database. The database approach emphasises the integration
and sharing of data throughout the organisation.
[Figure: enterprise data model with the entities CUSTOMER, ORDER, ORDER LINE and PRODUCT]
Pine Valley Furniture Company’s first step in converting to a database approach was to
develop a list of the high-level entities that support the business activities of the
organisation.
An entity is an object that is important to the business. Some of the high-level entities
identified at Pine Valley Furniture Company are the following:
• CUSTOMER: People and organisations that buy products from Pine Valley Furniture.
• ORDER: The purchase of one or more products by a customer.
• PRODUCT: The items Pine Valley Furniture Company makes and sells.
• ORDER LINE: Details about each product sold on a particular customer order.
After these entities were identified and defined, the company proceeded to develop an
enterprise data model. An enterprise data model is a graphical model that shows the
high-level entities for the organisation and the associations among those entities.
The results of the preliminary studies convinced management of the potential advantages
of the database approach.
Relational Databases:
The company decided to implement a modern relational database management system that
views all data in the form of tables. The four entities represented in the enterprise data model
are converted into tables, where each column of a table represents an attribute.
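The conversion of the four entities into tables might look roughly like the following sketch, written in Python with the standard sqlite3 module. The column names and data types are assumptions made for illustration; the notes do not list the actual attributes used by Pine Valley Furniture. The ORDER entity is stored as customer_order because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per entity; each column represents an attribute.
# All column names below are hypothetical.
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT
    );
    CREATE TABLE product (
        product_id  INTEGER PRIMARY KEY,
        description TEXT NOT NULL,
        unit_price  REAL NOT NULL
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        order_date  TEXT NOT NULL,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
    );
    CREATE TABLE order_line (
        order_id    INTEGER NOT NULL REFERENCES customer_order(order_id),
        product_id  INTEGER NOT NULL REFERENCES product(product_id),
        quantity    INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );
    """
)

# List the tables the DBMS now knows about.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```

The REFERENCES clauses record the associations among the entities: an order belongs to one customer, and each order line ties one order to one product.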
Eliminating data redundancy reduces the opportunity for inconsistency.
For example, if a customer’s address is stored only once, we cannot have
disagreement on the stored values. We avoid the wasted storage space that results from
redundant data storage.
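The address example can be sketched as follows in Python with the standard sqlite3 module. The table layout, customer and addresses are hypothetical. The address is stored once in the customer table, so a single update is seen by every order that refers to that customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        address     TEXT
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id)
    );
    INSERT INTO customer VALUES (1, 'Acme Ltd', '12 Old Road');
    INSERT INTO customer_order VALUES (1001, 1), (1002, 1);
    """
)

# One update to the single stored copy of the address.
conn.execute(
    "UPDATE customer SET address = ? WHERE customer_id = ?",
    ("5 New Street", 1),
)

# Every order picks up the address through a join, so the values cannot disagree.
addresses = conn.execute(
    """
    SELECT o.order_id, c.address
    FROM customer_order AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
    """
).fetchall()
print(addresses)
```

Had the address been copied into each order record, the update would have needed to touch every copy, and any copy that was missed would hold a conflicting value.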
Enforcement of Standards:
When the database approach is implemented with full management support, the database
administration function should be granted a single point of authority and responsibility for
establishing and enforcing data standards. These standards include naming conventions,
data quality standards, and uniform procedures for accessing, updating and protecting the data.
The data repository provides database administrators with a powerful set of tools for developing
and enforcing these standards. The DBMS provides an easy-to-use query language that allows
users to get an immediate response to their queries rather than having to rely on a specialist
programmer to write queries for them.
The major components of a typical database environment and their relationships are
shown below:
1. Computer-aided software engineering (CASE) tools: Automated tools are used to
design the database and application programs.
5. Application Programs: Computer programs that are used to create and maintain the
database and provide information to users.
6. User Interface: Languages, Menus and other facilities by which users interact with
various system components such as CASE tools, application programs, the DBMS, and
the repository.
8. System Developers: Persons such as system analysts and programmers who design
new application programs. System developers often use CASE tools for
requirements analysis and program design.
9. End Users: Persons throughout the organization who add, delete and modify data in
the database and who request or receive information from it. All user interactions
with the database must be routed through the DBMS.
EVOLUTION OF DATABASES
1960’s
File processing systems were still dominant during this period. The first database management
systems were introduced during this decade and were used for large and complex ventures
such as the Apollo moon landing project. The first efforts at standardization were made
with the formation of the Data Base Task Group in the late 1960’s.
[Figure: traditional files]
1970’s:
During this decade the use of database management systems became a commercial reality.
The hierarchical and network database management systems were developed largely to
cope with increasingly complex data structures, such as manufacturing bills of materials,
that were extremely difficult to manage with conventional file processing methods. The
network and hierarchical models are generally called first-generation DBMS.
Major Disadvantages:
[Figure: hierarchical and network models]
1980’s:
To overcome the above limitations, E. F. Codd and others developed the relational data model
during the 1970’s. The model, considered second-generation DBMS, received widespread
commercial acceptance and diffusion during the 1980’s. With the relational model,
all the data are represented in the form of tables. A relatively simple fourth-generation
language called SQL (Structured Query Language) is used for data retrieval.
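A minimal sketch of data retrieval with SQL is given below, run from Python through the standard sqlite3 module. The product table and its rows are invented for illustration. The query states which rows are wanted, and the DBMS decides how to fetch them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical product table with a few sample rows.
conn.executescript(
    """
    CREATE TABLE product (
        product_id  INTEGER PRIMARY KEY,
        description TEXT,
        unit_price  REAL
    );
    INSERT INTO product VALUES
        (1, 'End table', 175.0),
        (2, 'Coffee table', 200.0),
        (3, 'Bookcase', 650.0);
    """
)

# Retrieve products priced below a limit, cheapest first.
rows = conn.execute(
    "SELECT description, unit_price FROM product "
    "WHERE unit_price < ? ORDER BY unit_price",
    (300.0,),
).fetchall()
print(rows)
```

No file layout or access path appears in the query, in contrast to a file processing program, which would have to open the file and scan its records itself.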
[Figure: relational model]
1990’s:
This decade started a new era of computing, first with client/server computing and then
with Internet applications, which became increasingly important. To cope with increasingly complex
data, object-oriented databases had been introduced during the late 1980’s. Since organizations
must manage a vast amount of both structured and unstructured data, both relational
and object-oriented databases are of great importance today.
[Figure: object-oriented and object-relational models]
2000 and Beyond:
1. The ability to manage increasingly complex data types. These types include
multidimensional data, which has already assumed importance in data warehouse
applications.
2. The continued development of ‘universal servers’ based on object-relational DBMS.
These are database servers that can manage a wide range of data types transparently
to users.
3. Fully distributed databases will become a reality as an organisation will be able to
physically distribute its databases to multiple locations and update them
automatically.
4. Content-addressable storage will become more popular. For example, a user can scan
a photograph and have the computer search for the closest match to that photo.
5. Database and other technologies, such as artificial intelligence and television-like
information services, will make database access much easier for untrained users.
TYPES OF DATABASES
Conversion Costs: The term legacy system is widely used to refer to older applications in
an organization that are based on file processing and/or older database technology. The
cost of converting these older systems to modern systems, in terms of dollars, time and
organizational commitment, may often seem prohibitive to an organization.
Need for Explicit Backup and Recovery: A shared corporate database must be accurate
and available at all times. This requires that comprehensive procedures be developed and
used for providing backup copies of data and for restoring the database when damage occurs.