Chapter 1 - Introduction To Database Notes

The document discusses the database environment and the advantages of database management systems over traditional file processing systems. It defines key concepts like data, information, metadata, databases, and DBMS. It describes the components of the database environment and risks of databases. It then summarizes some disadvantages of file processing systems like program-data dependence, data duplication, limited data sharing, lengthy development times, and excessive maintenance. It introduces database systems and how they address these issues through data integration and a centralized DBMS interface.

Uploaded by

Surendra
Copyright
© All Rights Reserved

DATABASE MANAGEMENT SYSTEM

B.Tech (Edu) - II Semester

UNIT-I

The Database Environment: Basic Concepts and Definitions: Data, Information, Metadata,
Database, DBMS. Traditional File Processing Systems, the Database Approach, Advantages
of Database Management System, Components of Database Environment, Types of
Databases, Risks and Costs of Database.

THE DATABASE ENVIRONMENT

Data:
• Data is a collection of raw facts about a place, person, thing, or object involved in the
transactions of an organization.
• Data can be represented in various forms, such as text, numbers, images, audio, video,
graphs, and document files.
• Data constitutes the building blocks of information.
• Data is one of the most important assets of a modern business.
• Data becomes relevant based on its context.

Information:
• Information is processed data that increases the knowledge of the end user.
• Information reveals the meaning of data.
• Good, accurate, and timely information is used in decision making.
• The quality of the data influences the quality of the information.
• Information can be presented in forms such as a table, a bar graph, or an image.

Metadata:
• Metadata is data that describes the characteristics or properties of other data.
• Metadata typically consists of the name, data type, length, minimum and maximum
values, description, and any special constraints of a data item.
• Metadata allows database designers and users to understand what data exists and
what the data means.
• Metadata is generally stored in a repository.

Example of metadata:

Name      Type          Length  Description
Course    alphanumeric  30      course name
Section   integer       1       section number
Semester  alphanumeric  10      semester and year
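
One way to see metadata in practice is to ask a DBMS to describe a table. The sketch below uses Python's built-in sqlite3 module as a stand-in DBMS. The table name class_offering and the in-memory database are assumptions made for illustration. SQLite records the declared type names shown here but does not enforce the lengths.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")

# Hypothetical table whose columns mirror the metadata example above.
conn.execute(
    """
    CREATE TABLE class_offering (
        course   VARCHAR(30),   -- course name
        section  INTEGER,       -- section number
        semester VARCHAR(10)    -- semester and year
    )
    """
)

# PRAGMA table_info returns one row of metadata per column:
# (cid, name, type, notnull, dflt_value, pk)
metadata = [
    (row[1], row[2])
    for row in conn.execute("PRAGMA table_info(class_offering)")
]

for name, declared_type in metadata:
    print(f"{name:<10}{declared_type}")
```

The rows returned describe the data (names and types) rather than holding any course data, which is the distinction the notes draw between metadata and data.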

Database:

• A database is an organized collection of logically related data.
• A database can be of any size and complexity.
• Data are structured so that they can be easily stored, manipulated, and retrieved by users.
• Example: a salesperson can keep customer contacts on a laptop in a database of only a
few megabytes, while a large company can store data about all of its activities in a
database that supports decision making.

DBMS:
• A database management system (DBMS) can be defined as an organized collection of
logically related data together with a set of programs used for creating, storing, updating,
and retrieving data from the database.
• A DBMS can also be defined as a collection of programs that manages the database
structure and controls access to the data.
• A DBMS acts as a mediator between the end user and the database.
• A DBMS enables data to be shared.
• A DBMS integrates many users' views of the data.
Repository vs. Database: A repository is a centralized storehouse for all data definitions,
data relationships, and other system components, while a database is an organized
collection of logically related data.

Data warehouse: An organisation often needs to build a separate database that contains
historical and summarized information. Such a database is usually called a data warehouse
or, in some cases, a data mart. Analysts need specialised decision support tools to query and
analyse this database. One class of tools used for this purpose is online analytical
processing (OLAP) tools.

Historical Roots: Files and File Systems

• Manual file systems were typically composed of a collection of file folders, each tagged
and kept in a cabinet.
• The contents of each file folder were logically related.
• Computerized file systems are software that manages the data of the organization.
• Data processing (DP) specialists developed the computerized file systems.
• Each file used its own application program to store, retrieve, and modify data.
• Each file was owned by the individual or department that commissioned its creation.

Disadvantages of file processing systems

a. Program-data dependence:
File descriptions are stored within each application program that accesses a given file. As a
consequence, any change to a file structure requires changes to the file descriptions in all
programs that access the file.
Suppose it is decided to change the customer address field length in the records of a file from
30 to 40 characters. The file descriptions in each affected program would have to be
modified. It is often difficult to locate all the programs affected by such changes.
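
The address-length scenario above can be sketched in a few lines of Python. The record layout (a 20-character name, the address, and a 10-character phone number) and the function names are assumptions made for illustration. The point is only that the reading program carries its own copy of the file description.

```python
NAME_LEN = 20
PHONE_LEN = 10

def make_record(name, address, phone, address_len):
    """Build one fixed-width record; address_len is the file's address width."""
    return f"{name:<{NAME_LEN}}{address:<{address_len}}{phone:<{PHONE_LEN}}"

def billing_read(record):
    """A program with the file description hard-coded: address is 30 characters."""
    return {
        "name": record[0:20].strip(),
        "address": record[20:50].strip(),
        "phone": record[50:60].strip(),
    }

old_record = make_record("Asha Rao", "12 Lake Road", "5550100", address_len=30)
new_record = make_record("Asha Rao", "12 Lake Road", "5550100", address_len=40)

# Under the old layout the phone number is read correctly.
print(billing_read(old_record)["phone"])
# Under the new layout the phone field has moved, so the program misreads it.
print(repr(billing_read(new_record)["phone"]))
```

Nothing in billing_read fails loudly when the file changes. It simply returns the wrong slice, which is why every program that embeds the layout has to be found and edited.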

b. Duplication of data:
Because applications are often developed independently in file processing systems,
unplanned duplicate data files are the rule rather than the exception. This duplication is
wasteful because it requires additional storage space and increased effort to keep all files up
to date. Duplicate data files often result in a loss of data integrity because the data formats
may be inconsistent or the data values may not agree. For example, the same data item may
have different names in different files.
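
As a small illustration of how duplicated data drifts apart, the sketch below models two departmental files as Python lists. The field names (cust_no, cust_addr, customer_id, address) and the sample values are hypothetical. They are chosen to show the same data item stored under different names with values that no longer agree.

```python
# Hypothetical records kept separately by two departments.
sales_file = [{"cust_no": 101, "cust_addr": "12 Lake Road"}]
accounting_file = [{"customer_id": 101, "address": "45 Hill Street"}]

def find_mismatches(sales, accounting):
    """Return customer numbers whose address differs between the two files."""
    accounting_by_id = {rec["customer_id"]: rec["address"] for rec in accounting}
    return [
        rec["cust_no"]
        for rec in sales
        if rec["cust_no"] in accounting_by_id
        and accounting_by_id[rec["cust_no"]] != rec["cust_addr"]
    ]

mismatches = find_mismatches(sales_file, accounting_file)
print(mismatches)
```

The comparison code has to know both naming schemes, and neither file can say which address is the correct one.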

c. Limited data sharing:
With the traditional file processing approach, each application has its own private files, and
users have little opportunity to share data outside their own applications. It is often
frustrating for managers to find that a requested report will require a major programming
effort to obtain data from several incompatible files in separate systems. Data are scattered
across various files, and the files may be in different formats, so writing a new application
program to retrieve the data is difficult.

d. Lengthy development times:
With the traditional file processing approach, there is little opportunity to leverage
previous development efforts. Each new application requires that the developer essentially
start from scratch by designing new file formats and descriptions. The lengthy development
times required are often inconsistent with today's fast-paced business environment.

e. Excessive program maintenance:
The preceding factors combine to create a heavy program maintenance load in
organizations that rely on traditional file processing systems. As much as 80% of the total
information systems development budget may be devoted to program maintenance in such
organizations. This leaves little opportunity for developing new applications.

Database Systems

• A database system consists of logically related data stored in a single logical data
repository.
• A database system may be physically distributed among multiple storage facilities.
• The DBMS eliminates most of the file system's problems.
• The current generation of DBMS software stores not only the data structures but also the
relationships between those structures and the access paths to them. It also defines,
stores, and manages all access paths and components.

Figure: The Sales, Accounting, and Personnel departments use applications that reach a
single database, and its metadata, through the DBMS.

Data previously stored in separate files have been integrated into a single database structure.
The DBMS provides the interface between the various database applications for the
organizational users and the database. The database approach emphasises the integration
and sharing of data throughout the organisation.

Enterprise Data Model:

Figure: Segment from the enterprise data model, showing the entities CUSTOMER, ORDER,
ORDER LINE, and PRODUCT.

Pine Valley Furniture Company's first step in converting to a database approach was to
develop a list of the high-level entities that support the business activities of the
organisation.

An entity is an object that is important to the business. Some of the high-level entities
identified at Pine Valley Furniture Company are the following:

• CUSTOMER: People and organisations that buy products from Pine Valley Furniture.
• ORDER: The purchase of one or more products by a customer.
• PRODUCT: The items Pine Valley Furniture Company makes and sells.
• ORDER LINE: Details about each product sold on a particular customer order.

After these entities were identified and defined, the company proceeded to develop an
enterprise data model. An enterprise data model is a graphical model that shows the
high-level entities for the organisation and the associations among those entities.

The three associations (called relationships) shown in the figure capture three fundamental
business rules:
1. Each CUSTOMER places any number of ORDERs.
   Each ORDER is placed by exactly one CUSTOMER.
2. Each ORDER contains any number of ORDER LINEs.
   Each ORDER LINE is contained in exactly one ORDER.
3. Each PRODUCT has any number of ORDER LINEs.
   Each ORDER LINE is for exactly one PRODUCT.

The results of the preliminary studies convinced management of the potential advantages
of the database approach.

Relational Databases:
The company decided to implement a modern relational database management system that
views all data in the form of tables. The four entities represented in the enterprise data
model are converted into tables, where each column of a table represents an attribute of the
entity.
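
A minimal sketch of how the four entities and three business rules above might be expressed as tables, using Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions made for illustration and are not the schema used by the company in the notes. ORDER is written as customer_order because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE product (
        product_id  INTEGER PRIMARY KEY,
        description TEXT NOT NULL
    );
    -- Rule 1: each order is placed by exactly one customer.
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
    );
    -- Rules 2 and 3: each order line belongs to one order and one product.
    CREATE TABLE order_line (
        order_id    INTEGER NOT NULL REFERENCES customer_order(order_id),
        product_id  INTEGER NOT NULL REFERENCES product(product_id),
        quantity    INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );
    """
)

conn.execute("INSERT INTO customer VALUES (1, 'Sample Customer')")
conn.execute("INSERT INTO customer_order VALUES (10, 1)")

# An order that points at a customer who does not exist breaks rule 1.
try:
    conn.execute("INSERT INTO customer_order VALUES (11, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

The "exactly one" half of each rule becomes a NOT NULL foreign key, while the "any number of" half needs no declaration because many rows may share the same key value.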

ADVANTAGES OF THE DATABASE MANAGEMENT SYSTEM

Program-Data Independence:
The separation of data descriptions from the application programs that use the data is called
data independence. The data descriptions are stored in a central location called the
repository.
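
The following sketch illustrates the idea with Python's built-in sqlite3 module. The customer table, the added phone column, and the mailing_labels function are hypothetical. The program asks for columns by name, so a change to the stored data description does not force a change to the program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
conn.execute(
    "INSERT INTO customer (name, address) VALUES ('Asha Rao', '12 Lake Road')"
)

def mailing_labels(connection):
    """Application code: names the columns it needs, not their position or width."""
    return connection.execute("SELECT name, address FROM customer").fetchall()

before = mailing_labels(conn)

# The data description changes: a new column is added to the table.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")

after = mailing_labels(conn)
print(before == after)
```

The description of the table lives in the DBMS, so mailing_labels is untouched by the change, unlike a file processing program that embeds the record layout.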

Minimal Data Redundancy:
The design goal of the database approach is that previously separate data files are
integrated into a single logical structure. Each primary fact is recorded in only one place in
the database. The database approach does not eliminate redundancy entirely, but it allows
the designer to carefully control the type and amount of redundancy.

Improved Data Consistency:
Eliminating or controlling data redundancy greatly reduces the opportunities for
inconsistency. For example, if a customer's address is stored only once, there can be no
disagreement about the stored value. We also avoid the wasted storage space that results
from redundant data storage.

Improved Data Sharing:
A database is designed as a shared corporate resource. A user view is a logical description
of some portion of the database that is required by a user to perform some task. A major
advantage of the database approach is that it greatly reduces the cost and time of
developing new business applications.
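
A user view can be sketched with an SQL view, shown here through Python's built-in sqlite3 module. The employee table, the staff_directory view, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee (name, department, salary) VALUES (?, ?, ?)",
    [("Asha", "Sales", 52000.0), ("Ravi", "Accounting", 61000.0)],
)

# A user view for a staff directory: it exposes names and departments
# from the shared table but leaves out salaries.
conn.execute(
    "CREATE VIEW staff_directory AS SELECT name, department FROM employee"
)

directory = conn.execute(
    "SELECT * FROM staff_directory ORDER BY name"
).fetchall()
print(directory)
```

Both the directory user and a payroll user work from the same stored rows, and each sees only the portion of the database needed for the task.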

Enforcement of Standards:
When the database approach is implemented with full management support, the database
administration function should be granted a single point of authority and responsibility for
establishing and enforcing data standards. These standards include naming conventions,
data quality standards, and uniform procedures for accessing, updating, and protecting the
data. The data repository provides database administrators with a powerful set of tools for
developing and enforcing these standards. The DBMS also provides an easy-to-use query
language that allows users to get an immediate response to their queries rather than having
to rely on a specialist programmer to write the queries for them.

Improved Data Quality:
Concern with poor data quality is a common theme in database administration today. Two
important tools for improving data quality are:
1. Database designers can specify integrity constraints that are enforced by the DBMS.
A constraint is a rule that cannot be violated by database users.
2. One of the objectives of a data warehouse environment is to clean up operational data
before they are placed in the data warehouse.
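
The first tool, an integrity constraint enforced by the DBMS, can be sketched with Python's built-in sqlite3 module. The product table and the rule that a unit price must be positive are hypothetical examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        product_id  INTEGER PRIMARY KEY,
        description TEXT NOT NULL,
        unit_price  REAL NOT NULL CHECK (unit_price > 0)
    )
    """
)

# A row that satisfies the constraint is accepted.
conn.execute("INSERT INTO product VALUES (1, 'Oak desk', 450.0)")

# A row that violates the constraint is refused by the DBMS itself.
try:
    conn.execute("INSERT INTO product VALUES (2, 'Pine chair', -20.0)")
    violation_blocked = False
except sqlite3.IntegrityError:
    violation_blocked = True

row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
print(violation_blocked, row_count)
```

Because the rule is declared once in the table definition, every application that writes to the table is subject to it without repeating the check in its own code.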

COMPONENTS OF THE DATABASE ENVIRONMENT

The major components of a typical database environment and their relationships are
shown below:

1. Computer-aided software engineering (CASE) tools: Automated tools used to design
databases and application programs.

2. Repository: A centralized storehouse for all data definitions, data relationships,
screen and report formats, and other system components. A repository contains an
extended set of metadata important for managing databases as well as other
components of an information system.

3. Database Management System (DBMS): Software that is used to define, create,
maintain, and provide controlled access to the database and also to the repository.

4. Database: An organized collection of logically related data, usually designed to meet
the information needs of multiple users in an organization. It is important to
distinguish between the database and the repository. The repository contains the
definitions of data, whereas the database contains the occurrences of data.

5. Application Programs: Computer programs that are used to create and maintain the
database and provide information to users.

6. User Interface: Languages, menus, and other facilities by which users interact with
various system components such as CASE tools, application programs, the DBMS, and
the repository.

7. Data Administrators: Persons responsible for the overall information resources of
an organization. Data administrators use CASE tools to improve the productivity of
database planning and design.

8. System Developers: Persons such as systems analysts and programmers who design
new application programs. System developers often use CASE tools for requirements
analysis and program design.

9. End Users: Persons throughout the organization who add, delete, and modify data in
the database and who request or receive information from it. All user interactions
with the database must be routed through the DBMS.

EVOLUTION OF DATABASES

1960's:
File processing systems were still dominant during this period. The first database
management systems were introduced during this decade and were used for large and
complex ventures such as the Apollo moon landing project. The first efforts at
standardization began with the formation of the Data Base Task Group in the late 1960's.

Figure: Traditional files

1970's:
During this decade the use of database management systems became a commercial reality.
The hierarchical and network database management systems were developed largely to
cope with increasingly complex data structures, such as manufacturing bills of materials,
that were extremely difficult to manage with conventional file processing methods. The
network and hierarchical models are generally referred to as first-generation DBMS.
Major disadvantages:
1. Difficult access to data, based on navigational record-at-a-time procedures.
2. Very limited data independence, so that programs are not insulated from changes to
data formats.
3. No widely accepted theoretical foundation for either model, unlike the relational data
model.

Figure: Hierarchical and network models

1980's:
To overcome the above limitations, E. F. Codd and others developed the relational data
model during the 1970's. This model, considered the second-generation DBMS, received
widespread commercial acceptance and diffusion during the 1980's. With the relational
model, all data are represented in the form of tables. A relatively simple fourth-generation
language called SQL (Structured Query Language) is used for data retrieval.
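
The contrast between record-at-a-time access and SQL retrieval can be sketched as follows, using Python's built-in sqlite3 module. The orders table and its rows are hypothetical, and the loop only imitates the navigational style. It is not how a hierarchical or network DBMS was actually programmed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("Asha", 120.0), ("Ravi", 80.0), ("Asha", 45.5)],
)

# Record-at-a-time style: the program visits every record and decides what to keep.
total_navigational = 0.0
for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
    if customer == "Asha":
        total_navigational += amount

# Relational style: one declarative statement describes the result wanted.
total_sql = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer = ?", ("Asha",)
).fetchone()[0]

print(total_navigational, total_sql)
```

Both approaches give the same total, but the SQL version states what is wanted and leaves the record-by-record work to the DBMS.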

Figure: Relational model

1990's:
This decade ushered in a new era of computing, first with client/server computing and then
with Internet applications, which became increasingly important. To cope with increasingly
complex data, object-oriented databases were introduced during the late 1980's. Since
organizations must manage vast amounts of both structured and unstructured data, both
relational and object-oriented databases are of great importance today.

Figure: Object-oriented and object-relational models

2000 and Beyond:
1. The ability to manage increasingly complex data types. These types include
multidimensional data, which have already assumed importance in data warehouse
applications.
2. The continued development of "universal servers" based on object-relational DBMS.
These are database servers that can manage a wide range of data types transparently
to users.
3. Fully distributed databases will become a reality, as an organisation will be able to
physically distribute its databases to multiple locations and update them
automatically.
4. Content-addressable storage will become more popular. For example, a user can scan
a photograph and have the computer search for the closest match to that photo.
5. Database and other technologies, such as artificial intelligence and television-like
information services, will make database access much easier for untrained users.

TYPES OF DATABASES

• Databases can be classified according to:
– Number of users
– Database location(s)
– Expected type and extent of use
• Single-user database supports only one user at a time
– Desktop database: single-user; runs on PC
• Multiuser database supports multiple users at the same time
– Workgroup and enterprise databases
• Centralized database: data located at a single site
• Distributed database: data distributed across several different sites
• Operational database: supports a company’s day-to-day operations
– Transactional or production database
• Data warehouse: stores data used for tactical or strategic decisions

RISKS AND COSTS OF THE DATABASE APPROACH

New, Specialized Personnel:
Frequently, organizations that adopt the database approach need to hire or train individuals
to design and implement databases, provide database administration services, and manage
a staff of new people. The organization should not minimize the need for these specialized
skills, which are required to obtain the most from the potential benefits.

Installation and Management Cost and Complexity:
A multiuser database management system is a large and complex suite of software that has
a high initial cost, requires a staff of trained personnel to install and operate, and has
substantial annual maintenance and support costs. Installing such a system may also
require upgrades to the hardware and data communications systems in the organization.

Conversion Costs: The term legacy system is widely used to refer to older applications in
an organization that are based on file processing and/or older database technology. The
cost of converting these older systems to modern systems, in terms of dollars, time, and
organizational commitment, may often seem prohibitive to an organization.

Need for Explicit Backup and Recovery: A shared corporate database must be accurate
and available at all times. This requires that comprehensive procedures be developed and
used for providing backup copies of data and for restoring the database when damage
occurs.
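
A minimal sketch of a backup-and-restore cycle, using the backup method of Python's built-in sqlite3 connections (available from Python 3.7). The account table, the in-memory databases, and the simulated damage are assumptions made for illustration. A production procedure would write backups to separate, durable storage.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
live.execute("INSERT INTO account (balance) VALUES (250.0)")
live.commit()

# Backup: copy the whole database into a second connection.
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

# Simulate a damaging change on the live database.
live.execute("DELETE FROM account")
live.commit()

# Recovery: restore the live database from the backup copy.
backup_copy.backup(live)

restored = live.execute("SELECT balance FROM account").fetchall()
print(restored)
```

The backup is only as current as the moment it was taken, which is why the notes call for comprehensive procedures rather than an occasional copy.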

Organizational Conflicts: A shared database requires a consensus on data definitions and
ownership, as well as on responsibilities for accurate data maintenance. Conflicts over data
definitions, data formats and coding, rights to update shared data, and associated issues are
frequent, and handling them requires organizational commitment to the database approach.
