Database Concepts

Uploaded by

m25990389
CHAPTER – DATABASE CONCEPTS

DATA and INFORMATION


Data is a collection of raw facts, figures and statistics which can be processed to produce meaningful
information.
Information is the processed data with some definite meaning.
For ex. : Student marks are processed to get the final result.
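
The marks example can be sketched in Python as follows. The subject names, the marks and the pass mark of 35 are illustrative assumptions, not values taken from this chapter.

```python
# Raw data: marks of one student (illustrative values).
marks = {"Maths": 72, "Physics": 65, "Chemistry": 58}

PASS_MARK = 35  # assumed pass mark per subject


def final_result(marks, pass_mark=PASS_MARK):
    """Process raw marks (data) into a final result (information)."""
    total = sum(marks.values())
    percentage = total / len(marks)
    passed = all(m >= pass_mark for m in marks.values())
    return {
        "total": total,
        "percentage": percentage,
        "result": "PASS" if passed else "FAIL",
    }


result = final_result(marks)
print(result)
```

The dictionary of marks is the data; the total, percentage and PASS/FAIL verdict are the information produced from it.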

Database : A database is a collection of logically related data organized in such a way that the data can be
easily accessed, managed and updated.
Some popular database software packages are MS Access, Oracle, Sybase, SQL Server,
MySQL and DB2.

APPLICATIONS OF DATABASE:
1. Banking: For customer information, accounts, loans and banking transactions.
2. Water meter billing: The RR number and all related customer details are stored in the database and
used by the server-based billing applications.
3. Rail and airlines: For reservations and schedule information. Airlines were among the first to
use databases in a geographically distributed manner: terminals situated around the world accessed
the central database system through phone lines and other data networks.
4. Colleges: For student information, course registrations and grades.
5. Credit card transactions: For purchases on credit cards and generation of monthly statements.
6. Telecommunication: For keeping records of calls made, generating monthly bills, maintaining
balances on prepaid calling cards, and storing information about the communication networks.
7. Finance: For storing information about holdings, sales and purchases of financial instruments
such as stocks and bonds.
8. Sales: For customer, product and purchase information.
9. Manufacturing: For management of the supply chain and for tracking production of items in
factories, inventories of items in warehouses/stores, and orders for items.
10. Human resources: For information about employee recruitment, salaries, payroll taxes and
benefits, and for generation of pay checks.

The two types of data processing systems are:

1. Manual data processing system
2. Electronic data processing system

Manual data processing                                     | Electronic (computerized) data processing
-----------------------------------------------------------+-----------------------------------------------------------
The volume of data that can be processed in a desirable    | The volume of data that can be processed can be very
time is limited.                                           | large.
Requires a large quantity of paper.                        | Uses a reasonably small amount of paper.
The speed and accuracy at which the job is executed are    | The job is executed faster and more accurately.
limited.                                                   |
Labour cost is high.                                       | Labour cost is economical.
Storage medium is paper.                                   | Storage medium is secondary storage.

DATA PROCESSING CYCLE

The data processing cycle consists of the following specific steps:

1. Data Collection: The systematic gathering of data from various sources, where the data has been
observed, recorded and organized.
2. Data Input: The raw data is put into the computer using a keyboard, mouse or other devices
such as a scanner, microphone or digital camera.
3. Data Processing: Processing is the series of actions or operations performed on the input data to
generate outputs.
4. Data Storage: Data and information should be stored in memory so that they can be accessed
later.
5. Data Output: The result obtained after processing the data must be presented to the user in a
form the user can understand. The output can be generated as a report, either as hard copy or soft
copy.
6. Communication: Computers nowadays have communication ability, which increases their
power. With wired or wireless communication connections, data may be input from a distant place,
processed in a remote area, stored in several different places, and then transmitted by modem as
an email or posted to a website where online services are rendered.

Database terms :
* File : A file is the basic unit of storage in a computer system. A file is a large collection of
related data.

* Database : A database is a collection of logically related data organized in such a way that the data
can be easily accessed, managed and updated.

* Table : A table is a collection of data elements organized in terms of rows and columns.

* Record : A single entry in a table is called a record or row. A record in a table represents a
set of related data. A record is also called a tuple.

* Field : Each column is identified by a distinct header called an attribute or field.

* Domain : The set of permitted values for an attribute (column).

* Entity : An entity is a real-world object, such as a student or a course, about which data is stored.
In a database it is typically represented by a table or a form.

* Instance : The collection of information stored in the database at a particular moment is
called an instance of the database.

* Relation : A relation is defined as a table with columns and rows. Data can be stored in the
form of a two-dimensional table.
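
A minimal sketch of these terms, using Python's built-in sqlite3 module as a stand-in DBMS. The table name, column names and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# A relation (table) with three fields (attributes).
conn.execute("CREATE TABLE student (regno INTEGER, name TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Asha", 82), (102, "Ravi", 67)],  # two records (tuples)
)

cursor = conn.execute("SELECT * FROM student ORDER BY regno")
fields = [col[0] for col in cursor.description]  # field names
records = cursor.fetchall()                      # the table's current instance

print(fields)
print(records)
```

Each tuple in `records` is one record, each name in `fields` is one field, and the rows present at the moment of the query form the current instance.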

Data types of DBMS

1. Integer: Holds whole numbers without fractions.
2. Single and double precision: Hold numbers with fractions. Single precision gives about seven
significant digits for a number; double precision gives about fifteen.
3. Logical data type: Stores data that has only two values, true or false.
4. Characters: Include letters, numbers, spaces, symbols and punctuation. Character fields or variables
store text information like name or address; each character occupies one byte.
5. Strings: A sequence of more than one character. Fixed-length strings range from 0 to 63 KB and
dynamic strings range from 0 to 2 billion characters.
6. Memo data type: Stores more than 255 characters. A memo field can store up to 65536 characters.
Long documents can be stored as OLE objects.
7. Index fields: Used to store relevant information along with the documents. The information entered in an
index field is used to find those documents when needed. The program provides up to 25 user-definable
index fields in an index set. An index field can be a drop-down look-up list, a standard field or an
auto-complete history list.
8. Currency fields: The currency field accepts data in dollar form by default.
9. Date fields: The date field accepts data entered in date format.
10. Text fields: Accept data as an alphanumeric text string.

DATABASE MANAGEMENT SYSTEM
“A DBMS is software that allows the creation, definition and manipulation of a database.”

* A DBMS is a tool used to perform any kind of operation on the data in a database.
* A DBMS also provides protection and security for the database.
* It maintains data consistency when there are multiple users.
* Some examples of popular DBMSs are MySQL, Oracle, Sybase, Microsoft Access and
IBM DB2.

FEATURES OR ADVANTAGES OF DATABASE SYSTEM

1. Centralized data management:
The data is stored at a central location and shared among multiple users. This helps in
easy management of the entire database.

2. Ease of application development:
Programmers can develop any number of applications easily, as the database takes
care of issues like security, redundancy, multiple access and data integrity.

3. Backup and recovery:
The database provides a backup and recovery subsystem that is responsible for recovery
from hardware and software failures.

4. Data integrity:
It refers to the validity of data. Data integrity ensures consistent data throughout the
database by following the standard formats that were specified during database
creation.

5. Data security:
As the data is stored at a centralized location, security constraints such as
identifying key attributes, providing passwords and assigning user rights are easy to enforce.

6. Controlled data redundancy:
Data redundancy occurs in database systems which have a field that is repeated in two
or more tables. Techniques such as normalization and the use of foreign keys can
minimize data redundancy.

7. Data sharing:
The data stored in the database can be shared among multiple users or application
programs. Moreover, new applications can be developed to use the same stored data.

8. Multiple user interface:
Multiple users or multiple applications can share common data. The database will
efficiently handle the requirements of all the users without creating additional data or
duplicating the data.

DATA ABSTRACTION
“Data abstraction is the process of hiding certain details of how the data is stored and maintained.”

In a DBMS, data abstraction is described at three levels.

Physical or internal level

* It deals with the physical representation of the database and describes how the data is physically stored
and organized on the storage medium.
* At this level, various aspects are considered to achieve optimal runtime performance and storage
space utilization.

Logical or conceptual level

* The conceptual level deals with the logical structure of the database. It describes what data is stored in the
database, the relationships among the data, and a complete view of the users' requirements without any
concern for the physical implementation.
* That is, it hides the complexity of physical storage structures.

View or end user or external level

* This is the highest level of abstraction. It deals with the user's view of the database and is therefore also
known as the VIEW LEVEL.
* The external level describes a part of the database for a particular group of users.
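
The three levels can be illustrated with a small sketch using Python's sqlite3 module. The employee table, its columns and the view name are hypothetical; the physical level is whatever storage format SQLite uses internally and never appears in the code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical structure of the data.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 50000), (2, "Ravi", 42000)],
)

# External level: a view that exposes only part of the database
# (no salary column) to one group of users.
conn.execute("CREATE VIEW employee_directory AS SELECT id, name FROM employee")

directory = conn.execute("SELECT * FROM employee_directory ORDER BY id").fetchall()
print(directory)
```

Users of `employee_directory` see only the id and name columns, while the underlying table and its physical storage stay hidden from them.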

DBMS users :
DBMS users are broadly classified as:
1. Application programmers and system analysts
2. End users
3. Database Administrator (DBA)
4. Database designers

Application programmers and system analysts:
System analysts determine the requirements of end users, especially naive and parametric end users, and
develop specifications for transactions that meet these requirements. Application programmers
implement these specifications as programs.

End users :
People who require access to the database for querying, updating and generating reports. The
database exists primarily for their use.

Database Administrator (DBA):
The DBA is responsible for authorizing access to the database, for coordinating and monitoring its use,
and for acquiring the needed software and hardware resources.

Database designers:
Database designers are responsible for identifying the data to be stored in the database and for choosing
appropriate structures to represent and store the data.

“Schema objects are database objects that contain data or govern or perform operations on
data.”
“Schema is the overall design or structure of the database”.

DATA INDEPENDENCE:
Data independence is the ability of a database to modify a schema definition at one level without
affecting the schema at the next higher level. The two types of data independence are:
1. Physical data independence :
The ability of a database to modify the physical level definition (for example, the file organization or
the storage device) without affecting the logical level or the structure at the application level.
2. Logical data independence :
The ability of a database to modify the logical level definition (for example, by adding a new field)
without affecting the structure at the application level.
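
A minimal sketch of physical data independence, assuming SQLite (through Python's sqlite3 module) as the DBMS. Adding an index changes how the data is accessed physically, but the application's query and its result stay the same. The table, data and index name are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (regno INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(3, "Kiran"), (1, "Asha"), (2, "Ravi")],
)

QUERY = "SELECT name FROM student WHERE regno = 2"  # the "application"

before = conn.execute(QUERY).fetchall()

# Physical-level change: add an index to speed up look-ups by regno.
conn.execute("CREATE INDEX idx_student_regno ON student (regno)")

after = conn.execute(QUERY).fetchall()
print(before == after)
```

The query text did not need to change when the index was added, which is the point of physical data independence.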

[Figure: Data independence, divided into physical data independence and logical data independence.
Labels shown: File organization (1. Sequential, 2. Direct access or random access, 3. ISAM);
Data model (1. Hierarchical, 2. Relational, 3. Network); Normalization; ER diagram.]

SEQUENTIAL FILE ORGANIZATION


1. Records are stored one after another in an ascending or descending order determined by the
key field of the records.
2. Example : payroll file where records are stored in the order of employee id.
3. Sequentially organized files that are processed by computer systems are normally stored on
storage media such as magnetic tape, punched cards, or magnetic disks.
4. To access these records the computer must read the file in sequence from the beginning.
5. The first record is read and processed first, then the second record in the file sequence, and so
on.
6. To locate a particular record, the computer program must read each record in sequence and
compare its key field to the one that is needed.
7. The addition of new records takes place only at the end of the file.
8. On average, about half of the file has to be searched to retrieve the desired record from a
sequential file.

Advantages :
1. It is a simple file organization system.
2. In its simplest (serial) form, records are just added one after another, so no special ordering step is
needed when storing them.
3. It is easy to process all the records one after another, as in batch jobs.

Disadvantages :
1. The access time is high.
2. There are possibilities of duplication of data.
3. In a serial file the records are not stored in any order of a key field, which makes searching slower.
4. A required record cannot be accessed instantly.
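
The search procedure described above can be sketched as follows. The payroll records are made-up data, and a Python list stands in for the sequential file.

```python
def sequential_search(records, key):
    """Read records from the beginning until the key is found.

    Returns (record, reads); record is None if the key is absent.
    """
    reads = 0
    for record in records:
        reads += 1
        if record["emp_id"] == key:
            return record, reads
    return None, reads


# Payroll file stored in ascending order of employee id (illustrative data).
payroll = [{"emp_id": i, "salary": 1000 * i} for i in range(1, 11)]

record, reads = sequential_search(payroll, 7)
average_reads = sum(sequential_search(payroll, k)[1] for k in range(1, 11)) / 10
print(reads, average_reads)
```

With ten records, finding a record takes 5.5 reads on average, which matches the statement that about half of the file has to be searched.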

RANDOM ACCESS OR DIRECT ACCESS FILE ORGANIZATION


1. Direct access file organization allows immediate, direct access to individual records in the
file.
2. The records are stored and retrieved using a relative record number, which gives the position
of the record in the file.
3. This type of organization also allows the file to be accessed sequentially.
4. The primary storage in a CPU truly provides for direct access.
5. Direct access storage devices have the capability of directly reaching any location.
6. There are several types of direct access storage devices, including discs and other mass storage.
Advantages:
1. The access to, and retrieval of, a record is quick and direct.
2. Transactions need not be sorted and placed in sequence prior to processing.
3. Best used for online transactions.
Disadvantages:
1. There is an address generation overhead for accessing each record, because of the hashing
function.
2. It may be less efficient in the use of storage space than sequentially organized
files.
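
A minimal sketch of access by relative record number, assuming fixed-length records. The record layout (a 4-byte id followed by a 20-byte name) and the sample data are assumptions, and an in-memory buffer stands in for a file on a direct access storage device.

```python
import io
import struct

RECORD = struct.Struct("<i20s")  # assumed layout: 4-byte id + 20-byte name


def write_record(f, rrn, emp_id, name):
    """Store a record at the position given by its relative record number."""
    f.seek(rrn * RECORD.size)  # jump straight to the record's position
    f.write(RECORD.pack(emp_id, name.encode("ascii")))


def read_record(f, rrn):
    """Fetch one record directly, without reading the records before it."""
    f.seek(rrn * RECORD.size)
    emp_id, raw_name = RECORD.unpack(f.read(RECORD.size))
    return emp_id, raw_name.rstrip(b"\x00").decode("ascii")


data_file = io.BytesIO()  # stands in for a file on a direct access device
for rrn, (emp_id, name) in enumerate([(101, "Asha"), (102, "Ravi"), (103, "Kiran")]):
    write_record(data_file, rrn, emp_id, name)

print(read_record(data_file, 2))
```

Because every record has the same length, the byte position is simply the relative record number multiplied by the record size. This sketch does not show hashing, which is another common way to generate the address.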

INDEX-SEQUENTIAL FILE ORGANIZATION


1. The index sequential file organization is a combination of sequential file organization and an
index file.
2. It is also referred to as ISAM (indexed sequential access method).
3. Data is stored physically in adjacent storage locations, and there is a logical relationship
among the stored data through an ordering field.
4. An additional file called the index file is created, which contains a number of index records.
5. Each record of the index file has two fields:
* The first field is of the same data type as the ordering key field, and
* The second field is a pointer to a disk block (a block address).
Advantages
1. Search time is less.
2. There are fewer index entries than there are records in the data file.
3. Quick access to the records even when the volume of records is high.
Disadvantages
1. An additional file (the index file) has to be created.
2. Storage space is used up in creating and maintaining the index file.
3. Retrieval of data is always indirect, because the search begins in the index file and then
moves to the data file (no direct retrieval).
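
The two-step search (index file first, then data file) can be sketched as below. This is a simplified model, not a real ISAM implementation: Python lists stand in for disk blocks, and the keys and names are made up.

```python
from bisect import bisect_right

# Data file: blocks of records kept in ascending key order (illustrative data).
blocks = [
    [(10, "Asha"), (20, "Ravi")],
    [(30, "Kiran"), (40, "Meena")],
    [(50, "John"), (60, "Sara")],
]

# Index file: one entry per block = (first key in the block, block number).
index = [(block[0][0], block_no) for block_no, block in enumerate(blocks)]
index_keys = [key for key, _ in index]


def isam_lookup(key):
    """Search the index first, then scan only the block it points to."""
    pos = bisect_right(index_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every key in the file
    block_no = index[pos][1]
    for record_key, value in blocks[block_no]:
        if record_key == key:
            return value
    return None


print(isam_lookup(40))
```

The index holds three entries for six records, which illustrates why there are fewer index entries than data records.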

DATABASE MODEL
1. A Database model defines the logical design of data.
2. It also defines a set of operations that can be performed.
3. A database model provides the necessary means to achieve data abstraction.
4. The model describes the relationships between different parts of the data.
5. The design of the database can follow any one of three models:
* Hierarchical Model
* Network Model
* Relational Model

HIERARCHICAL DATA MODEL


Some of the features of the model are:
a. The hierarchical data model is the oldest type of data model, developed by IBM in 1968.
b. This data model organizes the data in a tree-like structure, in which each child node (also
known as a dependent) can have only one parent node.
c. A database based on the hierarchical data model comprises a set of records connected to
one another through links.
d. A link is an association between two or more records.
e. The top of the tree structure consists of a single node that does not have any parent and is
called the root node.
f. The root may have any number of dependents; each of these dependents may have any
number of lower level dependents.
g. Each child node can have only one parent node and a parent node can have any number of
child nodes.
h. The hierarchical model represents one-to-one and one-to-many relationships.
i. A collection of records of the same type is known as a record type.
j. One complete record of each record type represents a node.

Advantages:
* Simplicity: The relationship between the various layers is logically simple.
* Data security: Data security is provided by the DBMS.
* Data integrity: There is always a link between the parent segment and the child segment
under it.
* Efficiency: It is very efficient when the database contains a large number of
one-to-many relationships and when users require a large number of transactions.
Disadvantages:
* Implementation complexity
* Database management problems
* Lack of structural independence
* Operational anomalies
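
A minimal sketch of the tree structure described above. The `Node` class and the node names are hypothetical; the sketch only enforces the rule that a child has exactly one parent.

```python
class Node:
    """One record in a hierarchical database; it has at most one parent."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add_child(self, child):
        # A child node can have only one parent node.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child

    def path(self):
        """Names on the path from the root down to this node."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


college = Node("College")  # root node: it has no parent
science = college.add_child(Node("Science"))
physics = science.add_child(Node("Physics"))
print(physics.path())
```

Every record is reached by following links down from the root, so there is exactly one path to each node.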

NETWORK DATABASE MODEL
Some of the features of the model are:-
1. The first specification of the network data model was presented by the Conference on Data Systems
Languages (CODASYL) in 1969.
2. In a network model the data is represented by a collection of records, and relationships
among data are represented by links.
3. A link in a network data model represents an association between precisely two
records.
4. Each record of a particular record type represents a node.
5. The nodes are linked to each other without any hierarchy.
6. The data is organized in the form of graphs, and some entities can be accessed through several
paths.

Advantages:
* It is simple and easy to implement.
* It can handle many relationships within the organization.
* It has better data independence compared to the hierarchical model.
Disadvantages:
* More complex database structure
* Lack of structural independence
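
A minimal sketch of the graph structure, with made-up supplier and part records. Each link joins exactly two records, and a record may be linked to several others, so it can be reached along more than one path.

```python
from collections import defaultdict

# Each link connects exactly two records (illustrative record names).
links = [
    ("Supplier-A", "Part-1"),
    ("Supplier-B", "Part-1"),
    ("Supplier-A", "Part-2"),
]

graph = defaultdict(set)
for a, b in links:
    graph[a].add(b)
    graph[b].add(a)

# Part-1 is linked to two suppliers, so there are two access paths to it.
owners_of_part1 = sorted(graph["Part-1"])
print(owners_of_part1)
```

In the hierarchical model Part-1 could have only one parent; here it is connected to both suppliers.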

RELATIONAL DATABASE MODEL
Some of the features of the model are:-
1. The relational data model was developed by E. F. Codd in 1970.
2. All data is maintained in the form of two-dimensional tables (generally, known as relations)
consisting of rows and columns.
3. Each row (record) represents an entity and each column (field) represents an attribute of the
entity.
4. The relationship between two tables is implemented through a common attribute in the
tables.
5. This makes querying much easier in a relational database system.
6. Oracle, Sybase, DB2, Ingres, Informix and MS SQL Server are a few of the popular relational
DBMSs.

Advantages
1. Helps to avoid data redundancy.
2. Data security: The database administrator has the authority to give access to data to particular
users only, which makes the data secure.
3. Easy to use.
Disadvantages
1. Performance: Extraction of results can be slow, which can make it a slower database.
2. Memory space: It consumes a lot of physical memory for tables.
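
A minimal sketch of how two tables are related through a common attribute, using Python's sqlite3 module. The table names, columns and rows are assumptions; `course_id` is the common attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE student (regno INTEGER PRIMARY KEY, name TEXT, course_id INTEGER)"
)
conn.executemany("INSERT INTO course VALUES (?, ?)", [(1, "Science"), (2, "Commerce")])
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Asha", 1), (102, "Ravi", 2)],
)

# The two tables are combined through the common attribute course_id.
rows = conn.execute(
    "SELECT s.name, c.title FROM student AS s "
    "JOIN course AS c ON s.course_id = c.course_id "
    "ORDER BY s.regno"
).fetchall()
print(rows)
```

The query names only the tables and the common attribute; it does not say how to navigate from one record to another, which is what makes querying simpler than in the earlier models.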

DBMS ARCHITECTURE
* The design of a database management system depends heavily on its architecture.
* It can be centralized, decentralized or hierarchical.
* Database architecture is logically divided into three types:
o Logical one-tier (1-tier) architecture
o Logical two-tier client/server architecture
o Logical three-tier client/server architecture

One-tier (1-tier) Architecture:

* The DBMS is the only entity: the user sits directly on the DBMS and uses it.
* Any changes done here are made directly on the DBMS itself.
* It does not provide handy tools for end users, so it is mostly database designers and programmers
who use single-tier architecture.

Two-tier Client / Server Architecture:

* Two-tier client/server architecture is used when the user interface program and the application programs
run on the client side.
* An interface called ODBC (Open Database Connectivity) provides an API that allows a client-side
program to call the DBMS.
* Most DBMS vendors provide ODBC drivers. A client program may connect to several DBMSs.
* Some variations of the client are also possible in this architecture. For example, in some DBMSs more
functionality is transferred to the client, including the data dictionary, optimization etc.

Three-tier Client / Server Architecture:

* Three-tier client/server database architecture is a commonly used architecture for web
applications.
* An intermediate layer called the application server or web server stores the web connectivity software and
the business logic (constraints) part of the application, which is used to access the right amount of data from the
database server.
* This layer acts as a medium for sending partially processed data between the database server and
the client.

ENTITY RELATIONSHIP (ER) DIAGRAM

An ER diagram is a visual representation of data that describes how data items are related to each other.
* Entity:
An entity can be any object, place, person or class.
o In an E-R diagram, an entity is represented using a rectangle.
o Rectangles are named with the entity set they represent.

* Attribute:
An attribute describes a property or characteristic of an entity.
o Attributes are represented by means of ellipses.
o Every ellipse represents one attribute and is directly connected to its entity (rectangle).
o For example, Roll_No, Name and Birth date can be attributes of a student.

* Relationship:
A relationship type is a meaningful association between entity types.
o A relationship is represented using a diamond-shaped box.
o On the E-R diagram, a relationship is connected to the entities that take part in it by a series of lines.

[Figure: E-R diagram with the entities STUDENT, COURSE, TEACHER and SUBJECT, the relationships
TEACH and LEARN, and the attributes NAME, REGNO and IDNO.]

A relationship describes the relations between entities.

* There are three types of relationship that exist between entities:
o Binary Relationship
o Recursive Relationship
o Ternary Relationship

* Binary Relationship: It means a relation between two entities.
This is further divided into three types.
1. One to One:
* This type of relationship is rarely seen in the real world.
* Example: one student can enroll for only one course, and a course will also have only one student.
This is not what you will usually see in a relationship.

2. One to Many:
* It reflects the business rule that one entity is associated with many instances of another entity.
* For example, a student enrolls for only one course but a course can have many students.
* The arrows in the diagram show that one student can enroll for only one course.

3. Many to Many:
* Many students can enroll for more than one course, and each course can have many students.
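
A many-to-many relationship is commonly stored with a third, linking table. The sketch below uses Python's sqlite3 module; the `enrolment` table and all the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (regno INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id INTEGER PRIMARY KEY, title TEXT);
    -- Linking table: one row per (student, course) pair.
    CREATE TABLE enrolment (
        regno     INTEGER REFERENCES student (regno),
        course_id INTEGER REFERENCES course (course_id),
        PRIMARY KEY (regno, course_id)
    );
    INSERT INTO student VALUES (101, 'Asha'), (102, 'Ravi');
    INSERT INTO course  VALUES (1, 'Maths'), (2, 'Physics');
    INSERT INTO enrolment VALUES (101, 1), (101, 2), (102, 1);
""")

courses_of_asha = [row[0] for row in conn.execute(
    "SELECT c.title FROM enrolment e JOIN course c ON c.course_id = e.course_id "
    "WHERE e.regno = 101 ORDER BY c.title")]
students_in_maths = [row[0] for row in conn.execute(
    "SELECT s.name FROM enrolment e JOIN student s ON s.regno = e.regno "
    "WHERE e.course_id = 1 ORDER BY s.name")]
print(courses_of_asha, students_in_maths)
```

One student appears in several enrolment rows and one course appears in several enrolment rows, which is exactly the many-to-many case. A one-to-many relationship needs no linking table: a `course_id` column in the student table is enough.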

KEYS
“They are used to establish and identify relations between tables.”
A key is a set of one or more columns whose combined values are unique among all occurrences
in a given table.
The different types of keys are:
1. Primary key:
* It is a field in a table which uniquely identifies each row/record in a database table.
* Primary keys must contain unique values.
* A primary key column cannot have NULL values.
* Ex: In the relation STUDENT, Regno serves as the primary key.

2. Candidate Key:
* When more than one attribute, or group of attributes, can serve as a unique identifier, each of them
is called a candidate key.

3. Alternate Key :
* The alternate keys of a table are those candidate keys which are not currently selected as
the primary key.
* An alternate key is also known as a secondary key.

4. Foreign key :
* A key used to link two tables together is called a foreign key.
* This is sometimes called a referencing key.
* A foreign key is a field that matches the primary key column of another table.

5. Super Key :
* A super key is any set of columns for which no two rows share the same values.
* In other words, it is an attribute or set of attributes that uniquely identifies a tuple within a
relation/table.
* A super key is a superset of a candidate key.

6. Composite Key :
* A key that consists of two or more attributes that together uniquely identify an entity occurrence is
called a composite key.
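
The difference between super keys and candidate keys can be checked on a small table. The helper functions and the student rows below are hypothetical, and the result holds only for this sample data: in a real database, keys are chosen by the designer from the meaning of the data, not discovered from the rows.

```python
from itertools import combinations


def is_superkey(rows, columns):
    """True if no two rows share the same values for the given columns."""
    seen = {tuple(row[c] for c in columns) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Minimal super keys: super keys with no smaller super key inside them."""
    names = list(rows[0])
    found = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in found)
            if is_superkey(rows, combo) and not contains_smaller_key:
                found.append(combo)
    return found


students = [
    {"regno": 101, "email": "asha@example.com", "name": "Asha"},
    {"regno": 102, "email": "ravi@example.com", "name": "Ravi"},
    {"regno": 103, "email": "asha2@example.com", "name": "Asha"},
]
print(candidate_keys(students))
```

Here `regno` and `email` are both candidate keys. If `regno` is chosen as the primary key, `email` becomes an alternate key, and a combination such as (`regno`, `name`) is a super key but not a candidate key.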

DATA WAREHOUSE
“A data warehouse is a repository of an organization’s electronically stored data.”

Data warehouses have evolved through several fundamental stages:

1. Offline operational databases:
Data warehouses in this initial stage are developed by simply copying the database of an
operational system to an offline server, where the processing load of reporting does not affect
the operational system’s performance.

2. Offline data warehouse:
Data warehouses in this stage of evolution are updated on a regular time cycle (usually
daily, weekly or monthly) from the operational systems, and the data is stored in an integrated,
reporting-oriented data structure.

3. Real-time data warehouse:
Data warehouses are updated on a transaction or event basis, every time an operational system
performs a transaction.

4. Integrated data warehouse:
Data warehouses are used to generate activity or transactions that are passed back into the
operational systems for use in the daily activity of the organization.
Components of data warehouses

1. Data sources:
Data sources refer to any electronic repository of information that contains data of interest for
management use or analytics. Data needs to be passed from these systems to the data warehouse
either on a transaction-by-transaction basis for real-time data warehouses or on a regular cycle (e.g.
daily or weekly) for offline data warehouses.

2. Data transformation:
The data transformation layer receives data from the data sources, cleans and standardizes it, and loads it
into the data repository. This is often called “staging” data, as the data often passes through a temporary
database whilst it is being transformed.

3. Reporting:
The data in the data warehouse must be available to the organization’s staff if the data warehouse is
to be useful. There are a very large number of applications that perform this function, or reporting
can be custom-developed. Some examples are business intelligence tools, executive information systems,
online analytical processing (OLAP) tools and data mining.

4. Metadata:
Metadata, or “data about data”, is used to inform operators and users of the data
warehouse about its status and the information held within it.

5. Operations:
Data warehouse operations comprise the processes of loading, manipulating and extracting data
from the data warehouse. Operations also cover user management, security, capacity management and
related functions.

6. Optional components:
In addition, the following components also exist in some data warehouses:
* Dependent data marts.
* Logical data marts.
* Operational data store.
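
The path from data source through transformation into the repository can be sketched as below. The field names, the cleaning rules and the sample rows are assumptions, and a Python list stands in for the data repository.

```python
def transform(raw_rows):
    """Clean and standardize rows from a source system (the staging step)."""
    cleaned = []
    for row in raw_rows:
        name = row["customer"].strip().title()
        if not name:
            continue  # drop rows that have no customer name
        cleaned.append({"customer": name, "amount": float(row["amount"])})
    return cleaned


# Data source: rows as they might arrive from an operational system.
source_rows = [
    {"customer": "  asha rao ", "amount": "120.50"},
    {"customer": "", "amount": "15"},
    {"customer": "RAVI KUMAR", "amount": "80"},
]

warehouse = []                             # stands in for the data repository
warehouse.extend(transform(source_rows))   # load the cleaned rows
print(warehouse)
```

An offline data warehouse would run this kind of job on a regular cycle, while a real-time warehouse would run it for each transaction.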

Advantages
1. Enhances end-user access to reports and analysis of information.
2. Increases data consistency.
3. Increases productivity and decreases computing costs.
4. Able to combine data from different sources in one place.
5. Data warehouses provide an infrastructure that could support changes to data and replication
of the changed data back into the operational systems.

Disadvantages
1. Extracting, cleaning and loading data could be time consuming.
2. Data warehouses can get outdated relatively quickly.
3. Problems with compatibility with systems already in place.
4. Providing training to end-users.
5. Security could develop into a serious issue, especially if the data warehouse is internet
accessible.
6. A data warehouse is usually not static and maintenance costs are high.

DATA MINING
“Data mining is concerned with the analysis and picking out of relevant information.”
It is the computer that is responsible for finding the patterns, by identifying the underlying rules of
the features in the data.

Phases of data mining are:

1. Selection:
Selecting or segmenting the data according to some criteria, for example all those people who own
a car. In this way subsets of the data can be determined.

2. Preprocessing:
This is the data cleaning stage, where information that is deemed unnecessary and may slow down
queries is removed, for example the gender of a patient.

3. Transformation:
The data is not merely transferred, but transformed. For example, demographic overlays are commonly
used in market research. The data is made usable and navigable.

4. Data mining:
This stage is concerned with the extraction of patterns from the data.

5. Interpretation and Evaluation:
The patterns identified by the system are interpreted into knowledge which can be used to support
human decision-making, for example prediction and classification tasks, summarizing the content
of a database or explaining observed phenomena.
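
The five phases can be walked through on a toy data set. Everything here is illustrative: the people, the choice to drop the gender field, the age bands and the use of a simple frequency count in place of a real mining algorithm.

```python
from collections import Counter

people = [
    {"name": "Asha", "age": 34, "owns_car": True, "gender": "F"},
    {"name": "Ravi", "age": 52, "owns_car": True, "gender": "M"},
    {"name": "Kiran", "age": 23, "owns_car": False, "gender": "M"},
    {"name": "Meena", "age": 47, "owns_car": True, "gender": "F"},
]

# 1. Selection: keep only the people who own a car.
selected = [p for p in people if p["owns_car"]]

# 2. Preprocessing: drop a field assumed to be unnecessary for this study.
preprocessed = [{k: v for k, v in p.items() if k != "gender"} for p in selected]

# 3. Transformation: derive an age band that is easier to analyse.
transformed = [
    dict(p, age_band="under 40" if p["age"] < 40 else "40 and over")
    for p in preprocessed
]

# 4. Data mining: here just a frequency count per age band.
pattern = Counter(p["age_band"] for p in transformed)

# 5. Interpretation: report the most common age band among car owners.
most_common_band = pattern.most_common(1)[0][0]
print(pattern, most_common_band)
```

Each step takes the output of the previous one, so the later phases only ever see the selected, cleaned and transformed data.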

Frequently asked questions


One mark questions
1. What is data?
2. What is information?
3. What is database?
4. What is a field?
5. What is a record?
6. What is an entity?
7. What is an instance?
8. What is an attribute?
9. What is domain?
10. What is a relation?
11. What is a table?
12. What is a key?
13. What is data mining?

Two marks questions
1. How does a database help us?
2. How do we get data?
3. Name the data types supported by DBMS.
4. What is the difference between serial and direct access file organization?
5. Give the advantages and disadvantages of Index Sequential Access Method.
6. Classify various types of keys used in Database.
7. What is data warehouse?
8. What is data mining?
Three marks questions
1. Mention the applications of database.
2. List different forms of data(any three)
3. Give the difference between Manual and Electronic file systems.
4. Explain any three components of E-R diagram.
5. What is a relationship? Classify and give example.
6. Explain physical data independence.
7. Explain ISAM with example
8. Explain database users
9. Explain hierarchical data model.
10. Explain relational data model.
11. List the components of data warehouse.

Five marks questions
1. Explain the data processing cycle.
2. Explain the various data types used in DBMS.
3. Explain data independence in detail.
4. Discuss file organization with respect to physical data independence.
5. Explain the features of database system.
6. Explain DBMS Architecture.
7. Explain database model.
8. List any five types of relational keys.
9. Explain Entity-Relationship in detail.
10. Explain the concept of Data abstraction.
11. Define and explain the phases of data mining.
