Database Concepts
Database: A database is a collection of logically related data organized so that the data can be
easily accessed, managed and updated.
Some popular database software packages are MS-ACCESS, ORACLE, SYBASE, SQL Server,
MySQL and DB2.
APPLICATIONS OF DATABASE:
1. Banking: For customer information, accounts, and loans, and banking transactions.
2. Water meter billing: The RR number and all related details are stored in the database and
accessed through server-based applications.
3. Rail and Airlines: For reservations and schedule information. Airlines were among the first to
use databases in a geographically distributed manner: terminals situated around the world accessed
the central database system through phone lines and other data networks.
4. Colleges : For student information, course registrations, and grades.
5. Credit card transactions: For purchases on credit cards and generation of monthly statements.
6. Telecommunication: For keeping records of calls made, generating monthly bills, maintaining
balances on prepaid calling cards, and storing information about the communication networks.
7. Finance: For storing information about holdings, sales, and purchases of financial instruments
such as stocks and bonds.
8. Sales: For customer, product, and purchase information.
9. Manufacturing: For management of supply chain and for tracking production of items in
factories, inventories of items in warehouses/ stores, and orders for items.
10. Human resources: For information about employee recruitment, salaries, payroll taxes and
benefits, and for generation of paychecks.
Labour cost is high.              Labour cost is economical.
Storage medium is paper.          Storage medium is a secondary storage medium.
1. Data Collection: It is the systematic gathering of data that has been observed, recorded and
organized from various sources.
2. Data Input: The raw data is put into the computer using a keyboard, mouse or other devices
such as the scanner, microphone and the digital camera.
3. Data Processing: Processing is the series of actions or operations on the input data to
generate outputs.
4. Data storage: Data and information should be stored in memory so that it can be accessed
later.
5. Data Output: The result obtained after processing the data must be presented to the user in
user understandable form. The output can be generated in the form of report as hard copy or soft
copy.
6. Communication: Computers nowadays have communication abilities, which increase their
power. With wired or wireless communication connections, data may be input from a distant place,
processed in a remote area, stored in several different places, and then transmitted by modem as
an email or posted to a website where online services are rendered.
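The first five steps above can be sketched in a few lines of Python. This is only an illustration: the marks data, the function names (collect, process, store, report) and the use of a JSON string as the "storage" are assumptions made for the example.

```python
import json

def collect():
    # Steps 1-2 (collection and input): raw data as it might be typed in.
    return [("Asha", "78"), ("Ravi", "91"), ("Meena", "64")]

def process(raw):
    # Step 3 (processing): convert marks to numbers and compute the average.
    marks = {name: int(mark) for name, mark in raw}
    average = sum(marks.values()) / len(marks)
    return {"marks": marks, "average": round(average, 2)}

def store(result):
    # Step 4 (storage): serialise the result; a real system would write it to disk.
    return json.dumps(result)

def report(stored):
    # Step 5 (output): present the stored result in a readable form.
    data = json.loads(stored)
    lines = [f"{name}: {mark}" for name, mark in data["marks"].items()]
    lines.append(f"Average: {data['average']}")
    return "\n".join(lines)

stored = store(process(collect()))
print(report(stored))
```

Each function corresponds to one stage, so a stage can be changed (for example, storing to a file instead of a string) without touching the others.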
Database terms:
* File: A file is the basic unit of storage in a computer system. It is a large collection of
related data.
* Database: A database is a collection of logically related data organized so that the data
can be easily accessed, managed and updated.
* Table: A table is a collection of data elements organized in rows and columns.
* Record: A single entry in a table is called a record or row. A record in a table represents a
set of related data.
* Tuple: A record is also called a tuple.
* Relation : A relation is defined as a table with columns and rows. Data can be stored in the
form of a two dimensional table.
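The terms above can be seen together in a small sketch using Python's built-in sqlite3 module. The table name student and its columns are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE student (regno INTEGER, name TEXT, course TEXT)")

# Each inserted row is one record (tuple) of the relation.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Asha", "BCA"), (102, "Ravi", "BSc")],
)

rows = conn.execute(
    "SELECT regno, name, course FROM student ORDER BY regno"
).fetchall()
for record in rows:
    print(record)  # each record comes back as a Python tuple
```

Here the table is the relation, each row is a record (tuple), and each column holds one data element of the record.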
DATABASE MANAGEMENT SYSTEM
“A DBMS is software that allows the creation, definition and manipulation of a database.”
* A DBMS is a tool used to perform any kind of operation on the data in a database.
* A DBMS also provides protection and security to the database.
* It maintains data consistency when there are multiple users.
* Some examples of popular DBMSs are MySQL, Oracle, Sybase, Microsoft Access and
IBM DB2.
4. Data integrity-
It refers to the validity of data. Data integrity ensures consistent data throughout the
database by following the standard formats that were specified during database
creation.
5. Data security-
As the data is stored at a centralized location, security constraints such as
identifying key attributes and providing passwords and user rights are easy to enforce.
7. Data sharing-
The data stored in the database can be shared among multiple users or application
programs. Moreover, new applications can be developed to use the same stored data.
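The data integrity feature described above can be illustrated with constraints declared when a table is created. The sketch below uses Python's sqlite3 module; the account table, its columns and the helper try_insert are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE account (
        acc_no  INTEGER PRIMARY KEY,
        holder  TEXT NOT NULL,
        balance REAL NOT NULL CHECK (balance >= 0)
    )
    """
)
conn.execute("INSERT INTO account VALUES (1, 'Asha', 500.0)")

def try_insert(values):
    # Returns True if the row was accepted, False if a constraint rejected it.
    try:
        conn.execute("INSERT INTO account VALUES (?, ?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert((2, "Ravi", 100.0)))   # valid row
print(try_insert((3, "Meena", -50.0)))  # violates the CHECK constraint
print(try_insert((4, None, 10.0)))      # violates NOT NULL
```

The rules are specified once, at table creation, and the DBMS then refuses any row that breaks them, whichever user or program attempts the insert.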
DATA ABSTRACTION
“Data abstraction is the process of hiding certain details of how the data is stored and maintained.”
DBMS users:
DBMS users are broadly classified as:
1. Application programmers and system analysts
2. End users
3. Database Administrator (DBA)
4. Database designers
Application programmers and system analysts:
System analysts determine the requirements of end users, especially naive and parametric end users, and
develop specifications for transactions that meet these requirements. Application programmers
implement these specifications as programs.
End users:
People who require access to the database for querying, updating and generating reports. The
database exists primarily for their use.
Database designers:
Database designers are responsible for identifying the data to be stored in the database and for choosing
appropriate structures to represent and store the data.
“Schema objects are database objects that contain data or govern or perform operations on
data.”
“Schema is the overall design or structure of the database”.
DATA INDEPENDENCE:
Data independence is the ability of a database to modify a schema definition at one level without
affecting the schema at the next higher level. The two types of data independence are:
1. Physical data independence
2. Logical data independence
1. Physical data independence:
The ability of a database to modify the physical level definition without affecting the logical level or
the application programs.
2. Logical data independence:
The ability of a database to modify the logical level definition without
affecting the structure at the application level.
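A rough way to see both kinds of independence is to let the application query a view instead of the base table. The sketch below uses Python's sqlite3 module; the names student, student_names and idx_student_name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (regno INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (101, 'Asha')")

# The application only ever queries this view, never the base table.
conn.execute("CREATE VIEW student_names AS SELECT regno, name FROM student")

def application_query():
    return conn.execute("SELECT regno, name FROM student_names").fetchall()

before = application_query()

# Logical-level change: a new column is added to the base table.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
# Physical-level change: an index is added to speed up lookups.
conn.execute("CREATE INDEX idx_student_name ON student (name)")

after = application_query()
print(before == after)
```

The application query is unchanged and returns the same rows after both modifications, which is the property data independence is meant to provide.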
[Figure: Data independence. Logical data independence: 1. Hierarchical, 2. Relational, 3. Network.
Physical data independence: 1. Sequential, 2. Direct access or random access, 3. ISAM.]
Advantages :
1. It is a simple file organization system.
2. There need not be any order for storing the records.
3. Helps in easy access of the records.
Disadvantages :
1. Access takes more time.
2. There are possibilities of duplication of data.
3. The records are not stored in any order of a key field.
4. Any required record cannot be accessed instantly.
DATABASE MODEL
1. A Database model defines the logical design of data.
2. It also defines a set of operations that can be performed.
3. A database model provides the necessary means to achieve data abstraction.
4. The model describes the relationships between different parts of the data.
5. The design of the database can be any one of three models :
Hierarchical Model
Network Model
Relational Model
Advantages:
Simplicity: The relationship between the various layers is logically simple.
Data Security: The data security is provided by the DBMS.
Data Integrity: There is always a link between the parent segment and the child segment
under it.
Efficiency: It is very efficient when the database contains a large number of
one-to-many relationships and when users require a large number of transactions.
Disadvantages:
Implementation complexity
Database management problem
Lack of structural Independence.
Operational Anomalies
NETWORK DATABASE MODEL
Some of the features of the model are:-
1. The first specification of network data model was presented by Conference on Data Systems
Languages (CODASYL) in 1969.
2. In a network model the data is represented by a collection of records, and relationships
among data are represented by links.
3. However, the link in a network data model represents an association between precisely two
records.
4. Each record of a particular record type represents a node.
5. All the nodes are linked to each other without any hierarchy.
6. The data is organized in the form of graphs and some entities can be accessed through several
paths.
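The features above can be sketched as a small graph structure in Python. This is only an illustration of the idea, not how a CODASYL system is implemented; the class NetworkDB and the record identifiers S1, C1 and T1 are made up for the example.

```python
from collections import defaultdict

class NetworkDB:
    def __init__(self):
        self.records = {}               # record id -> record data (a node)
        self.links = defaultdict(set)   # record id -> ids of linked records

    def add_record(self, rec_id, **fields):
        self.records[rec_id] = fields

    def link(self, a, b):
        # A link associates exactly two records; store it in both directions.
        self.links[a].add(b)
        self.links[b].add(a)

    def linked(self, rec_id):
        return sorted(self.links[rec_id])

db = NetworkDB()
db.add_record("S1", kind="student", name="Asha")
db.add_record("C1", kind="course", title="DBMS")
db.add_record("T1", kind="teacher", name="Rao")
db.link("S1", "C1")   # student enrolled in course
db.link("T1", "C1")   # teacher teaches course
db.link("S1", "T1")   # teacher mentors student

# The course C1 can be reached from S1 directly or via T1: several access paths.
print(db.linked("S1"))
```

No record is the root of the others; any node can be a starting point for navigation.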
Advantages:
It is simple and easy to implement.
It can handle many-to-many relationships within the organization.
It has better data independence compared to hierarchical model.
Disadvantages:
The database structure is more complex.
Lack of structural independence
RELATIONAL DATABASE MODEL
Some of the features of the model are:-
1. The relational data model was developed by E. F. Codd in 1970.
2. All data is maintained in the form of two-dimensional tables (generally, known as relations)
consisting of rows and columns.
3. Each row (record) represents an entity and a column (field) represents an attribute of the
entity.
4. The relationship between the two tables is implemented through a common attribute in the
tables.
5. This makes the querying much easier in a relational database system.
6. Oracle, Sybase, DB2, Ingres, Informix and MS-SQL Server are a few of the popular relational
DBMSs.
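Feature 4 above, relating two tables through a common attribute, can be sketched with Python's sqlite3 module. The tables course and student and the shared column course_id are hypothetical names used only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE course  (course_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE student (regno INTEGER PRIMARY KEY, name TEXT, course_id INTEGER);
    INSERT INTO course  VALUES (1, 'BCA'), (2, 'BSc');
    INSERT INTO student VALUES (101, 'Asha', 1), (102, 'Ravi', 2), (103, 'Meena', 1);
    """
)

# The common attribute course_id relates the two tables.
query = """
    SELECT s.name, c.title
    FROM student AS s
    JOIN course AS c ON s.course_id = c.course_id
    ORDER BY s.regno
"""
result = conn.execute(query).fetchall()
print(result)
```

The query states only what is wanted (names with their course titles); the DBMS works out how to match the rows.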
Advantages
1. Prevents Data redundancy
2. Data security: Database administrator has the authority of giving access of data to some particular
users which makes the data secure.
3. Easy to use
Disadvantages
1. Performance: Extraction of results can be slow, which makes this model slower than the others.
2. Memory space: Tables consume a lot of physical memory.
DBMS ARCHITECTURE
The design of a Database Management System depends heavily on its architecture.
It can be centralized, decentralized or hierarchical.
Database architecture is logically divided into three types:
* Logical one-tier (1-tier) Architecture.
* Logical two-tier Client/Server Architecture.
* Logical three-tier Client/Server Architecture.
Single-tier architecture does not provide handy tools for end users; it is mainly used by database
designers and programmers.
ER-Diagram is a visual representation of data that describes how data is related to each other.
Entity:
An Entity can be any object, place, person or class.
o In E-R Diagram, an entity is represented using rectangles.
o Rectangles are named with the entity set they represent.
Attribute:
An Attribute describes a property or characteristic of an entity.
o Attributes are represented by means of ellipses.
o Every ellipse represents one attribute and is directly connected to its entity (rectangle).
o For example, Roll_No, Name and Birth date can be attributes of a student.
Relationship:
A relationship type is a meaningful association between entity types.
o A relationship is represented using a diamond-shaped box.
o Relationship types are represented on the E-R diagram by a series of lines.
[Figure: example E-R diagram; surviving labels: NAME, TEACH, LEARN]
2. One to Many:
It reflects the business rule that one entity is associated with many instances of another entity. For
example, a Student enrolls for only one Course, but a Course can have many Students. The arrows in
the diagram show that one student can enroll for only one course.
3. Many to Many:
The above diagram shows that a student can enroll for more than one course and a course can have many students.
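In a relational database, a many-to-many relationship is commonly stored in a separate linking table, whereas a one-to-many relationship needs only a course column in the student table. The sketch below shows the many-to-many case with Python's sqlite3 module; the table enrolment and all column names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (regno INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id INTEGER PRIMARY KEY, title TEXT);
    -- Linking table: one row per (student, course) pair.
    CREATE TABLE enrolment (
        regno     INTEGER REFERENCES student (regno),
        course_id INTEGER REFERENCES course (course_id),
        PRIMARY KEY (regno, course_id)
    );
    INSERT INTO student VALUES (101, 'Asha'), (102, 'Ravi');
    INSERT INTO course  VALUES (1, 'DBMS'), (2, 'Networks');
    INSERT INTO enrolment VALUES (101, 1), (101, 2), (102, 1);
    """
)

# One student, many courses.
courses_of_asha = [r[0] for r in conn.execute(
    "SELECT c.title FROM enrolment e JOIN course c ON c.course_id = e.course_id "
    "WHERE e.regno = 101 ORDER BY c.course_id")]
# One course, many students.
students_in_dbms = [r[0] for r in conn.execute(
    "SELECT s.name FROM enrolment e JOIN student s ON s.regno = e.regno "
    "WHERE e.course_id = 1 ORDER BY s.regno")]
print(courses_of_asha, students_in_dbms)
```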
KEYS
“Keys are used to establish and identify relationships between tables.”
A key is a set of one or more columns whose combined values are unique among all occurrences
in a given table.
The different types of keys are:
1. Primary key:
It is a field in a table which uniquely identifies each row/record in a database table.
Primary keys must contain unique values.
A primary key column cannot have NULL values.
Ex: In Relation STUDENT, Regno serves as a primary key.
2. Candidate Key:
When more than one attribute or group of attributes can serve as a unique identifier, each of them
is called a candidate key.
3. Alternate Key:
The alternate keys of a table are those candidate keys which are not currently selected as
the primary key. An alternate key is also known as a secondary key.
4. Foreign key :
A key used to link two tables together is called a foreign key. This is sometimes called a
referencing key.
Foreign key is a field that matches the primary key column of another table.
5. Super Key :
A super key is an attribute or set of attributes that uniquely identifies a tuple within a
relation/table; that is, no two rows share the same combination of values for those columns.
A super key is a superset of a candidate key.
6. Composite Key :
A key that consists of two or more attributes that together uniquely identify an entity occurrence is
called a composite key.
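The difference between a super key and a candidate key can be checked mechanically on sample data. The sketch below is plain Python; the sample rows and the helper names is_super_key and candidate_keys are assumptions for illustration. It only inspects the rows given, whereas a real key is a rule that must hold for every row the table may ever contain.

```python
from itertools import combinations

rows = [
    {"regno": 101, "email": "asha@example.com", "name": "Asha", "course": "BCA"},
    {"regno": 102, "email": "ravi@example.com", "name": "Ravi", "course": "BSc"},
    {"regno": 103, "email": "asha2@example.com", "name": "Asha", "course": "BCA"},
]

def is_super_key(rows, columns):
    # A super key has no two rows sharing the same combination of values.
    seen = {tuple(row[c] for c in columns) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    # Candidate keys are the minimal super keys: no smaller key is contained in them.
    columns = sorted(rows[0])
    found = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(columns, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in found)
            if is_super_key(rows, combo) and not contains_smaller_key:
                found.append(combo)
    return found

print(is_super_key(rows, ("regno", "name")))
print(candidate_keys(rows))
```

In this sample, (regno, name) is a super key but not a candidate key, because regno alone is already unique. If regno is chosen as the primary key, email becomes an alternate key.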
DATA WAREHOUSE
“A data warehouse is a repository of an organization’s electronically stored data.”
Data warehouses have evolved through several fundamental stages, such as:
2. Data transformation:
The data transformation layer receives data from the data sources, cleans and standardizes it, and loads it
into the data repository. This is often called “staging” data, as data often passes through a temporary
database whilst it is being transformed.
3. Reporting:
The data in the data warehouse must be available to the organization’s staff if the data warehouse is
to be useful. There are a very large number of applications that perform this function, or reporting
can be custom-developed. Some examples are business intelligence tools, executive information systems,
online analytical processing (OLAP) tools and data mining tools.
4. Metadata:
Metadata, or “data about data”, is used to inform operators and users of the data
warehouse about its status and the information held within it.
5. Operations:
Data warehouse operations comprise the processes of loading, manipulating and extracting data
from the data warehouse. Operations also cover user management, security, capacity management and
related functions.
6. Optional components:
In addition, the following components also exist in some data warehouses:
Dependent data marts.
Logical data marts.
Operational data store.
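The data transformation component described above (receive, clean, standardize, load) can be sketched in plain Python. The field names name and amount, the cleaning rules and the function names clean and transform are all assumptions chosen for the example.

```python
def clean(record):
    # Standardise one raw source record; return None if it cannot be used.
    name = record.get("name", "").strip().title()
    amount = str(record.get("amount", "")).replace(",", "").strip()
    if not name or not amount:
        return None
    try:
        return {"name": name, "amount": float(amount)}
    except ValueError:
        return None

def transform(source_records):
    staging = [clean(r) for r in source_records]   # temporary staging area
    return [r for r in staging if r is not None]   # rows loaded into the repository

raw = [
    {"name": "  asha kumar ", "amount": "1,200.50"},
    {"name": "RAVI", "amount": "300"},
    {"name": "", "amount": "75"},          # rejected: missing name
    {"name": "Meena", "amount": "n/a"},    # rejected: amount is not a number
]
warehouse = transform(raw)
print(warehouse)
```

Only records that survive cleaning reach the repository, so reports built on the warehouse see one consistent format.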
Advantages
1. Enhance end-user access to reports and analysis of information.
2. Increases data consistency.
3. Increases productivity and decreases computing costs.
4. Able to combine data from different sources, in one place.
5. Data warehouses provide an infrastructure that could support changes to data and replication
of the changed data back into the operational systems.
Disadvantages
1. Extracting, cleaning and loading data could be time consuming.
2. Data warehouses can get outdated relatively quickly.
3. Problems with compatibility with systems already in place.
4. Providing training to end-users.
5. Security could develop into a serious issue, especially if the data warehouse is internet
accessible.
6. A data warehouse is usually not static and maintenance costs are high.
DATA MINING
“Data mining is concerned with the analysis and picking out relevant information.”
The computer is responsible for finding the patterns by identifying the underlying rules of
the features in the data.
1. Selection:
Selecting or segmenting the data according to some criteria, for example all those people who own
a car. In this way subsets of the data can be determined.
2. Preprocessing:
This is the data cleaning stage, where information that is deemed unnecessary and may slow down
queries is removed, for example the gender of the patient.
3. Transformation:
The data is not merely transferred, but transformed. For example, demographic overlays are commonly
used in market research. The data is made usable and navigable.
4. Data mining:
This stage is concerned with the extraction of patterns from the data.
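The four phases above can be followed end to end on a toy data set. The sketch below is plain Python; the records, the field names and the choice of counting item pairs as the "pattern" are assumptions made for illustration, not a prescribed mining algorithm.

```python
from collections import Counter
from itertools import combinations

records = [
    {"owns_car": True,  "gender": "F", "items": ["fuel", "snacks", "Map "]},
    {"owns_car": True,  "gender": "M", "items": ["Fuel", "snacks"]},
    {"owns_car": False, "gender": "F", "items": ["bread", "milk"]},
    {"owns_car": True,  "gender": "M", "items": ["fuel", "snacks"]},
]

# 1. Selection: keep only the people who own a car.
selected = [r for r in records if r["owns_car"]]

# 2. Preprocessing: drop a field judged unnecessary for this analysis.
preprocessed = [{"items": r["items"]} for r in selected]

# 3. Transformation: normalise item names into a usable form.
transformed = [sorted({i.strip().lower() for i in r["items"]}) for r in preprocessed]

# 4. Data mining: count which pairs of items occur together most often.
pair_counts = Counter(
    pair for items in transformed for pair in combinations(items, 2)
)
print(pair_counts.most_common(1))
```

Each phase takes the output of the previous one, which mirrors the order in which the phases are listed.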
Five marks questions
1. Explain the data processing cycle.
2. Explain the various data types used in DBMS.
3. Explain data independence in detail.
4. Discuss file organization with respect to physical data independence.
5. Explain the features of database system.
6. Explain DBMS Architecture.
7. Explain database model.
8. List any five types of relational keys.
9. Explain Entity-Relationship in detail.
10. Explain the concept of Data abstraction.
11. Define and explain the phases of data mining.