
Data Basis

The document provides an overview of data and information, explaining the importance of data in management and decision-making. It details operations performed on data, types of files, and the organization of files, as well as the structure and components of databases and database management systems (DBMS). Additionally, it discusses the advantages and disadvantages of DBMS, along with basic concepts and terminologies related to databases.

Uploaded by

hamzapubgid4
Copyright
© © All Rights Reserved

CHAPTER 01

DATA BASIS
Data And Information

Data
Data is a collection of facts, figures and statistics related to an object that can be
processed to produce meaningful information.
[Figure: DATA about a student: Roll No., Name, Class, Marks]

Importance of Data
 Used by managers to perform effective and successful management operations
 Provides a view of past activities related to the rise and fall of an organization
 Enables the organization to make better decisions for future activities

Information
Manipulated and processed data is called information,
e.g., students' results prepared from their marks. It is the output of a certain process.

RN.   NAME      CLASS  SUBJECT
101   Ayesha    12TH   Biology
102   Fatima    12TH   Physics
103   Khadijah  12TH   Computer

Data                              Information
Raw facts & statistics            Processed form of data
Used as input in the computer     Output of the computer
Doesn't depend on information     Depends on data
Unorganized, unstructured         Organized, structured

[Figure: DATA → PROCESS → INFO (Raw Data → Calculations → Record)]


Operations Performed On Data
Manipulation of data (after capturing it from different sources) to achieve the required objectives and results. For this purpose, software (a program) is used
to process raw data, which is converted into meaningful information.

1. Data Capturing:
Data must be recorded or captured in some form before it can be processed.
2. Data Manipulation:
The following operations may then be performed on the gathered data.
 Classifying: Organizing data into classes / groups. Items may be assigned predetermined
codes, which can be numeric, alphabetic or alphanumeric.
 Calculations: Arithmetic manipulation of the data.
 Sorting: Data is arranged in a logical sequence (numerically or alphabetically).
 Summarizing: Masses of data are reduced to a more concise and usable form.
3. Managing The Output Results:
Once the data is captured and manipulated, it may be:
 Storing and Retrieval: Data is retained for future reference. Accessing / fetching the
stored data and / or information is the retrieval activity.
 Communication and Reproduction: Data may be transferred from one location or
operation to another for further processing. It is sometimes necessary to copy or
duplicate data, which is called reproduction.
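
The four manipulation activities (classifying, calculations, sorting, summarizing) can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the marks and the pass mark of 40 are assumed sample values.

```python
# Captured raw data: (roll number, name, marks). The marks are made-up sample values.
captured = [(103, "Khadijah", 78), (101, "Ayesha", 35), (102, "Fatima", 91)]

PASS_MARK = 40  # assumed threshold, used only for this illustration

# Classifying: assign each item a predetermined code ("P" = pass, "F" = fail).
classified = [
    (rn, name, marks, "P" if marks >= PASS_MARK else "F")
    for rn, name, marks in captured
]

# Calculations: arithmetic manipulation of the data.
total = sum(marks for _, _, marks in captured)
average = total / len(captured)

# Sorting: arrange the records in a logical sequence (numerically by roll number).
sorted_records = sorted(classified, key=lambda rec: rec[0])

# Summarizing: reduce the data to a concise form (number of records per code).
summary = {}
for _, _, _, code in classified:
    summary[code] = summary.get(code, 0) + 1

print(sorted_records)
print(total, average, summary)
```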
Field, Record & File [Types]
Field:
 A combination of one or more characters
 Represents Smallest unit of data
 Name of each field in a record is unique
 Each field contains one specific piece of information
Record:
 A collection of related fields (facts about something) is called a record
 Treated as a single unit
File:
 A collection of related records used as single unit
 Files are stored on different storage media such as hard disk, USB flash drive
or optical disc (CDs and DVDs)
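
As a rough sketch of how field, record and file nest inside each other, the student table used in these slides can be modeled in Python. The class name `StudentRecord` and the attribute names are illustrative choices, not taken from the original.

```python
from dataclasses import dataclass, fields


@dataclass
class StudentRecord:
    # Each attribute is a field: a uniquely named, single piece of information.
    roll_no: int
    name: str
    student_class: str
    subject: str


# A record is one collection of related fields, treated as a single unit.
# A file is a collection of related records.
student_file = [
    StudentRecord(101, "Ayesha", "12TH", "Biology"),
    StudentRecord(102, "Fatima", "12TH", "Physics"),
    StudentRecord(103, "Khadijah", "12TH", "Computer"),
]

field_names = [f.name for f in fields(StudentRecord)]
print(field_names)
print(len(student_file), "records in the file")
```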

File Types [Usage Point of View]
 Master File: These are the latest updated files, which never become empty once they are created. They maintain information that remains constant over a long period of time.
 Transaction File: Files in which data is recorded prior to the stage of processing. It may be a temporary file, retained until the master file is updated.
 Backup File: Permanent files, kept for the purpose of protecting vital data.

File Types [Functional Point of View]
 Program Files: These files contain the software instructions, i.e. source program files and executable files. Source program files may have an extension such as .cpp and executable files .exe.
 Data Files: These files contain data and are created by the software being used. A few of these are: word processor .doc, .rtf (document), spreadsheet .xls and .wks (worksheet), video files .avi, .mpg etc.
Organization of Files

A technique for physically arranging the records of a file on secondary storage devices.
File organization ways: 01 Sequential Files, 02 Direct or Random Files, 03 Indexed Sequential Files

Sequential Files
 Records are stored on the storage media in a sequence
 Records can be retrieved only in the sequence in which they were stored
 Major disadvantage is very slow access time for a particular record

Direct or Random Files
 Records are not stored in a particular sequence
 The records are stored at a known address or location
 The address or location is calculated from the value of the key field of the record
 Synonym problem → occurs if the same address is calculated to store two or more records
 Faster than sequential file organization for finding a particular record
 Storage media for direct file organization are hard disks and optical discs (CDs, DVDs)
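
The address calculation and the synonym problem for direct files can be illustrated with a small sketch. The slides do not say how the address is calculated; the division-remainder method, the 7 storage slots and the sample keys below are all assumptions chosen so that two keys collide.

```python
NUM_SLOTS = 7  # assumed number of storage locations


def address_of(key: int) -> int:
    """Calculate a storage address from the key field (division-remainder method)."""
    return key % NUM_SLOTS


slots = {}     # address -> list of (key, name) records stored at that address
synonyms = []  # keys whose calculated address was already occupied

for key, name in [(101, "Ayesha"), (102, "Fatima"), (108, "Khadijah")]:
    addr = address_of(key)
    if addr in slots:
        synonyms.append(key)  # synonym problem: same address for two records
    slots.setdefault(addr, []).append((key, name))


def find(key: int):
    """Go straight to the calculated address instead of reading every record."""
    for stored_key, name in slots.get(address_of(key), []):
        if stored_key == key:
            return name
    return None


print(synonyms)
print(find(108))
```

Here synonyms are kept in a list at the shared address; real systems use various overflow techniques, which the slides do not cover.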

Indexed Sequential Files


 Records are stored in ascending or descending order based on the value of a key field
 An index value is generated for each key and mapped to the record
 Index refers to the location or address on a disk where a record is stored
 The index is stored in a file called the index file
 The index file contains the value of:
   Each key field
   Disk address of the record with the corresponding key field
 Index file is updated whenever a record is added or deleted from the file
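
A minimal sketch of the idea, assuming list positions stand in for disk addresses and using Python's `bisect` module for the search in the index. The function names `lookup` and `add_record` are hypothetical.

```python
import bisect

# Data file: records kept in ascending key order.
# The list position stands in for the disk address of the record.
data_file = [(101, "Ayesha"), (102, "Fatima"), (103, "Khadijah")]

# Index file: one (key, address) entry per record.
index_file = [(key, addr) for addr, (key, _) in enumerate(data_file)]


def lookup(key):
    """Search the index for the key, then fetch the record at the stored address."""
    keys = [k for k, _ in index_file]
    pos = bisect.bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return data_file[index_file[pos][1]]
    return None


def add_record(key, name):
    """Insert a record in key order and update the index file to match."""
    keys = [k for k, _ in data_file]
    data_file.insert(bisect.bisect_left(keys, key), (key, name))
    index_file[:] = [(k, addr) for addr, (k, _) in enumerate(data_file)]


print(lookup(102))
add_record(100, "Zainab")
print(lookup(100), lookup(103))
```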
Problems in File Processing System

File Processing System
 This system is used by different organizations to store and manage data
 Each department has its own set of data files and application programs
 Each program defines and manages its own data
 Every process generates its own separate files, and the processes do not communicate with
each other

01 Data Redundancy
Duplication of data in multiple files.

02 Data Inconsistency
Two files may contain different data about the same thing.

03 Data Dependence
The application program has to be changed if the format of the file is changed.

04 Data Integrity
Integrity means reliability and accuracy of data.

05 Lack of Flexibility
Combined reports are very difficult to display as data is scattered in different files.

06 Lack of Data Security
It is not possible to define different access levels for different users.

Database & Facilities

Database
A database is a collection of logically related data sets or files.
For example, a bank may have separate files for its clients, i.e.
 Savings A/C
 Automobile loan
 Personal loan
 Clients' biographic information

Facilities
1. Adding New Files
2. Inserting Data
3. Retrieving Data
4. Updating Data
5. Deleting Data
6. Removing Existing Files
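
The facilities named in this section (adding files, inserting, retrieving, updating and deleting data, removing existing files) correspond closely to basic SQL statements. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table name `savings_account` and its columns are assumptions made up for the bank example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
select_all = "SELECT client_id, balance FROM savings_account ORDER BY client_id"

# 1. Adding a new file (a table in SQL terms)
cur.execute("CREATE TABLE savings_account (client_id INTEGER PRIMARY KEY, balance REAL)")

# 2. Inserting data
cur.executemany("INSERT INTO savings_account VALUES (?, ?)", [(1, 500.0), (2, 1200.0)])

# 3. Retrieving data
rows_after_insert = cur.execute(select_all).fetchall()

# 4. Updating data
cur.execute("UPDATE savings_account SET balance = balance + 100 WHERE client_id = 1")

# 5. Deleting data
cur.execute("DELETE FROM savings_account WHERE client_id = 2")
rows_after_changes = cur.execute(select_all).fetchall()

# 6. Removing an existing file
cur.execute("DROP TABLE savings_account")
remaining_tables = cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

conn.close()
print(rows_after_insert, rows_after_changes, remaining_tables)
```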
Objectives of Database

1. Data Integration
Information is coordinated from different files and operated on a single file.

2. Data Integrity
If a data item is contained in more than one file, then all files must be updated if that item is changed.

3. Data Independence
When the format of a file is changed, then all the programs have to be changed. However, a database allows programs to be modified without reorganization of data.
Components of Database

1. Data
Main purpose of database system is:
 To store data
 To maintain data
 To process data

2. Hardware
 Physical components of computer
 Used to perform different tasks such as input, output, storage and processing
Examples of hardware components
• Secondary storage
• I/O devices
• Processors
• Main memory

3. Software
 Collection of programs used by the computer within database system
DBMS:
Used to create and manage a database in database system
Application Program
Used to access and process the data stored in database
Operating System
Manages all hardware components
Enables all other software to run on the computer

4. Personnel
People related to the database system
Database Administrator (DBA)
Person who is responsible for managing the whole database system
Application Programmer
Person who writes the application program to access data from database
End Users
Persons who perform different operations on database
Access DBMS through application program
Database Models

1. Hierarchical Model
 Records are arranged in a hierarchy like an organizational chart
 Each record type is called a node or segment
 Node represents a particular entity
 Topmost node is root
 Uses parent / child relationship
 Each parent node can have many child nodes
 Each child node may have only one parent node
 One-to-many relationship between data entities
 Kind of structure → Inverted tree
[Figure: Hierarchical Model, entities E1 to E6]

2. Network Model
 Similar to hierarchical model but with one difference
 A child node may have any number of parent nodes
 Child nodes represented by arrows
 Complex diagram to represent a database
 Provides more flexibility than hierarchical model
[Figure: Network Model, entities E1 to E6]

3. Relational Model
 Most commonly used database model
 More flexible than hierarchical and network database models
 Consists of a collection of simple relations or tables
 Relation represents a particular entity to store information about entity
 Relationships are based on the data of the entities
 Relationship between entities is represented by diagram
[Figure: Relational Model, entities E1 to E6]
DBMS (Database Management System)
A data management system (a collection of programs) that is used for storing and manipulating databases is called a database management system
(DBMS). DBMS software (the database manager) controls the overall structure of a database and access to the data itself.

Objectives of Database Management System (DBMS):


 Shareability: Different people and processes must be able to use the same data at the same time.
 Availability: Both the data and DBMS must be easily accessible to the users.
 Evolvability: The ability of the DBMS to change in response to growing user needs and advancing technology.
 Database Integrity: Since data is shared among multiple users, adequate integrity control measures must be maintained.

Features of DBMS
1. Data Dictionary / Repository
 Contains data definitions for a database:
Data Definition is the process of describing the properties of data to be included in a database table
 During data definition, each field is assigned:
Name (must be unique within the table)
Data type (such as Text, Number, Currency, Date/Time)
Properties (field size, format of the field, allowable range, if field is required, etc.)
 Finished specifications for a table become the table structure
 Ensures that data is according to the data definition rules
 Used for data access authorization (Password, etc) for database users
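
Data definition can be tried out with Python's built-in `sqlite3` module. In this sketch SQLite's own catalog stands in for a data dictionary, which is an assumption for illustration; the `student` table, its fields and the 0 to 100 range for marks are made-up examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: each field gets a name, a data type and properties.
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,                          -- field is required
        marks   INTEGER CHECK (marks BETWEEN 0 AND 100) -- allowable range
    )
""")

# The finished specification is the table structure. PRAGMA table_info returns
# one row per field: (cid, name, type, notnull, default value, pk).
table_structure = [
    (name, col_type, bool(notnull))
    for _, name, col_type, notnull, _, _ in conn.execute("PRAGMA table_info(student)")
]

# The DBMS checks incoming data against the definition rules.
try:
    conn.execute("INSERT INTO student VALUES (101, 'Ayesha', 150)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(table_structure)
print("out-of-range marks rejected:", rejected)
```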

2. Utilities
 Programs used to maintain database
 Some of these programs are also used for backup and recovery of data
3. Query Language
A query is a request for specific data from the database
A query language consists of simple, English-like statements that allow users to
specify the data to display, print, store, update, or delete
Structured Query Language (SQL) is a popular query language that allows users
to manage, update, and retrieve data.

4. Report Generator / Report Writer
 Program that is used to generate reports
 Retrieves data from database and displays it to the user in different formats
 Useful and attractive reports can be produced by using the report generator

5. Access Security
 Protection of database from unauthorized access
 DBMS provides several procedures to maintain data security
 Access to the database is allowed through the use of usernames and passwords
 Different users have different levels of access rights to database
 A data entry operator should only be allowed to enter data
 The chairman of the organization should be able to access or delete the data completely

6. Backup and Recovery
 DBMS provides the facility of backup and recovery
 Backup facility is used to store an additional copy of data
 Data can be recovered from the backup file if the original data file is lost or damaged
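
As one concrete example of a backup facility, Python's `sqlite3` module offers `Connection.backup` (Python 3.7 and later). The sketch below uses in-memory databases and a made-up `student` table; a real backup would normally be written to a separate file or device.

```python
import sqlite3

original = sqlite3.connect(":memory:")
original.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
original.execute("INSERT INTO student VALUES (101, 'Ayesha')")
original.commit()

# Backup: store an additional copy of the data.
backup_copy = sqlite3.connect(":memory:")
original.backup(backup_copy)

# Simulate loss of the original data.
original.execute("DELETE FROM student")
original.commit()
lost = original.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# Recovery: rebuild a working database from the backup copy.
restored = sqlite3.connect(":memory:")
backup_copy.backup(restored)
recovered = restored.execute("SELECT roll_no, name FROM student").fetchall()

print(lost, recovered)
```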
Advantages of DBMS

Data Independence
• Data and application programs are separate from each other
• User can change data storage structure without changing the application program

Redundancy Control
• Redundancy means duplication of data in multiple files → causes wastage of storage
• The data in database appears only once and is not duplicated

Consistency Constraints
• Allows user to design complex data structures
• Enables users to view and access data in different ways

Data Security Features:
Provide enhanced security mechanisms for access to data. Database security mechanisms typically go much further in adding more extensive security features.

Backup / Recovery:
Provide a sophisticated backup / recovery mechanism. Backup / recovery capabilities often distinguish a true DBMS from a software package that only claims this facility.

Advanced Capabilities
• Provides advanced capabilities
• Online access - Access data through the Internet
Disadvantages of DBMS

High Cost of DBMS
• Database management software, e.g. Oracle, is expensive to purchase

Higher Hardware Cost
• DBMS software requires powerful hardware to work properly and efficiently
• Requires large memory and high processor speed

Appointing Technical Staff
• Application programmers require a sort of precise training to code efficient programs that will run under a DBMS

Cost of Staff Training
User training is required in all fields:
• Programming
• Application development
• Database administration
• A large amount is spent on staff training

Problem in Wrong Database Environment
A later change in structure, forced by changing requirements, can be costly in terms of conversion and testing of existing programs.

Need of Data Dictionary
• Useful tool but expensive
• Requires installation costs as well as hardware requirements
CHAPTER 02
BASIC CONCEPTS AND
TERMINOLOGIES OF
DATABASE
Construction of Records & Files
Field:
 A combination of one or more characters
 Represents Smallest unit of data
 Name of each field in a record is unique
 Each field contains one specific piece of information
Record:
 A collection of related fields (facts about something) is called a record
 Treated as a single unit
File:
 A collection of related records used as single unit
 Files are stored on different storage media such as hard disk, USB flash drive
or optical disc (CDs and DVDs)

Table / Relation Formation


A table or relation is used to store information about an entity.
An entity is anything about which information is stored in the database.
 An entity may have many attributes, each with a unique name
 Each attribute must have one and only one value.
Example:
STUDENT(Roll No, Name, Class, Subject)

RN.   NAME      CLASS  SUBJECT
101   Ayesha    12TH   Biology
102   Fatima    12TH   Physics
103   Khadijah  12TH   Computer
Properties of a Relation
A relation or table is the basis of a relational database management system. The RELATIONAL MODEL (RM) represents the database as a
collection of relations. A relation has the following properties.
Properties of Relation:
1. No Duplicate Rows/Tuples Exist:
 Each tuple in a relation must be unique
 It ensures that every tuple or row in the relation is identified using the primary key
 It must have one column or set of columns to uniquely identify each row or tuple in the relation
2. Insignificant Tuple/Row Sequence
 The sequence or position of tuples in a relation is insignificant
 The sequence may be changed
 The rows can be retrieved in any order
 A new tuple may be inserted at the beginning, at the end or in the middle of the
relation
3. Insignificant Attribute/Column Sequence
 The sequence of attributes in a relation is insignificant
 This sequence can be changed without changing the meaning or use of the relation
 The columns can be retrieved in any order
 The benefit of this property is that it enables many users to share the same table
without concern for how the table is organized
4. Atomic Values in Columns/Attributes
 An entry at the intersection of each row and column is atomic (a value that cannot be
divided further is called an atomic value)
 There can be only one value in each attribute of a specific row or tuple

View:
 A view is a virtual table that displays the data from one or more tables
 The same data of a database can be viewed in different ways
 It is created by using an SQL query
Purpose
 Keeps data safe and secure from unauthorized and illegal use
 It displays data according to user requirements

CREATE VIEW STUDENT_VIEW AS
SELECT ROLLNO, NAME, CLASS, SUBJECT
FROM STUDENT
WHERE STUDENT_GENDER_CD = 'M';
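
The properties of a relation and the STUDENT_VIEW statement can be exercised with Python's built-in `sqlite3` module. This is a sketch under assumptions: the column types, the row for "Ahmed" and the gender codes are invented so that the view has something to show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        rollno INTEGER PRIMARY KEY,   -- uniquely identifies each tuple
        name TEXT, class TEXT, subject TEXT, student_gender_cd TEXT
    )
""")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?)", [
    (101, "Ayesha", "12TH", "Biology", "F"),
    (104, "Ahmed", "12TH", "Physics", "M"),   # hypothetical extra row
])

# A view is a virtual table defined by an SQL query.
conn.execute("""
    CREATE VIEW student_view AS
    SELECT rollno, name, class, subject
    FROM student
    WHERE student_gender_cd = 'M'
""")
view_rows = conn.execute("SELECT * FROM student_view").fetchall()

# No duplicate tuples: a second row with the same primary key is rejected.
try:
    conn.execute("INSERT INTO student VALUES (101, 'Fatima', '12TH', 'Physics', 'F')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Column sequence is insignificant: columns can be retrieved in any order.
reordered = conn.execute("SELECT name, rollno FROM student ORDER BY rollno").fetchall()

print(view_rows, duplicate_rejected, reordered)
```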
Keys and Its Types

Keys:
 Attribute or set of attributes that uniquely identifies a tuple in a relation
 Defined in relations to access the stored data quickly and efficiently
 Used to create relationships between different relations or tables

Types of keys: 01 Primary Key, 02 Composite Key, 03 Candidate Key, 04 Alternate Key, 05 Foreign Key, 06 Secondary Key, 07 Sort/Control Key

1. Primary Key:
Attribute or set of attributes that uniquely identifies a tuple in a relation
Important points:
 A relation can have only one primary key
 Each value in the primary key attribute must be unique
 Primary key cannot contain null values
2. Composite Key:
 A primary key that consists of two or more attributes
 Used in the situation where a single column is unable to uniquely
identify a record in a relation
3. Candidate Key:
 A candidate key is an attribute or set of attributes that can be used as
the primary key
 A relation may contain many candidate keys
4. Alternate Key:
 The candidate keys that are not selected as the primary key
 May be used to search for unique values, but is not the primary key
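
To make the difference between candidate, primary, composite and alternate keys concrete, the sketch below searches a small sample relation for minimal attribute sets whose values are unique. The `reg_no` attribute and the sample rows are hypothetical. Uniqueness in sample data only suggests a candidate key; whether an attribute is really a key is a design decision about all possible data.

```python
from itertools import combinations

students = [
    {"roll_no": 101, "reg_no": "R-1", "name": "Ayesha", "class": "12TH"},
    {"roll_no": 102, "reg_no": "R-2", "name": "Fatima", "class": "12TH"},
    {"roll_no": 103, "reg_no": "R-3", "name": "Ayesha", "class": "11TH"},
]


def is_unique(rows, attrs):
    """True if no two rows share the same values for the given attributes."""
    values = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(values)) == len(values)


def candidate_keys(rows):
    """Return the minimal attribute sets that are unique in the sample rows."""
    attrs = list(rows[0])
    found = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            is_minimal = not any(set(key) <= set(combo) for key in found)
            if is_minimal and is_unique(rows, combo):
                found.append(combo)
    return found


keys = candidate_keys(students)
primary_key = keys[0]      # the designer selects one candidate key as primary key
alternate_keys = keys[1:]  # the candidate keys that were not selected

print(primary_key, alternate_keys)
```

In this sample, `("name", "class")` shows up as a composite candidate because no single one of those two attributes is unique on its own.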

5. Foreign Key:
 A foreign key is an attribute or set of attributes in a relation whose values
match a primary key in another relation
 The relation in which the foreign key is created is known as the dependent
relation or child relation
 The relation to which the foreign key refers is known as the parent relation
 The key connects to another relation when a relationship is established
between two relations
 A relation may contain many foreign keys
6. Secondary Key:
 Secondary key can be used to access / retrieve records
 Values may not be unique
 One secondary key may refer to many records
Example of Secondary key
An attribute “City” in Student relation can be used to display all students who
live in a specific city
7. Sort/Control Key:
 Sort key is a set of attributes that is used to physically sequence the stored
data
 Also known as control key
 Stored data can be sorted in different ways according to the user
requirement
 An attribute “Name” in Student relation can be used as sort key to display
all students alphabetically by name

INDEXES:
 Data structure used by DBMS to speed up the sorting and searching process
 Indexes may be created on keys (primary, secondary, foreign)
 Created by system developer or DBA
 Some indexes are created automatically in the related tables when relationships are defined
 Indexes are stored in an index file
 Indexes can slow down data entry and editing because the index file
is also updated each time data is added or modified
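
Foreign keys, secondary keys, sort keys and indexes can be tried together with Python's built-in `sqlite3` module. The table layout, the city values and the index name below are assumptions for illustration. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Parent relation, and a child relation holding the foreign key class_id.
conn.execute("CREATE TABLE class (class_id TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE student (
        roll_no  INTEGER PRIMARY KEY,
        name     TEXT,
        city     TEXT,
        class_id TEXT REFERENCES class (class_id)
    )
""")
conn.execute("INSERT INTO class VALUES ('12TH')")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", [
    (101, "Ayesha", "Lahore", "12TH"),
    (102, "Fatima", "Karachi", "12TH"),
    (103, "Khadijah", "Lahore", "12TH"),
])

# Index on the secondary key "city": values are not unique, lookups get faster.
conn.execute("CREATE INDEX idx_student_city ON student (city)")
in_lahore = conn.execute(
    "SELECT name FROM student WHERE city = 'Lahore' ORDER BY name"
).fetchall()

# "name" used as a sort key for the output.
by_name = [row[0] for row in conn.execute("SELECT name FROM student ORDER BY name")]

# A foreign key value with no matching row in the parent relation is rejected.
try:
    conn.execute("INSERT INTO student VALUES (104, 'Zainab', 'Multan', '11TH')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(in_lahore, by_name, orphan_rejected)
```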
Role of Database Users

01 The Data Administrator
 Manages the organization's data.
 Sets rules and requirements for databases.
 Connects users with data staff.

02 The Database Administrator
 Designs and manages databases.
 Makes sure databases are secure and work well.

03 The End-Users
 A user uses computers for specific needs (entertainment, education, or professional tasks).
 They may have moderate knowledge of computers.
 They do not require in-depth technical expertise.
 Their focus is on using the software they need.
CHAPTER 03
DATABASE DESIGN
PROCESS
Steps of Analysis Stage
Steps: Feasibility Study, Requirement Analysis, Project Planning, Data Analysis

Feasibility Study:
 Conducted to investigate the required system
 Determines whether the proposed system is affordable, possible and acceptable
Requirement Analysis
 Collects the requirements for the project (proposed system), including:
   Possible inputs for database
   Required functionality of project
 The users describe their requirements and expectations from the proposed system
Project Planning
 Comprehensive planning and a time schedule must be developed to complete the project successfully
 Cost factors (hardware, software, salaries of team) are taken into consideration
Data Analysis
 Activities of data analysis:
   Data Flow Diagrams
   Decision Tables
   Decision Trees
Data Modeling & Its Ingredients
Model - Representation of real world objects, events and their associations
Data Modeling - Process of identifying data objects and the relationships between them
• E-R model is a popular conceptual data model
Ingredients: Entity, Attribute, Relationship
Entity:
 Anything that participates in the system is known as an entity or object
 It can be a person, place, thing or event
Examples
• Person: TEACHER, PLAYER
• Place: COUNTRY
• Object: VEHICLE
• Event: REGISTRATION, SALE, PURCHASE
 Represented by a rectangle in the data model
 Name of entity is written inside the rectangle
[Figure: Entities TEACHER and STUDENT]
Attribute:
 Characteristics of an entity
 Entity may have many attributes
Example
• Entity - TEACHER
• Attributes - Name, Gender, Last Degree, Appointment Date, Pay
Scale, Telephone etc.
 Represented by an oval in the data model
 Name of attribute is written inside the oval
[Figure: Attributes Name, Gender and Degree of entity TEACHER]
Relationship:
 A logical connection between different entities
 Relationship indicates how the entities are related to each other
 All relationships are bi-directional
Example
• A TEACHER teaches STUDENT(S)
[Figure: Relationship between TEACHER and STUDENT]
Cardinality:
Maximum number of instances of one entity that can be associated with
each instance of another related entity.
The cardinality can be one (1) or many.
Cardinality One
 Indicates single instance of an entity
 Denoted by vertical line │ next to first entity or before second entity
Cardinality Many
 Indicates multiple instances of an entity
 Denoted by crow's foot
Modality:
Minimum number of instances of one entity associated with each instance of
the related entity.
The modality can be ‘0’ or ‘1’.
Modality ‘0’ (zero) → Optional Relationship
• The relationship is called optional when the minimum number is zero
• Denoted by small circle O, after cardinality symbol of first entity or before
cardinality symbol of second entity
Modality ‘1’ (one) → Mandatory Relationship
• The relationship is called mandatory when the minimum number is one
• Denoted by small vertical line │, after cardinality symbol of first entity or
before cardinality symbol of second entity
[Figure: notation legend. Cardinality: line │ means one, crow's foot means many. Modality: circle means optional, line │ means mandatory]
Cardinality of Relationship
Cardinality of Relationship:
 One-To-One
 One-To-Many
 Many-To-Many

[Figure: relationship between school and students]
School side
Cardinality: Indicates that only one school is involved in the relationship
Modality: Mandatory. Indicates that there must be one school to have this relationship
Student side
Cardinality: Indicates that there may be many students in the school
Modality: Optional. Indicates that there may be no students in the school
Types Of Relationship
One-to-One Relationship
This type of relationship is used when:
1. For each instance in first entity class: there is only one instance in the
second entity class
2. For each instance in second entity class: there is only one instance in
the first entity class
One-to-Many Relationship
This type of relationship is used when:
1. For each instance in first entity class: there can be many instances in
the second entity class
2. For each instance in second entity class: there is only one instance in
the first entity class.
Many-to-Many Relationship
This type of relationship is used when:
1. For each instance in first entity class: there can be many instances in
the second entity class
2. For each instance in second entity class: there can be many instances in
the first entity class
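
The three definitions above can be turned into a small check that classifies a relationship from sample pairs of instances. This is an illustrative sketch with made-up data and a hypothetical function name; in practice the type of a relationship comes from the business rules, and sample data can only hint at it. The extra "many-to-one" result is simply a one-to-many relationship seen from the second entity's side.

```python
from collections import defaultdict


def relationship_type(pairs):
    """Classify a relationship from (first_instance, second_instance) pairs."""
    seconds_per_first = defaultdict(set)
    firsts_per_second = defaultdict(set)
    for first, second in pairs:
        seconds_per_first[first].add(second)
        firsts_per_second[second].add(first)

    first_has_many = any(len(s) > 1 for s in seconds_per_first.values())
    second_has_many = any(len(s) > 1 for s in firsts_per_second.values())

    if first_has_many and second_has_many:
        return "many-to-many"
    if first_has_many:
        return "one-to-many"
    if second_has_many:
        return "many-to-one"
    return "one-to-one"


# One school has many students; each student belongs to one school.
print(relationship_type([("School A", "Ayesha"), ("School A", "Fatima")]))
# A teacher teaches many students; a student has many teachers.
print(relationship_type([("T1", "Ayesha"), ("T1", "Fatima"), ("T2", "Ayesha")]))
```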
Database Design Process
Phases: Planning, Analysis, Database Design (Logical Design, Physical Design), Implementation

1. Planning
Begins when a customer requests the development of a database system
Consists of various activities
 Used to identify the resources needed to develop the system
 Also identifies the time limits for the completion of the system
2. Analysis
Used to study the current system in detail
 Identifies how the current system works and
where improvements are required
 It includes a detailed study of various
operations performed in the system
3. Database Design:
 Logical Database Design:
Complete description of data to be stored in database
 Physical Database Design
 Translate the logical database design into physical storage structure
 Implement the database as a set of records, files, indexes etc.
4. Implementation
The database system is implemented after it has been designed and developed. It is installed and executed for the users on a particular type of computer,
such as a server. It also requires network hardware if it is accessed at multiple locations. Users are provided authorization by managers.
Components of Logical Database Design
Steps: Represent Entities, Represent Relationships, Merge The Relations, Normalize The Relations
1. Represent Entities:
 Each entity is represented as a relation in the relational model
 The identifier of the entity type becomes the primary key of the relation
 The remaining attributes of the entity type become non-key attributes of the relation
 ER Model is one of the most widely used models for conceptual design
 ER Model is represented by ER Diagram
2. Represent Relationships
 Each relationship in an ER diagram must be represented in the relational model
 The representation depends upon the nature of the relationship:
   Represent a relationship by making the primary key of one relation a foreign key of another
relation
   Create a separate relation to represent a relationship
3. Merge the Relations
 There may be redundant relations (two or more relations may describe the same
entity type)
 View integration is the process of merging relations to remove the redundancy
 Example:
 EMP1 (EmployeeID, Name, Address, Phone)
 EMP2 (EmployeeID, EmpName, Addr, Designation, DOB)
 The above relations EMP1 and EMP2 describe the same entity EMPLOYEE
 They can be merged into one relation
 EMP (EmployeeID, Name, Address, Phone, Designation, DOB)
4. Normalize the Relations
 The relations that are created in steps (1) and (2) may have:
• Unnecessary redundancy
• Anomalies (errors) that may arise while updating relations
 Normalize the relations to avoid these problems
 Normalization is the process of producing a simpler and more reliable database
structure

Physical Database Design
Last stage of database design process
A process of mapping logical database structure into actual database structure:
 Set of records
 Files
 Indexes etc.

Major Inputs To Physical Database Design
1. Logical Database Structures
Developed during logical database design, such as normalized relations
2. User Processing Requirements
Includes size & frequency of use of database, response time, security,
backup, recovery etc.
3. Characteristics of the DBMS
Includes characteristics of DBMS and other components of computer operating
environment

Elements And Components of Physical Database Design
1. Data Volume and Usage Analysis
 Estimates of database size are used to select physical storage devices and
to estimate storage cost
 Estimates of usage paths or patterns are used to select file organization and
access methods, plans for the use of indexes and strategy for data
distribution
Elements And Components of Physical Database Design
2. Data Distribution Strategy
 For an organization which uses distributed computing networks, it is necessary
to decide at which nodes (or sites) in the network to locate the data physically
 Data allocation or distribution – A process of deciding where to locate the data
Data Distribution Strategies: Centralized, Partitioned, Replicated, Hybrid
1. Centralized: All data are located at a single site. It is fairly easy to do but it
has at least three disadvantages:
Data are not readily accessible at remote sites.
Data communication costs may be high.
The database system fails totally when the central system fails.
2. Partitioned: The database is divided into partitions (fragments). Each
partition is assigned to a particular site. The major advantage of this is that data is
moved closer to local users and so is more accessible.
3. Replicated: A full copy of the database is assigned to more than one site in the
network. This approach maximizes local access but creates update problems,
since each database change must be reliably processed and synchronized at
all of the sites.
4. Hybrid: In this strategy, the database is partitioned into critical and non-critical
fragments. Non-critical fragments are stored at only one site, while
critical fragments are stored at multiple sites.
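
The four strategies differ only in which sites hold a copy of each fragment, which the following sketch makes explicit. The function `allocate`, the round-robin placement of partitions, placing replicas at every site, and the fragment and site names are all assumptions for illustration; the slides only describe the strategies in words.

```python
def allocate(strategy, fragments, sites, critical=()):
    """Return {fragment: [sites holding a copy]} for a distribution strategy."""
    if strategy == "centralized":
        # All data at a single site.
        return {f: [sites[0]] for f in fragments}
    if strategy == "partitioned":
        # Each fragment at exactly one site (assumed round-robin placement).
        return {f: [sites[i % len(sites)]] for i, f in enumerate(fragments)}
    if strategy == "replicated":
        # A full copy at more than one site (here: at every site).
        return {f: list(sites) for f in fragments}
    if strategy == "hybrid":
        # Critical fragments at multiple sites, non-critical ones at one site.
        return {
            f: list(sites) if f in critical else [sites[i % len(sites)]]
            for i, f in enumerate(fragments)
        }
    raise ValueError(f"unknown strategy: {strategy}")


fragments = ["accounts", "loans", "clients"]
sites = ["Lahore", "Karachi"]

for name in ("centralized", "partitioned", "replicated"):
    print(name, allocate(name, fragments, sites))
print("hybrid", allocate("hybrid", fragments, sites, critical=("accounts",)))
```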
3. File Organization
 Technique for arranging the records of file on secondary storage devices
 System designer must recognize several constraints for selecting a file organization
 Physical characteristics of secondary storage devices
 Available operating systems & file management software
 User requirements for storing & accessing data
4. Indexes
 A separate table that contains the organization of records for quick retrieval
 May be created on primary key, secondary key, foreign key etc.
5. Integrity Constraints
 Data integrity means correctness and consistency of data
 Another form of database protection or security
 Integrity is related to the quality of data
 Maintained with help of integrity constraints
 Integrity constraints are rules designed to keep data consistent and correct
 These rules act like a check on the incoming data
Example
Fee of the student should not be greater than 10000
The ID should not be assigned to two or more employees
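
Both example rules can be written as constraints that the DBMS checks on incoming data. The sketch below uses Python's built-in `sqlite3` module; the table and column names and the helper `try_insert` are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Rule 1: the fee of a student should not be greater than 10000.
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        fee     INTEGER CHECK (fee <= 10000)
    )
""")

# Rule 2: the same ID should not be assigned to two or more employees.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")


def try_insert(sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


ok_fee = try_insert("INSERT INTO student VALUES (?, ?)", (101, 9000))
bad_fee = try_insert("INSERT INTO student VALUES (?, ?)", (102, 15000))
first_emp = try_insert("INSERT INTO employee VALUES (?, ?)", (1, "Ali"))
same_id = try_insert("INSERT INTO employee VALUES (?, ?)", (1, "Bilal"))

print(ok_fee, bad_fee, first_emp, same_id)
```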
