Data Basis
Data And Information
Data
Data is a collection of facts, figures and statistics related to an object that can be processed to produce meaningful information.
[Figure: DATA, with example items Roll No., Name, Class, Marks]
Importance of Data
Used by managers to perform effective and successful management operations
Provides a view of past activities related to the rise and fall of an organization
Enables the organization to make better decisions for future activities
Information
The manipulated and processed data is called information, e.g., the marks in students' results. It is the output of a certain process.
RN.   NAME       CLASS   SUBJECT
101   Ayesha     12TH    Biology
102   Fatima     12TH    Physics
103   Khadijah   12TH    Computer
Data: raw facts and statistics
Information: processed form of data
1. Data Capturing:
Data must be recorded or captured in some form before it can be processed.
2. Data Manipulation:
The following operations may then be performed on the gathered data.
Classifying: Organizing data into classes/groups. Items may be assigned predetermined
codes, which can be numeric, alphabetic or alphanumeric.
Calculations: Arithmetic manipulation of the data.
Sorting: Data is arranged in a logical sequence (numerically or alphabetically).
Summarizing: Masses of data are reduced to a more concise and usable form.
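The four manipulation operations above can be sketched in a few lines of Python. This is only an illustration: the marks, the grade codes and the cut-off values are hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical captured records: (roll number, name, subject, marks).
records = [
    (103, "Khadijah", "Computer", 78),
    (101, "Ayesha", "Biology", 91),
    (102, "Fatima", "Physics", 64),
]


def grade_code(marks):
    """Classifying: map marks to a predetermined alphabetic code (assumed cut-offs)."""
    return "A" if marks >= 80 else "B" if marks >= 65 else "C"


# Classifying: group student names by their code.
classified = defaultdict(list)
for roll_no, name, subject, marks in records:
    classified[grade_code(marks)].append(name)

# Calculations: arithmetic manipulation of the data.
total = sum(r[3] for r in records)
average = total / len(records)

# Sorting: arrange the records numerically by roll number.
by_roll_no = sorted(records, key=lambda r: r[0])

# Summarizing: reduce the records to a concise form.
summary = {
    "count": len(records),
    "highest": max(r[3] for r in records),
    "average": round(average, 2),
}
```

The raw tuples play the role of data; `classified`, `by_roll_no` and `summary` are the information produced from them.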
3. Managing The Output Results:
Once the data is captured and manipulated, the following may be done with it:
Storing and Retrieval: Data is retained for future reference. Accessing/fetching the
stored data and/or information is the retrieval activity.
Communication and Reproduction: Data may be transferred from one location or
operation to another for further processing. It is sometimes necessary to copy or
duplicate data, which is called reproduction.
Field, Record & File [Types]
Field:
A combination of one or more characters
Represents the smallest unit of data
The name of each field in a record is unique
Each field contains one specific piece of information
Record:
A collection of related fields (facts about something) is called a record
Treated as a single unit
File:
A collection of related records used as a single unit
Files are stored on different storage media such as hard disks, USB flash drives
or optical discs (CDs and DVDs)
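As a rough sketch of how the three levels nest, the Python below models a record as a dataclass whose attributes are the fields, and a file as a CSV holding several such records. The class name `StudentRecord` and its attribute names are assumptions made for the example, and an in-memory buffer stands in for a file on disk.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class StudentRecord:
    # Each attribute is a field: one specific piece of information
    # with a name that is unique within the record.
    roll_no: int
    name: str
    class_name: str
    subject: str


# A record is a collection of related fields treated as a single unit.
records = [
    StudentRecord(101, "Ayesha", "12TH", "Biology"),
    StudentRecord(102, "Fatima", "12TH", "Physics"),
    StudentRecord(103, "Khadijah", "12TH", "Computer"),
]

# A file is a collection of related records. io.StringIO stands in
# for a file stored on a hard disk or other storage medium.
buffer = io.StringIO()
writer = csv.writer(buffer)
field_names = [f.name for f in fields(StudentRecord)]
writer.writerow(field_names)
for record in records:
    writer.writerow(astuple(record))

# Reading the file back yields one dictionary per record.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
```

Note that values read back from a CSV file are strings, so `rows[0]["roll_no"]` is `"101"` rather than the integer 101.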
File Types [Usage Point of View]
Master File: These are the latest updated files, which never become empty once they are created. They maintain information that remains constant over a long period of time.
Transaction File: Files in which data is recorded prior to the stage of processing. It may be a temporary file, retained until the master file is updated.
Backup File: Permanent files kept for the protection of vital data.
File Types [Functional Point of View]
Program Files: These files contain the software instructions, i.e. source program files and executable files. Source program files may have an extension such as .cpp.
Data Files: These files contain data and are created by the software being used. A few of these are: Word Processor .doc, .rtf (document), Spread Sheet .xls and .wks (worksheet), Video files .avi, .mpg, etc.
Organization of Files
Duplication of data in multiple files
The application program has to be changed if the format of a file is changed
Combined reports are very difficult to display as data is scattered in different files
[Figure: example files, e.g. Personal loan and Clients biographic information]
1. Data Integrity: If a data item is contained in more than one file, then all files must be updated if that item is changed.
2. Data Integration: Information is coordinated from different files and operated on as a single file.
3. Data Independence: When the format of a file is changed, then all the programs have to be changed. However, a database allows programs to be modified without reorganization of data.
5. Updating Data
6. Retrieving Data
Components of Database
[Figure: the four components: DATA, HARDWARE, SOFTWARE, PERSONNEL]
1. Data
Main purpose of a database system is:
To store data
To maintain data
To process data
2. Hardware
Physical components of the computer
Used to perform different tasks such as input, output, storage and processing
Examples of hardware components:
• Secondary storage
• I/O devices
• Processors
• Main memory
3. Software
Collection of programs used by the computer within the database system
DBMS:
Used to create and manage a database in the database system
Application Program:
Used to access and process the data stored in the database
Operating System:
Manages all hardware components
Enables all other software to run on the computer
4. Personnel
People related to the database system
Database Administrator (DBA):
Person who is responsible for managing the whole database system
Application Programmer:
Person who writes the application programs to access data from the database
End Users:
Persons who perform different operations on the database
Access the DBMS through application programs
Database Models
1. Hierarchical Model
Records are arranged in a hierarchy like an organizational chart
Each record type is called a node or segment
A node represents a particular entity
The topmost node is the root
Uses parent/child relationships
Each parent node can have many child nodes
Each child node may have only one parent node
One-to-many relationship between data entities
Kind of structure: inverted tree
[Figure: Hierarchical Model, with nodes E1 to E6]
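The parent/child rules of the hierarchical model can be sketched as a small tree in Python. The `Node` class is hypothetical, and the arrangement of E1 to E6 below is assumed for illustration; the exact layout of the original figure is not recoverable.

```python
class Node:
    """A node (segment) with at most one parent and any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # a child node has only one parent
        self.children = []        # a parent node can have many children
        if parent is not None:
            parent.children.append(self)

    def path_from_root(self):
        """Walk up the single-parent chain and return the names from the root down."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return list(reversed(path))


# Assumed arrangement: E1 is the root of the inverted tree.
e1 = Node("E1")
e2 = Node("E2", e1)
e3 = Node("E3", e1)
e4 = Node("E4", e2)
e5 = Node("E5", e2)
e6 = Node("E6", e3)
```

Because every node stores exactly one `parent`, there is only one path from the root to any record. The network model relaxes this rule by allowing several parents per child.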
2. Network Model
Similar to the hierarchical model, with one difference:
A child node may have any number of parent nodes
Child nodes are represented by arrows
Requires a complex diagram to represent a database
Provides more flexibility than the hierarchical model
[Figure: Network Model, with nodes E1 to E6]
3. Relational Model
Most commonly used database model
More flexible than the hierarchical and network database models
Consists of a collection of simple relations or tables
Each relation represents a particular entity and stores information about that entity
Relationships are based on the data of the entities
The relationship between entities is represented by a diagram
[Figure: Relational Model, with entities E1 to E6]
DBMS (Database Management System)
A database management system (DBMS) is a collection of programs used for storing and manipulating databases.
The DBMS software (database manager) controls the overall structure of a database and access to the data itself.
Features of DBMS
1. Data Dictionary / Repository
Contains data definitions for a database:
Data Definition is the process of describing the properties of data to be included in a database table
During data definition, each field is assigned:
Name (must be unique within the table)
Data type (such as Text, Number, Currency, Date/Time)
Properties (field size, format of the field, allowable range, if field is required, etc.)
Finished specifications for a table become the table structure
Ensures that data conforms to the data definition rules
Used for data access authorization (passwords, etc.) for database users
2. Utilities
Programs used to maintain database
Some of these programs are also used for backup and recovery of data
3. Query Language
A query is a request for specific data from the database
A query language consists of simple, English-like statements that allow users to
specify the data to display, print, store, update, or delete
Structured Query Language (SQL) is a popular query language that allows users
to manage, update, and retrieve data.
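A minimal sketch of data definition and querying, using Python's built-in `sqlite3` module as a stand-in DBMS. The table name `student`, its column names and the value "Chemistry" are assumptions for the example; the rows reuse the sample students shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Data definition: each field gets a name, a data type and properties.
conn.execute(
    """
    CREATE TABLE student (
        roll_no    INTEGER PRIMARY KEY,   -- unique, required
        name       TEXT NOT NULL,         -- required field
        class_name TEXT,
        subject    TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (101, "Ayesha", "12TH", "Biology"),
        (102, "Fatima", "12TH", "Physics"),
        (103, "Khadijah", "12TH", "Computer"),
    ],
)

# Retrieve: a query is a request for specific data.
physics = conn.execute(
    "SELECT name FROM student WHERE subject = ?", ("Physics",)
).fetchall()

# Update and delete are expressed in the same English-like style.
conn.execute("UPDATE student SET subject = ? WHERE roll_no = ?", ("Chemistry", 103))
conn.execute("DELETE FROM student WHERE roll_no = ?", (101,))

remaining = conn.execute(
    "SELECT roll_no, subject FROM student ORDER BY roll_no"
).fetchall()
```

The `?` placeholders keep user-supplied values separate from the SQL text, which is the usual way to pass parameters with `sqlite3`.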
Keys:
Attribute or set of attributes that uniquely identifies a tuple in a relation
Defined in relations to access the stored data quickly and efficiently
Used to create relationships between different relations or tables
Types of keys: Primary Key, Composite Key, Candidate Key, Alternate Key, Foreign Key, Secondary Key, Sort/Control Key
1. Primary Key:
Attribute or set of attributes that uniquely identifies a tuple in a relation
Important points:
A relation can have only one primary key
Each value in primary key attribute must be unique
Primary key cannot contain null values
2. Composite Key:
A primary key that consists of two or more attributes
Used in the situation where a single column is unable to uniquely
identify a record in a relation
3. Candidate Key:
A candidate key is an attribute or set of attributes that can be used as
the primary key
A relation may contain many candidate keys
4. Alternate Key:
A candidate key that is not selected as the primary key
May be used to search for unique values, but is not the primary key
5. Foreign Key:
A foreign key is an attribute or set of attributes in a relation whose values
match a primary key in another relation
The relation in which foreign key is created is known as dependent
relation or child relation
The relation to which the foreign key refers is known as parent relation
The key connects to another relation when a relationship is established
between two relations
A relation may contain many foreign keys
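The parent/child link created by a foreign key can be tried out with Python's built-in `sqlite3` module. The `department` and `student` tables, their columns and the sample rows are hypothetical. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, so the sketch enables that first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Parent relation: the foreign key refers to its primary key.
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)

# Child (dependent) relation: dept_id is the foreign key.
conn.execute(
    """
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER REFERENCES department(dept_id)
    )
    """
)

conn.execute("INSERT INTO department VALUES (1, 'Science')")
conn.execute("INSERT INTO student VALUES (101, 'Ayesha', 1)")  # matches a parent row

# A value with no matching primary key in the parent relation is rejected.
try:
    conn.execute("INSERT INTO student VALUES (102, 'Fatima', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The foreign key is what lets the two relations be joined.
joined = conn.execute(
    """
    SELECT s.name, d.dept_name
    FROM student AS s
    JOIN department AS d ON d.dept_id = s.dept_id
    """
).fetchall()
```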
6. Secondary Key:
Secondary key can be used to access/retrieve records
Values may not be unique
One secondary key may refer to many records
Example of Secondary Key:
An attribute “City” in the Student relation can be used to display all students who
live in a specific city
7. Sort/Control Key:
A sort key is a set of attributes that is used to physically sequence the stored
data
Also known as a control key
Stored data can be sorted in different ways according to the user
requirement
An attribute “Name” in the Student relation can be used as a sort key to display
all students alphabetically by name
INDEXES:
Data structure used by the DBMS to speed up the sorting and searching process
Indexes may be created on keys (primary, secondary, foreign)
Created by the system developer or DBA
Some indexes are created automatically in the related tables when relationships are defined
Indexes are stored in an index file
Indexes can slow down data entry and editing because the index file
is also updated each time data is added or modified
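To make the trade-off concrete, here is a small in-memory sketch of an index on the secondary key City together with a sort on Name. It is not how any particular DBMS stores its index file; the city values and the extra student are hypothetical.

```python
from collections import defaultdict

# Stored records of the Student relation (city values are made up).
students = [
    {"roll_no": 101, "name": "Ayesha", "city": "Lahore"},
    {"roll_no": 102, "name": "Fatima", "city": "Karachi"},
    {"roll_no": 103, "name": "Khadijah", "city": "Lahore"},
]

# Index on the secondary key: key value -> positions of matching records.
# One key value may refer to many records because values are not unique.
city_index = defaultdict(list)
for position, student in enumerate(students):
    city_index[student["city"]].append(position)


def students_in(city):
    """Look up records through the index instead of scanning every record."""
    return [students[i]["name"] for i in city_index.get(city, [])]


def add_student(student):
    """Data entry costs more: the index is updated along with the data."""
    students.append(student)
    city_index[student["city"]].append(len(students) - 1)


add_student({"roll_no": 104, "name": "Zainab", "city": "Karachi"})

# Sort key: present the stored data alphabetically by name.
by_name = [s["name"] for s in sorted(students, key=lambda s: s["name"])]
```

`students_in` touches only the positions listed in the index, while `add_student` shows the extra write that every insertion incurs.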
Data Distribution Strategies
1. Centralized: All data are located at a single site. It is fairly easy to do but it
has at least three disadvantages:
Data are not readily accessible at remote sites.
Data communication costs may be high.
The database system fails totally when the central system fails.
2. Partitioned: The database is divided into partitions (fragments). Each
partition is assigned to a particular site. The major advantage is that data is
moved closer to local users and so is more accessible.
3. Replicated: Full copy of database is assigned to more than one site in the
network. This approach maximizes local access but creates update problems,
since each database change must be reliably processed and synchronized at
all of the sites.
4. Hybrid: In this strategy, the database is partitioned into critical and
non-critical fragments. Non-critical fragments are stored at only one site, while
critical fragments are stored at multiple sites.
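The four strategies differ only in which sites hold each fragment, which the following sketch makes explicit. The site names, the fragment names and the round-robin assignment are assumptions for illustration; in the hybrid case "multiple sites" is simplified here to "all sites".

```python
SITES = ["site_a", "site_b", "site_c"]  # hypothetical network sites


def place_fragments(fragments, strategy):
    """Return {fragment name: list of sites holding a copy} for a strategy.

    `fragments` is a list of (name, is_critical) pairs.
    """
    placement = {}
    for i, (name, is_critical) in enumerate(fragments):
        one_site = [SITES[i % len(SITES)]]  # simple round-robin assignment
        if strategy == "centralized":
            placement[name] = [SITES[0]]        # everything at a single site
        elif strategy == "partitioned":
            placement[name] = one_site          # each fragment at one site
        elif strategy == "replicated":
            placement[name] = list(SITES)       # full copy at every site
        elif strategy == "hybrid":
            # Critical fragments at multiple sites, non-critical at one.
            placement[name] = list(SITES) if is_critical else one_site
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return placement


fragments = [("accounts", True), ("audit_log", False), ("branch_info", False)]
hybrid = place_fragments(fragments, "hybrid")
```

The number of sites listed per fragment is also the number of copies that must be synchronized on every update, which is why replication raises update costs.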
3. File Organization
Technique for arranging the records of a file on secondary storage devices
The system designer must recognize several constraints when selecting a file organization:
Physical characteristics of secondary storage devices
Available operating systems & file management software
User requirements for storing & accessing data
4. Indexes
A separate table that contains the organization of records for quick retrieval
May be created on a primary key, secondary key, foreign key, etc.
5. Integrity Constraints
Data integrity means correctness and consistency of data
Another form of database protection or security
Integrity is related to the quality of data
Maintained with help of integrity constraints
Integrity constraints are rules designed to keep data consistent and correct
These rules act like a check on the incoming data
Examples:
The fee of a student should not be greater than 10000
The same ID should not be assigned to two or more employees
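Both example rules can be expressed as constraints that the DBMS checks on incoming data. The sketch below uses Python's built-in `sqlite3` module; the table `student_fee` and its columns are hypothetical, and for brevity the uniqueness rule is shown on a student ID rather than an employee ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student_fee (
        student_id INTEGER PRIMARY KEY,                 -- an ID cannot be assigned twice
        fee        INTEGER NOT NULL CHECK (fee <= 10000) -- fee must not exceed 10000
    )
    """
)


def try_insert(student_id, fee):
    """Return True if the row passes all integrity constraints, else False."""
    try:
        conn.execute("INSERT INTO student_fee VALUES (?, ?)", (student_id, fee))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(1, 9000),    # valid row
    try_insert(2, 12000),   # violates the CHECK constraint on fee
    try_insert(1, 5000),    # violates uniqueness of the ID
]
```

Only the first row is stored; the other two are rejected before they can make the data inconsistent.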