DBMS New File Record
CHAPTER
Database Concepts
Data: Raw facts, figures, numbers, alphabets and symbols are called data.
Information: Processed, stored or transmitted data is known as information.
Tuple/Record/Row: A single entry in a table is called a tuple.
Field: Each column is identified by a distinct header called a field.
Attribute: It is defined as a named column of a table.
Table/Relation: A table is a collection of data elements organized in terms of
rows and columns.
Entity: An entity is a real-world object that is represented in the database.
Relationship: An association between two database tables is known as a
relationship.
Domain: It is defined as a set of allowed values for one or more attributes.
Degree: The number of attributes in a table.
Cardinality: The number of tuples in a table.
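The two counts defined above can be illustrated with a small sketch. The STUDENT relation and its attribute names below are assumptions made up for illustration only.

```python
# Hypothetical STUDENT relation: attribute names plus a list of tuples (rows).
attributes = ["reg_no", "name", "marks", "subject"]
tuples = [
    (1001, "ABC", 80, "Computer Science"),
    (1002, "XYZ", 70, "Computer Science"),
    (1003, "PQR", 70, "Computer Science"),
]

degree = len(attributes)   # number of attributes (columns)
cardinality = len(tuples)  # number of tuples (rows)

print("Degree:", degree)            # Degree: 4
print("Cardinality:", cardinality)  # Cardinality: 3
```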
Types of keys:
2. Composite key: A key made up of two or more attributes that together
uniquely identify a row in the table is known as a composite key.
3. Foreign key: An attribute in one table that is used to link/refer to another
table.
5. Super key: A super key is a set of one or more attributes that can uniquely
identify any row in a table.
6. Candidate key: A candidate key is a minimal super key, i.e. a subset of a
super key that still ensures there are no duplicate records in a table.
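As a rough sketch of the difference between a super key and a candidate key, the code below tests attribute sets for uniqueness on a small sample table. The ENROLMENT rows and the helper names `is_super_key` and `candidate_keys` are hypothetical. In a real design, keys are declared from the meaning of the data; uniqueness in a few sample rows only suggests a key.

```python
from itertools import combinations

# Hypothetical ENROLMENT table; each dict is one row.
rows = [
    {"reg_no": 1001, "course": "DBMS", "name": "ABC"},
    {"reg_no": 1001, "course": "Java", "name": "ABC"},
    {"reg_no": 1002, "course": "DBMS", "name": "XYZ"},
    {"reg_no": 1003, "course": "DBMS", "name": "ABC"},
]

def is_super_key(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Return the minimal super keys of this particular set of rows."""
    all_attrs = sorted(rows[0])
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            # A set that contains a smaller key is not minimal: skip it.
            if any(set(key) <= set(attrs) for key in found):
                continue
            if is_super_key(rows, attrs):
                found.append(attrs)
    return found

print(is_super_key(rows, ("reg_no",)))                   # False
print(is_super_key(rows, ("course", "name", "reg_no")))  # True
print(candidate_keys(rows))                              # [('course', 'reg_no')]
```

Here no single attribute is unique, so the only candidate key is the pair (course, reg_no), which is also a composite key.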
Suma AJ Dept. of Computer Science Page 1
Data Collection and Data Preparation: Data collection is the first stage of the
data processing cycle. Data needs to be collected from different sources. Data
must be accurate and well defined.
Data preparation involves classifying the collected data into the respective
entities and analyzing the quality of the gathered data.
Data Input: The collected data has to be given to the computer using any of the
input devices.
Data Processing: Data processing means transforming the input data into a more
meaningful form using database queries.
Data Output: The processed information is presented to users in various forms
using any of the output devices.
Data Storage: The processed data is stored for future use, and can be accessed
or retrieved from any of the storage devices.
Types of Relationship:
1. One-to-One [1:1]:
In this type, one entity is associated with only one other entity.
2. One-to-Many [1:N]:
In this type, one entity is associated with many entities.
3. Many-to-One [N:1]:
In this type, many entities are associated with one entity.
4. Many-to-Many [M:N]:
In this type, many entities are associated with many entities.
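The four relationship types can be told apart by counting how many partners each entity has on either side. The sketch below assumes a relationship is given as a list of (left, right) pairs; the function name `classify` and the sample entities are hypothetical.

```python
def classify(pairs):
    """Classify a relationship given as (left_entity, right_entity) pairs."""
    rights_per_left = {}
    lefts_per_right = {}
    for left, right in pairs:
        rights_per_left.setdefault(left, set()).add(right)
        lefts_per_right.setdefault(right, set()).add(left)
    # Does any left entity link to several right entities, and vice versa?
    left_has_many = any(len(s) > 1 for s in rights_per_left.values())
    right_has_many = any(len(s) > 1 for s in lefts_per_right.values())
    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many:
        return "one-to-many"
    if right_has_many:
        return "many-to-one"
    return "one-to-one"

person_passport = [("P1", "PP1"), ("P2", "PP2")]
department_employee = [("D1", "E1"), ("D1", "E2"), ("D2", "E3")]
employee_department = [("E1", "D1"), ("E2", "D1"), ("E3", "D2")]
student_course = [("S1", "C1"), ("S1", "C2"), ("S2", "C1")]

print(classify(person_passport))      # one-to-one
print(classify(department_employee))  # one-to-many
print(classify(employee_department))  # many-to-one
print(classify(student_course))       # many-to-many
```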
Advantages of DBMS:
1. Data Availability:
The data is made available to a wide variety of users in a meaningful
format.
2. Data Integrity:
Data integrity refers to correctness of the data in the database.
3. Controls data redundancy:
In a DBMS environment, if redundancy is present, it can be controlled
by propagating updates to all the places where the redundant data is
present.
4. Concurrent Access:
A DBMS supports parallel access to the database by multiple users.
5. Data Persistence:
Data persistence means that in DBMS all data is maintained as long as it
is not deleted explicitly.
6. Reduced application development time:
A DBMS supports many important functions that are common to many
applications. This makes quick development of applications possible.
7. Data Security:
In a DBMS, only authorized users can access the data, so the data is
secured.
8. Data Sharing:
DBMS allows sharing of data by several users simultaneously.
9. Standardized Data:
A DBMS can ensure that all the data follows the applicable standards and
is stored in a particular standard format.
Disadvantages of DBMS:
1. Danger of overload:
A DBMS is not advisable for small and simple applications.
2. Complexity:
The database system creates additional requirements and complexity.
3. Qualified Users:
Using a DBMS requires professionally trained staff.
4. Lower efficiency:
A database system is general-purpose, multi-user software, so it is often
less efficient than specialized software produced for a single task.
5. Costs:
Using a database system generates new costs, both for the system itself
and for the additional hardware it requires.
Database Users:
Many people are involved in designing, using and maintaining a database.
The people who interact with the database are called database users.
The database users include:
1. End users:
End users are those who use the database in order to query and update it
and to generate reports.
2. System Analysts:
System analysts determine the requirements of end users in order to create
a solution for their business needs, and focus on both the non-technical
and technical aspects.
3. Application programmers:
These users are the developers who design application programs for the
database using the specifications given by the system analysts.
4. Database Administrators [DBA]:
A DBA is a person who has central control over both data and
applications. The DBA is responsible for granting access authorization,
schema definition and modification, new system installation, security
enforcement and administration.
Advantages:
1. Simple and Continuous.
2. Easy to reorganize the file in case of system failure.
3. The storage media required are relatively cheap.
4. Errors in the file remain localized to a particular data block.
5. Very efficient when most of the records must be processed.
Disadvantages:
1. The entire disk must be processed even if only a single file is to be searched.
2. Overall processing is slow.
3. Data redundancy is relatively high.
4. Need to sort the files.
Data Independence:
Data independence is the capacity to change the data at one level without
affecting the data at another level.
There are two types of data independence:
1. Logical data independence:
Logical data independence is a capacity to change the data in
Database Architecture:
The design of a database management system depends heavily on its
architecture.
The whole system is divided into related modules (tiers).
1) 1-Tier architecture:
In 1-tier architecture, the user interacts with the database directly and
uses it.
2) 2-Tier architecture:
The 2-tier architecture of a database is based on the client-server model.
The client, which can be a user or an application program, sends a request
to the server; the server processes the request and returns a response to
the client.
3) 3-Tier architecture:
The 3-tier architecture is the most widely used architecture for web
applications.
The intermediate layer, called the application server or web server, stores
the web connectivity software and the business logic of the application,
which is used to access the right amount of data from the database server.
This layer acts as a medium for passing partially processed data between
the database server and the client.
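The division of work in a 3-tier design can be sketched in a single program, with one function per tier. This is a minimal illustration and not a real deployment: an in-memory SQLite database stands in for the database server, and the table layout, the function names and the pass mark of 35 are assumptions made for the example.

```python
import sqlite3

# Data tier: an in-memory SQLite database stands in for the database server.
def make_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student (reg_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
    )
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [(1001, "ABC", 80), (1002, "XYZ", 30)],
    )
    return conn

# Application tier: business logic; the only layer that talks to the database.
def get_result(conn, reg_no, pass_mark=35):  # pass_mark is a hypothetical rule
    row = conn.execute(
        "SELECT name, marks FROM student WHERE reg_no = ?", (reg_no,)
    ).fetchone()
    if row is None:
        return {"status": "not found"}
    name, marks = row
    return {"status": "ok", "name": name, "passed": marks >= pass_mark}

# Presentation tier: the client only formats what the middle layer returns.
def show(response):
    if response["status"] != "ok":
        return "No such student"
    verdict = "PASS" if response["passed"] else "FAIL"
    return f"{response['name']}: {verdict}"

conn = make_database()
print(show(get_result(conn, 1001)))  # ABC: PASS
print(show(get_result(conn, 1002)))  # XYZ: FAIL
print(show(get_result(conn, 9999)))  # No such student
```

The client never sees SQL or raw marks; it receives only the partially processed answer prepared by the middle layer.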
Data Models:
A data model is a collection of conceptual tools that describes the structure
of data, the operations performed on it, and the relationships between the data.
Advantages:
1. Simplicity
2. Data Security
3. Data Integrity
4. Efficiency
Disadvantages:
1. Implementation complexity.
2. Database management problem.
3. Lack of Structural Independence.
4. Operational anomalies.
Advantages:
1. It is simple and easy to implement.
2. It can handle many relationships within the organization.
3. It has better data independence.
Disadvantages:
1. The database structure is more complex.
2. Lack of structural independence.
Advantages:
1. It is extremely simple and easy to implement.
2. It has a strong mathematical foundation.
3. It has been highly standardized.
4. Powerful, flexible and easy-to-use query capability.
Disadvantages:
1. Needs comparatively powerful hardware.
2. It hides the implementation complexities and the physical data storage
details.
Data mining:
Data mining refers to extracting or mining knowledge from large amounts of
data.
The phases or stages of data mining are:
1. Selection: Selecting the data according to some criteria.
2. Preprocessing: The data cleaning stage, i.e. the data is reconfigured to
ensure a consistent format.
3. Transformation: The data is not just transferred, but transformed into a
form suitable for mining.
4. Interpretation and evaluation: Identified patterns are interpreted into
knowledge which can be used to support human decision making.
Generalization:
Generalization is a process in which a number of different entities are
combined to form a single entity.
Specialization:
Specialization is a process in which a group of entities is divided into
sub-groups based on their characteristics.
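One way to picture the two processes is with class inheritance. This is only an analogy; the entity names Person, Student and Teacher and their attributes are hypothetical.

```python
from dataclasses import dataclass

# Generalization: attributes shared by several entities are combined
# into one higher-level entity.
@dataclass
class Person:
    name: str
    address: str

# Specialization: sub-groups keep the shared attributes and add the
# characteristics that set them apart.
@dataclass
class Student(Person):
    reg_no: int

@dataclass
class Teacher(Person):
    subject: str

s = Student(name="ABC", address="12 Main Road", reg_no=1001)
t = Teacher(name="XYZ", address="7 Park Street", subject="Computer Science")

# Every Student and every Teacher is also a Person.
print(isinstance(s, Person), isinstance(t, Person))  # True True
```

Read bottom-up (Student and Teacher into Person) this is generalization; read top-down (Person into Student and Teacher) it is specialization.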
Relational Algebra:
Relational algebra is a procedural query language that consists of a set of
operations, each of which takes one or more relations as input and produces a
new relation as output.
σ marks = 70 (STUDENT)
Result:
Reg no.  Name  Marks  Subject
1002     XYZ   70     Computer Science
1003     PQR   70     Computer Science
1004     AAD   70     Computer Science
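The selection above (the tuples of STUDENT whose marks equal 70) can be sketched in code. The function name `select` and the use of a list of dictionaries to hold the relation are assumptions for illustration; the rows are the four STUDENT rows listed in these notes.

```python
STUDENT = [
    {"reg_no": 1001, "name": "ABC", "marks": 80, "subject": "Computer Science"},
    {"reg_no": 1002, "name": "XYZ", "marks": 70, "subject": "Computer Science"},
    {"reg_no": 1003, "name": "PQR", "marks": 70, "subject": "Computer Science"},
    {"reg_no": 1004, "name": "AAD", "marks": 70, "subject": "Computer Science"},
]

def select(relation, predicate):
    """Return the tuples of relation for which predicate is true."""
    return [row for row in relation if predicate(row)]

# Keep whole rows; only the number of rows changes, never the columns.
result = select(STUDENT, lambda row: row["marks"] == 70)
for row in result:
    print(row["reg_no"], row["name"], row["marks"], row["subject"])
```

The rows printed are 1002, 1003 and 1004, matching the result table.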
b. PROJECT operation:
The project operation is used to select the required attributes (columns)
from a table while discarding the other attributes.
The symbol pi (π) is used to denote the project operation.
Ex:
STUDENT
Reg no.  Name  Marks  Subject
1001     ABC   80     Computer Science
1002     XYZ   70     Computer Science
1003     PQR   70     Computer Science
1004     AAD   70     Computer Science

Name  Marks
ABC   80
XYZ   70
PQR   70
AAD   70
AAB   85
ACD   35
ADC   40
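A projection onto Name and Marks can be sketched as follows. The function name `project` is an assumption, and the sketch uses only the four STUDENT rows that are listed in full above, so it yields four result rows. Because a relation is a set, duplicate result tuples are dropped.

```python
STUDENT = [
    {"reg_no": 1001, "name": "ABC", "marks": 80, "subject": "Computer Science"},
    {"reg_no": 1002, "name": "XYZ", "marks": 70, "subject": "Computer Science"},
    {"reg_no": 1003, "name": "PQR", "marks": 70, "subject": "Computer Science"},
    {"reg_no": 1004, "name": "AAD", "marks": 70, "subject": "Computer Science"},
]

def project(relation, attributes):
    """Keep only the named attributes and drop duplicate result tuples."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result

name_marks = project(STUDENT, ["name", "marks"])
for row in name_marks:
    print(row["name"], row["marks"])

# Projecting on marks alone shows duplicate elimination: 80 and 70 only.
print(project(STUDENT, ["marks"]))  # [{'marks': 80}, {'marks': 70}]
```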
2. Binary operations:
Operations that operate on two relations are known as binary operations.
a. Cartesian Product operation:
The Cartesian product of two relations R and S, denoted by R × S,
defines a new relation which is the concatenation of each tuple of
relation R with each tuple of relation S.
It is denoted by the symbol ×.
Ex:
EMPLOYEE
EID  NAME  JID
1E   ABC   J1001
2E   XYZ   J1005
3E   PQR   J1020

JOB
JID  JOB
1J   Tester
2J   Manager
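For the EMPLOYEE and JOB relations in this example, the product pairs each of the 3 employee tuples with each of the 2 job tuples, giving 3 × 2 = 6 tuples. A minimal sketch, where the function name `cartesian_product` and the tuple representation are assumptions:

```python
from itertools import product

EMPLOYEE = [("1E", "ABC", "J1001"), ("2E", "XYZ", "J1005"), ("3E", "PQR", "J1020")]
JOB = [("1J", "Tester"), ("2J", "Manager")]

def cartesian_product(r, s):
    """Concatenate every tuple of r with every tuple of s."""
    return [t_r + t_s for t_r, t_s in product(r, s)]

result = cartesian_product(EMPLOYEE, JOB)
for row in result:
    print(*row)
print(len(result))  # 6
```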
b. Union operation:
The union operation combines the tuples of relations R and S, eliminating
duplicate rows. It is denoted by the symbol ∪.
Ex:
CRICKET
ROLL NO.  NAME  AGE
1001      ABC   J1001
1002      XYZ   J1005
1003      PQR   J1020

FOOT BALL
ROLL NO.  NAME  AGE
1004      LMN   J1004
1005      GHI   J1006
1003      PQR   J1020
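The union of the CRICKET and FOOT BALL relations can be sketched as below. The function name `union` is an assumption; the rows are copied from the example as printed. The tuple for roll number 1003 appears in both relations but only once in the result.

```python
CRICKET = [(1001, "ABC", "J1001"), (1002, "XYZ", "J1005"), (1003, "PQR", "J1020")]
FOOTBALL = [(1004, "LMN", "J1004"), (1005, "GHI", "J1006"), (1003, "PQR", "J1020")]

def union(r, s):
    """Tuples that appear in r, in s, or in both, with duplicates removed."""
    result = list(r)
    for row in s:
        if row not in result:
            result.append(row)
    return result

result = union(CRICKET, FOOTBALL)
for row in result:
    print(*row)
print(len(result))  # 5
```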
ROLL ROLL
NAME AGE NAME AGE
NO. NO.