Database Management
Data:
Information:
Database: A database is a collection of interrelated data organized so that the data can be retrieved, inserted, and deleted efficiently. The data is
typically organized into tables, schemas, views, reports, and so on. For example, a college database organizes data about the admin, staff, students, and
faculty.
Database Management System: A database management system (DBMS) is software used to manage a database. For example, MySQL, Oracle, and
MS Access are popular database management systems used in many different applications.
Data hierarchy refers to the systematic organization of data, often in a hierarchical form. Data organization involves characters, fields, records, files and
so on.
Components of the Data Hierarchy
• A data field holds a single fact or attribute of an entity. Consider a date field, e.g. "19 September 2004". This can be treated as a single date field
(e.g. birthdate), or three fields, namely, day of month, month and year.
• A record is a collection of related fields. An Employee record may contain a name field(s), address fields, birthdate field and so on.
• A file is a collection of related records. If there are 100 employees, then each employee would have a record (e.g. called Employee Personal Details
record) and the collection of 100 such records would constitute a file (in this case, called Employee Personal Details file).
• Files are integrated into a database. This is done using a Database Management System. If there are other facets of employee data that we wish
to capture, then other files such as Employee Training History file and Employee Work History file could be created as well.
Illustration of the data hierarchy
Purpose of Database Systems
• The collection of data, usually referred to as the database, contains information relevant to an enterprise.
• The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient.
• A general-purpose DBMS is designed to allow the definition, creation, querying, update, and administration of databases.
Advantages of DBMS
• Controls data redundancy: A DBMS can control data redundancy because all the data is stored in a single, centrally managed database
rather than being duplicated across separate files.
• Data sharing: In a DBMS, the data can be shared among multiple authorized users of an organization.
• Easy maintenance: The database is easy to maintain because of the centralized nature of the database system.
• Reduced time: It reduces development time and maintenance needs.
• Backup: It provides backup and recovery subsystems which automatically back up data to protect against hardware and software failures and
restore the data if required.
• Multiple user interfaces: It provides different types of user interfaces, such as graphical user interfaces and application program interfaces.
Disadvantages of DBMS
• Cost of hardware and software: A DBMS requires a high-speed processor and a large amount of memory to run.
• Size: It occupies a large amount of disk space and needs a large amount of memory to run efficiently.
• Complexity: A database system creates additional complexity and requirements.
• Higher impact of failure: A failure has a large impact because in most organizations all the data is stored in a single
database; if that database is damaged by a power failure or database corruption, the data may be lost forever.
File System
A file-processing system is supported by a conventional operating system. The system stores permanent records in various files, and it needs
different application programs to extract records from, and add records to the appropriate files. Before database management systems
(DBMSs) were introduced, organizations usually stored information in such systems.
Disadvantages of the file system:
• Data redundancy and inconsistency
• Difficulty in accessing data
• Data isolation
• Integrity problems
• Atomicity problems
• Security problems
Data redundancy and inconsistency: Since different programmers create the files and application programs over a long period, the various files
are likely to have different structures and the programs may be written in several programming languages. Moreover, the same information may
be duplicated in several places (files). For example, if a student has a double major (say, music and mathematics) the address and telephone
number of that student may appear in a file that consists of student records of students in the Music department and in a file that consists of
student records of students in the Mathematics department. This redundancy leads to higher storage and access cost. In addition, it may lead
to data inconsistency; that is, the various copies of the same data may no longer agree. For example, a changed student address may be reflected
in the Music department records but not elsewhere in the system.
Difficulty in accessing data: Suppose that one of the university clerks needs to find out the names of all students who live within a particular
postal-code area. The clerk asks the data-processing department to generate such a list. Because the designers of the original system did not
anticipate this request, there is no application program on hand to meet it. There is, however, an application program to generate the list of all
students. The university clerk has now two choices: either obtain the list of all students and extract the needed information manually or ask a
programmer to write the necessary application program. Both alternatives are obviously unsatisfactory. Suppose that such a program is written,
and that, several days later, the same clerk needs to trim that list to include only those students who have taken at least 60 credit hours. As
expected, a program to generate such a list does not exist. Again, the clerk has the preceding two options, neither of which is satisfactory. The
point here is that conventional file-processing environments do not allow needed data to be retrieved in a convenient and efficient manner.
More responsive data-retrieval systems are required for general use.
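To make the contrast concrete, here is a minimal sketch of how the clerk's two requests look when the data sits in a DBMS with a query language. It uses Python's built-in sqlite3 module; the student table, its column names, and the sample rows are assumptions made up for illustration, not part of the university system described above.

```python
import sqlite3

# Hypothetical student table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, name TEXT, postal_code TEXT, tot_cred INTEGER)"
)
conn.executemany(
    "INSERT INTO student (name, postal_code, tot_cred) VALUES (?, ?, ?)",
    [("Asha", "110001", 72), ("Bilal", "110001", 45), ("Chen", "560034", 90)],
)

# First request: students living in a given postal-code area.
in_area = conn.execute(
    "SELECT name FROM student WHERE postal_code = ? ORDER BY name",
    ("110001",),
).fetchall()

# Follow-up request: the same list, trimmed to students with at least 60 credit hours.
in_area_60 = conn.execute(
    "SELECT name FROM student WHERE postal_code = ? AND tot_cred >= 60 ORDER BY name",
    ("110001",),
).fetchall()

print(in_area)     # [('Asha',), ('Bilal',)]
print(in_area_60)  # [('Asha',)]
```

Neither request needed a new application program; each is a one-line change to a query.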
Data isolation: Because data are scattered in various files, and files may be in different formats, writing new application programs to retrieve the
appropriate data is difficult.
Integrity problems: The data values stored in the database must satisfy certain types of consistency constraints. Suppose the university maintains an
account for each department, and records the balance amount in each account. Suppose also that the university requires that the account balance of
a department may never fall below zero. Developers enforce these constraints in the system by adding appropriate code in the various application
programs. However, when new constraints are added, it is difficult to change the programs to enforce them. The problem is compounded when
constraints involve several data items from different files.
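As a rough sketch of the alternative, a DBMS lets the non-negative balance rule be declared once, with the data, instead of being repeated in every application program. The example below uses Python's sqlite3 module; the table name dept_account and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint is stated once, in the table definition (names are illustrative).
conn.execute(
    "CREATE TABLE dept_account ("
    "dept_name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO dept_account VALUES ('Music', 300)")
conn.commit()

try:
    # Any program that tries to overdraw the account is stopped by the database.
    conn.execute(
        "UPDATE dept_account SET balance = balance - 500 WHERE dept_name = 'Music'"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

balance = conn.execute(
    "SELECT balance FROM dept_account WHERE dept_name = 'Music'"
).fetchone()[0]
print(rejected, balance)  # True 300
```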
Atomicity problems: A computer system, like any other device, is subject to failure. In many applications, it is crucial that, if a failure occurs, the data
be restored to the consistent state that existed prior to the failure. Consider a program to transfer $500 from the account balance of department A
to the account balance of department B. If a system failure occurs during the execution of the program, it is possible that the $500 was removed from
the balance of department A but was not credited to the balance of department B, resulting in an inconsistent database state. Clearly, it is essential
to database consistency that either both the credit and debit occur, or that neither occur. That is, the funds transfer must be atomic—it must happen
in its entirety or not at all. It is difficult to ensure atomicity in a conventional file-processing system.
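The transfer described above can be sketched with a database transaction. This is an illustration under assumptions: the dept_account table, the transfer function, and the simulated failure are all hypothetical, and Python's sqlite3 module stands in for a full DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dept_account (dept_name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO dept_account VALUES (?, ?)", [("A", 1000), ("B", 200)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst; both updates take effect together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE dept_account SET balance = balance - ? WHERE dept_name = ?",
                (amount, src),
            )
            if fail_midway:
                # Stand-in for a crash between the debit and the credit.
                raise RuntimeError("simulated failure")
            conn.execute(
                "UPDATE dept_account SET balance = balance + ? WHERE dept_name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT dept_name, balance FROM dept_account"))


failed = transfer(conn, "A", "B", 500, fail_midway=True)
after_failure = balances(conn)   # debit was rolled back
succeeded = transfer(conn, "A", "B", 500)
after_success = balances(conn)

print(failed, after_failure)     # False {'A': 1000, 'B': 200}
print(succeeded, after_success)  # True {'A': 500, 'B': 700}
```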
Security problems: Not every user of the database system should be able to access all the data. For example, in a university, payroll personnel need
to see only that part of the database that has financial information. They do not need access to information about academic records. But, since
application programs are added to the file-processing system in an ad hoc manner, enforcing such security constraints is difficult.
DBMS vs File System
Some applications of DBMS
Enterprise Information:
• Sales: For customer, product, and purchase information.
• Accounting: For payments, receipts, account balances, assets and other accounting information.
• Human resources: For information about employees, salaries, payroll taxes, and for generation of paychecks.
• Manufacturing: For management of the supply chain and for tracking production of items in factories, inventories of items in warehouses and
stores, and orders for items.
• Online retailers: For sales data noted above plus online order tracking, generation of recommendation lists, and maintenance of online product
evaluations.
Banking and Finance:
• Banking: For customer information, accounts, loans, and banking transactions.
• Credit card transactions: For purchases on credit cards and generation of monthly statements.
• Finance: For storing information about holdings, sales, and purchases of financial instruments such as stocks and bonds; also for storing real-
time market data to enable online trading by customers and automated trading by the firm.
Universities: For student information, course registrations, and grades (in addition to standard enterprise information such as
• Airlines: For reservations and schedule information. Airlines were among the first to use databases in a geographically distributed manner.
Telecommunication: For keeping records of calls made, generating monthly bills, maintaining balances on prepaid calling cards, and storing information
about the communication networks.
View of Data: A major purpose of a database system is to provide users with an abstract view of the data. That is, the system hides certain details of
how the data are stored and maintained.
Data Abstraction: For the system to be usable, it must retrieve data efficiently. The need for efficiency has led designers to use complex data structures
to represent data in the database. Since many database-system users are not computer trained, developers hide the complexity from users through
several levels of abstraction, to simplify users’ interactions with the system:
Levels of Abstraction
• View level: application programs hide details of data types. Views can also hide
information (such as an employee’s salary) for security purposes.
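A small sketch of the view level, assuming a hypothetical employee table and a view named employee_public. It uses Python's sqlite3 module. SQLite has no user accounts, so this only shows how a view leaves the salary column out; in a multi-user DBMS, access rights would additionally restrict users to the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical base table that includes a sensitive column.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'Sales', 52000)")

# The view exposes everything except salary.
conn.execute("CREATE VIEW employee_public AS SELECT id, name, dept_name FROM employee")

cur = conn.execute("SELECT * FROM employee_public")
view_columns = [d[0] for d in cur.description]
rows = cur.fetchall()

print(view_columns)  # ['id', 'name', 'dept_name']
print(rows)          # [(1, 'Asha', 'Sales')]
```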
View of Data
Entity-Relationship Model
Relational Model: The relational model uses a collection of tables to represent both data and the relationships among those data. Each table has
multiple columns, and each column has a unique name. Tables are also known as relations. The relational model is an example of a record-based
model. Record-based models are so named because the database is structured in fixed-format records of several types. Each table contains records of
a particular type. Each record type defines a fixed number of fields, or attributes. The columns of the table correspond to the attributes of the record
type. The relational data model is the most widely used data model.
Entity-Relationship Model: The entity-relationship (E-R) data model uses a collection of basic objects, called entities, and relationships among these
objects. An entity is a “thing” or “object” in the real world that is distinguishable from other objects.
Object-Based Data Model: Object-oriented programming (especially in Java, C++, or C#) has become the dominant software-development
methodology. This led to the development of an object-oriented data model that can be seen as extending the E-R model with notions of
encapsulation, methods (functions), and object identity. The object-relational data model combines features of the object-oriented data model and
relational data model.
Semistructured Data Model: The semistructured data model permits the specification of data where individual data items of the same type may
have different sets of attributes. This is in contrast to the data models mentioned earlier, where every data item of a particular type must have the
same set of attributes. The Extensible Markup Language (XML) is widely used to represent semistructured data.
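To illustrate, the sketch below parses a tiny XML document in which two items of the same type (person) carry different sets of attributes. The document content and element names are invented for the example; the parsing uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Two <person> items of the same type with different sets of attributes
# (all names and values here are made up for illustration).
doc = """
<people>
  <person><name>Asha</name><phone>555-0100</phone></person>
  <person>
    <name>Bilal</name>
    <email>bilal@example.org</email>
    <phone>555-0101</phone>
    <phone>555-0102</phone>
  </person>
</people>
""".strip()

root = ET.fromstring(doc)

# Collect the distinct child tags of each person.
attribute_sets = [
    sorted({child.tag for child in person}) for person in root.findall("person")
]
print(attribute_sets)  # [['name', 'phone'], ['email', 'name', 'phone']]
```

In a relational table, both rows would have to share one fixed set of columns.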
Database Languages
Database languages are used to create and maintain a database on a computer. Many database software packages, such as Oracle, MySQL,
MS Access, dBase, and FoxPro, support such languages.
• Data Manipulation Language (DML): It is a language that provides a set of operations to support the basic data manipulation operations on the data
held in the database. It allows users to insert, update, delete and retrieve data from the database. The part of DML that involves data retrieval is called a query language.
• The following table gives an overview of the usage of DML statements in SQL:
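As a rough illustration of the four DML operations (insert, update, delete, retrieve), the sketch below runs one SQL statement of each kind through Python's sqlite3 module. The student table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table creation is data definition, included here only as setup.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# INSERT: add new rows.
conn.execute("INSERT INTO student (id, name, age) VALUES (1, 'Ajeet', 24)")
conn.execute("INSERT INTO student (id, name, age) VALUES (2, 'Mahesh', 21)")

# UPDATE: change values in existing rows.
conn.execute("UPDATE student SET age = 25 WHERE id = 1")

# DELETE: remove rows.
conn.execute("DELETE FROM student WHERE id = 2")

# SELECT: retrieve data (the query-language part of DML).
remaining = conn.execute("SELECT id, name, age FROM student").fetchall()
print(remaining)  # [(1, 'Ajeet', 25)]
```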
Figure: ER diagram notation (labels include relationship and derived attribute)
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity can be represented as a rectangle.
Consider an organization as an example: manager, product, employee, department etc. can be taken as entities.
a. Weak Entity
An entity that depends on another entity is called a weak entity. The weak entity doesn't contain any key attribute of its own. The
weak entity is represented by a double rectangle.
Figure: weak entity example (instalment)
2. Attribute
The attribute is used to describe the property of an entity. An ellipse is used to represent an attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.
a. Key Attribute
The key attribute is used to represent the main characteristics of an entity. It represents a primary key.
b. Composite Attribute
An attribute that is composed of many other attributes is known as a composite attribute. The composite attribute is represented by an ellipse, and those ellipses are
connected with an ellipse.
c. Multivalued Attribute
An attribute can have more than one value. Such attributes are known as multivalued attributes. The double oval is used to represent a multivalued attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
A derived attribute is one whose value can be computed from another attribute.
For example, a person's age changes over time and can be derived from another attribute like date of birth.
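The age example for a derived attribute can be sketched in a few lines. The Student class and its field names below are hypothetical; the point is only that date of birth is stored while age is computed on demand.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Student:
    name: str
    date_of_birth: date  # stored attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        # Subtract one if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years


s = Student("Ajeet", date(2000, 9, 19))
age_day_before = s.age(date(2024, 9, 18))
age_on_birthday = s.age(date(2024, 9, 19))
print(age_day_before, age_on_birthday)  # 23 24
```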
3. Relationship
A relationship is used to describe the relation between entities. A diamond or rhombus is used to represent the relationship.
a. One-to-One Relationship
When only one instance of an entity is associated with the relationship, then it is known as a one-to-one relationship. For example, a female can marry
b. One-to-many relationship
When only one instance of the entity on the left, and more than one instance of an entity on the right associates with the relationship then this is known as a
one-to-many relationship.
For example, a scientist can invent many inventions, but each invention is made by only one specific scientist.
c. Many-to-one relationship
When more than one instance of the entity on the left, and only one instance of an entity on the right associates with the
relationship then it is known as a many-to-one relationship.
For example, a student enrolls for only one course, but a course can have many students.
d. Many-to-many relationship
When more than one instance of the entity on the left, and more than one instance of an entity on the right associates with the
relationship then it is known as a many-to-many relationship.
For example, an employee can be assigned to many projects, and a project can have many employees.
Relational Database Management System
A relational database consists of a collection of tables, each of which is assigned a unique name. For example, consider the instructor table of Figure 2.1,
which stores information about instructors. The table has four column headers: ID, name, dept_name, and salary. Each row of this table records
information about an instructor, consisting of the instructor’s ID, name, dept_name, and salary.
A row in the table can be thought of as representing the relationship between a specified ID and the corresponding values for name, dept_name, and
salary. In general, a row in a table represents a relationship among a set of values. Since a table is a collection of such relationships, there is a close
correspondence between the concept of table and the mathematical concept of relation, from which the relational data model takes its name. In
mathematical terminology, a tuple is simply a sequence (or list) of values. A relationship between n values is represented mathematically by an n-tuple of
values, i.e., a tuple with n values, which corresponds to a row in a table. Thus, in the relational model the term relation is used to refer to a table, while
the term tuple is used to refer to a row. Similarly, the term attribute refers to a column of a table.
We use the term relation instance to refer to a specific instance of a relation, i.e., containing a specific set of rows. The instance of instructor shown in
Figure 2.1 has 12 tuples, corresponding to 12 instructors.
For each attribute of a relation, there is a set of permitted values, called the domain of that attribute. Thus, the domain of the salary attribute of the
instructor relation is the set of all possible salary values, while the domain of the name attribute is the set of all possible instructor names.
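The vocabulary above (relation, attribute, tuple, relation instance) can be mapped onto a running table as follows. This is a sketch using Python's sqlite3 module; the two instructor rows are invented for illustration and are not the contents of Figure 2.1, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Relation "instructor" with the four attributes named in the text.
conn.execute(
    "CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)
# Hypothetical rows; each row is one tuple of the relation instance.
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [("I001", "Rao", "Physics", 70000), ("I002", "Lee", "Music", 40000)],
)

cur = conn.execute("SELECT * FROM instructor ORDER BY ID")
attributes = [d[0] for d in cur.description]  # the columns
tuples = cur.fetchall()                       # the rows of the current instance

print(attributes)   # ['ID', 'name', 'dept_name', 'salary']
print(len(tuples))  # 2
print(tuples[0])    # ('I001', 'Rao', 'Physics', 70000)
```

Each fetched row is literally a Python tuple with one value per attribute, which mirrors the n-tuple definition given above.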
What is a table
The RDBMS database uses tables to store data. A table is a collection of related data entries and contains rows and columns to store data.
Name     Age   Course
Ajeet    24    B.Tech
Mahesh   21    BCA
Ratan    22    MCA
What is a field
A field is a smaller entity of the table which contains specific information about every record in the table. In the above example, the fields in the student
table consist of id, name, age, course.
What is a row or record
A row of a table is also called a record. It contains the specific information of each individual entry in the table. It is a horizontal entity in the table. For example:
The above table contains 3 records.
1 Ajeet 24 B.Tech
What is a column
A column is a vertical entity in the table which contains all information associated with a specific field in a table. For example:
"name" is a column in the above table which contains all information about the students' names.
NULL Values: A NULL value in a table specifies that the field was left blank during record creation. It is different from a field that contains
zero or a field that contains spaces.
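The difference between NULL, zero, and an empty string can be checked directly. The sketch below uses Python's sqlite3 module with a hypothetical student table; in Python, None is stored as SQL NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, course TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "Ajeet", 24, "B.Tech"),
        (2, "Mahesh", None, ""),  # age is NULL; course is an empty string, not NULL
        (3, "Ratan", 0, None),    # age is zero, not NULL; course is NULL
    ],
)


def ids(where):
    return [row[0] for row in conn.execute(f"SELECT id FROM student WHERE {where} ORDER BY id")]


null_age = ids("age IS NULL")
zero_age = ids("age = 0")
null_course = ids("course IS NULL")
empty_course = ids("course = ''")
equals_null = ids("age = NULL")  # comparing with = never matches NULL

print(null_age, zero_age)         # [2] [3]
print(null_course, empty_course)  # [3] [2]
print(equals_null)                # []
```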
Data Integrity:
The following categories of data integrity exist with each RDBMS:
• Domain integrity: It enforces valid entries for a given column by restricting the type, the format, or the range of values.
• Referential integrity: It specifies that rows which are referenced by other records cannot be deleted.
• User-defined integrity: It enforces some specific business rules that are defined by users. These rules are different from entity, domain or
referential integrity.
Difference between DBMS and RDBMS
When we talk about a database, we must differentiate between the database schema, which is the logical design of the database, and the database
instance, which is a snapshot of the data in the database at a given instant in time.
Let us consider our university database example. Each course in a university may be offered multiple times, across different semesters, or even within a
semester. We need a relation to describe each individual offering, or section, of the class. The schema is
section (course_id, sec_id, semester, year, building, room_number, time_slot_id)
Figure shows a sample instance of the section relation.
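The schema/instance distinction can be seen by inspecting a table before and after data is added. The sketch below creates the section relation with the attributes listed above using Python's sqlite3 module; the column types, the choice of primary key, and the sample row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE section (
        course_id TEXT, sec_id TEXT, semester TEXT, year INTEGER,
        building TEXT, room_number TEXT, time_slot_id TEXT,
        PRIMARY KEY (course_id, sec_id, semester, year)
    )
    """
)


def schema(conn):
    # Column names as recorded by the database: the logical design.
    return [row[1] for row in conn.execute("PRAGMA table_info(section)")]


def instance(conn):
    # The rows present right now: a snapshot of the data.
    return conn.execute("SELECT * FROM section ORDER BY course_id, sec_id, year").fetchall()


schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO section VALUES ('C-101', '1', 'Fall', 2024, 'Main', '101', 'A')")
schema_after, instance_after = schema(conn), instance(conn)

print(schema_before == schema_after)              # True: the schema did not change
print(len(instance_before), len(instance_after))  # 0 1: the instance did
```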
Types of keys:
1. Primary key
• It is the first key which is used to identify one and only one instance of an entity uniquely. An entity can contain multiple
keys, as we saw in the PERSON table. The key which is most suitable from that list becomes the primary key.
• In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. In the EMPLOYEE table, we can even
select License_Number and Passport_Number as primary key since they are also unique.
• For each entity, selection of the primary key is based on requirements and developers.
2. Candidate key
• A candidate key is an attribute or set of attributes which can uniquely identify a tuple.
• Apart from the primary key, the remaining attributes that can uniquely identify a tuple are considered candidate keys. The candidate keys are as strong as the primary key.
For example: In the EMPLOYEE table, ID is best suited for the primary key. The rest of the attributes, like SSN, Passport_Number, and License_Number,
are considered candidate keys.
Figure: EMPLOYEE table with its candidate keys
3. Super key
• A super key is a set of attributes which can uniquely identify a tuple. A super key is a superset of a candidate key.
For example: In the above EMPLOYEE table, in (EMPLOYEE_ID, EMPLOYEE_NAME) the names of two employees can be the same,
but their EMPLOYEE_ID can't be the same. Hence, this combination can also be a key.
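The relationship between super keys and candidate keys can be explored with a small check over sample data. Everything below is a hypothetical sketch: the rows, the attribute names, and the helper functions are invented. Note also that a key is a property of the schema (it must hold for every possible instance), so a test on sample rows can only show that a set of attributes is consistent with being a key, not prove it.

```python
from itertools import combinations

# Hypothetical EMPLOYEE rows; two employees share the same name.
employees = [
    {"Employee_ID": 1, "Employee_Name": "Asha", "Passport_Number": "P100", "Department": "Sales"},
    {"Employee_ID": 2, "Employee_Name": "Asha", "Passport_Number": "P200", "Department": "Sales"},
    {"Employee_ID": 3, "Employee_Name": "Bilal", "Passport_Number": "P300", "Department": "HR"},
]


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Minimal super keys: super keys with no smaller super key inside them."""
    attrs = list(rows[0])
    found = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in found)
            if is_superkey(rows, combo) and not contains_smaller_key:
                found.append(combo)
    return found


name_only = is_superkey(employees, ("Employee_Name",))
id_and_name = is_superkey(employees, ("Employee_ID", "Employee_Name"))
keys = candidate_keys(employees)

print(name_only)    # False: two employees are both named Asha
print(id_and_name)  # True: a super key, but not minimal
print(keys)         # [('Employee_ID',), ('Passport_Number',)]
```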
4. Foreign key
• Foreign keys are the columns of a table which are used to point to the primary key of another table.
• In a company, every employee works in a specific department, and employee and department are two different entities. So
we can't store the information of the department in the employee table. That's why we link these two tables through the
primary key of one table.
• We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in the EMPLOYEE table.
• Now in the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related.
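The EMPLOYEE and DEPARTMENT link can be sketched as follows with Python's sqlite3 module. Column names other than Department_Id, and all sample values, are assumptions. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection; other systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for enforcement in SQLite

conn.execute("CREATE TABLE department (Department_Id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    "Employee_Id INTEGER PRIMARY KEY, name TEXT, "
    "Department_Id INTEGER REFERENCES department(Department_Id))"
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")


def rejected(sql):
    """Run a statement and report whether the database refused it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# An employee cannot point at a department that does not exist ...
bad_insert = rejected("INSERT INTO employee VALUES (2, 'Bilal', 99)")
# ... and a department that employees still refer to cannot be deleted.
bad_delete = rejected("DELETE FROM department WHERE Department_Id = 10")

print(bad_insert, bad_delete)  # True True
```

The second check is the referential-integrity rule in action: a row that other records depend on cannot be removed.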
Database Design for a University Organization
To illustrate the design process, let us examine how a database for a university could be designed. The initial specification of user requirements may be
based on interviews with the database users, and on the designer’s own analysis of the organization. The description that arises from this design phase
serves as the basis for specifying the conceptual structure of the database. Here are the major characteristics of the university.
• The university is organized into departments. Each department is identified by a unique name (dept_name), is located in a particular building, and has
a budget.
• Each department has a list of courses it offers. Each course has associated with it a course_id, title, dept_name, and credits, and may also have
associated prerequisites.
• Instructors are identified by their unique ID. Each instructor has a name, an associated department (dept_name), and a salary.
• Students are identified by their unique ID. Each student has a name, an associated major department (dept_name), and tot_cred (total credit hours the
student has earned thus far).
• The university maintains a list of classrooms, specifying the name of the building, room_number, and room capacity.
• The university maintains a list of all classes (sections) taught. Each section is identified by a course_id, sec_id, year, and semester, and has
associated with it a building, room_number, and time_slot_id (the time slot when the class meets).
• The department has a list of teaching assignments specifying, for each instructor, the sections the instructor is teaching.
• The university has a list of all student course registrations, specifying, for each student, the courses and the associated sections that the
student has taken (registered for).
A real university database would be much more complex than the preceding design. However, we use this simplified model to help you understand
conceptual ideas without getting lost in details of a complex design.
A database schema, along with primary key and foreign key dependencies, can be depicted by schema diagrams. Figure 2.8 shows the schema
diagram for our university organization. Each relation appears as a box, with the relation name at the top in blue, and the attributes listed inside the
box. Primary key attributes are shown underlined. Foreign key dependencies appear as arrows from the foreign key attributes of the referencing
relation to the primary key of the referenced relation.
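As a partial sketch of how these requirements could turn into table definitions, the code below creates the department, course, instructor, and student relations with Python's sqlite3 module. The attribute names come from the requirements above; the column types and the exact constraint syntax are assumptions, and the remaining relations (classroom, section, teaching assignments, registrations, prerequisites) are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript(
    """
    CREATE TABLE department (
        dept_name TEXT PRIMARY KEY,
        building  TEXT,
        budget    NUMERIC
    );
    CREATE TABLE course (
        course_id TEXT PRIMARY KEY,
        title     TEXT,
        dept_name TEXT REFERENCES department(dept_name),
        credits   INTEGER
    );
    CREATE TABLE instructor (
        ID        TEXT PRIMARY KEY,
        name      TEXT,
        dept_name TEXT REFERENCES department(dept_name),
        salary    NUMERIC
    );
    CREATE TABLE student (
        ID        TEXT PRIMARY KEY,
        name      TEXT,
        dept_name TEXT REFERENCES department(dept_name),
        tot_cred  INTEGER
    );
    """
)

tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
]
print(tables)  # ['course', 'department', 'instructor', 'student']
```

Each dept_name column in course, instructor, and student is a foreign key into department, which is what the arrows in a schema diagram would depict.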
A primary goal of a database system is to retrieve information from and store new information into the database. People who work with a database can
be categorized as database users or database administrators.
Database Users and User Interfaces: There are four different types of database-system users, differentiated by the way they expect to interact with the
system. Different types of user interfaces have been designed for the different types of users.
1. Naive Users:
• Naive Users are those users who need not be aware of the presence of the database system or any other system supporting their usage.
• Naive users are end users of the database who work through a menu driven application program, where the type and range of response is always
indicated to the user.
2. Online Users:
• Online users are those who may communicate with the database directly via an online terminal or indirectly via a user interface and application
program.
• These users are aware of the presence of the database system and may have acquired a certain amount of expertise within the limited interaction
permitted with a database.
3. Sophisticated Users:
• Such users interact with the system without writing programs.
• Instead, they form their requests in a database query language. Each query is submitted to a query processor, whose function is to break down DML statements
into instructions that the storage manager understands.
4. Application Programmers:
• Professional programmers are those who are responsible for developing application programs or user interface.
• The application programs could be written using general purpose programming language or the commands available to manipulate a database.
Database Administrator:
• The database administrator (DBA) is the person or group in charge of implementing the database system within an organization.
• The DBA has all the system privileges allowed by the DBMS and can assign (grant) and remove (revoke) levels of access (privileges) to and from
other users.
• The DBA is also responsible for the evaluation, selection and implementation of the DBMS package.