
Cs409 Notes

What is Data and Information?

Data are the raw bits and pieces of information with no context.

Data is anything that is not necessarily meaningful to a human being on its own.

Data are raw facts / unprocessed information.

If I told you, “45, 32, 41, 75,” you would not have learned anything.

Data can be quantitative or qualitative. Quantitative data is numeric, the result of a
measurement, count, or some other mathematical calculation. Qualitative data is
descriptive. “Dark brown,” the color of hair, is an example of qualitative data. A
number can be qualitative too: if I tell you my favorite number is 5, that is qualitative
data because it is descriptive, not the result of a measurement or mathematical
calculation.

Information

Data Processed to reveal its meaning

Information is meaningful

In today’s world, accurate, relevant, and timely information is the key to good
decision making.

Good decision making is key to survival in today’s competitive and global
environment.

By itself, data is not that useful. To be useful, it needs to be given context. Returning
to the example above, if I told you that “45, 32, 41, 75” are the numbers of students
that had registered for upcoming classes, that would be information. By adding the
context – that the numbers represent the count of students registering for specific
classes – I have converted data into information.

Once we have put our data into context, aggregated and analyzed it, we can use it to
make decisions for our organization. We can say that this consumption of information
produces knowledge. This knowledge can be used to make decisions, set policies, and
even spark innovation.
What is DBMS and its examples?

A Database Management System (DBMS) is software for storing and retrieving users’ data
while applying appropriate security measures. It consists of a group of programs that manipulate
the database. The DBMS accepts a request for data from an application and instructs the
operating system to provide the specific data. In large systems, a DBMS helps users and other
third-party software to store and retrieve data.

A DBMS allows users to create their own databases as per their requirements. The term
“database system” covers the database, its users, and the application programs; the DBMS
provides an interface between the data and the software applications.

Examples of DBMS

 Oracle
 IBM DB2
 Ingres
 Teradata
 MS SQL Server
 MS Access
 MySQL

2-Tier Database Architecture


In a two-tier architecture, the application logic is either buried inside the user interface on the
client or within the database on the server (or both). With two-tier client/server architectures, the
user system interface is usually located in the user’s desktop environment, and the database
management services are on a server, a more powerful machine that services many clients.

3-Tier Database Architecture


In a three-tier architecture, the application logic or process lives in the middle tier, separated
from the data and the user interface. Three-tier systems are more scalable, robust, and flexible.
In addition, they can integrate data from multiple sources. In the three-tier architecture, a middle
tier is added between the user system interface client environment and the database management
server environment. There are a variety of ways of implementing this middle tier, such as
transaction processing monitors, message servers, or application servers.

Forms

Forms are used for entering, modifying, and viewing records. You likely have had to fill out
forms on many occasions, like when visiting a doctor's office, applying for a job, or registering
for school. The reason forms are used so often is that they're an easy way to guide people toward
entering data correctly. When you enter information into a form in Access, the data goes exactly
where the database designer wants it to go in one or more related tables.

Forms make entering data easier. Working with extensive tables can be confusing, and when you
have connected tables, you might need to work with more than one at a time to enter a set of
data. However, with forms it's possible to enter data into multiple tables at once, all in one place.
Database designers can even set restrictions on individual form components to ensure all of the
needed data is entered in the correct format. All in all, forms help keep data consistent and
organized, which is essential for an accurate and powerful database.

Reports

Reports offer you the ability to present your data in print. If you've ever received a computer
printout of a class schedule or a printed invoice of a purchase, you've seen a database report.
Reports are useful because they allow you to present components of your database in an easy-to-
read format. You can even customize a report's appearance to make it visually appealing. Access
offers you the ability to create a report from any table or query.

Queries

Queries are a way of searching for and compiling data from one or more tables. Running a
query is like asking a detailed question of your database. When you build a query in Access, you
are defining specific search conditions to find exactly the data you want.

Queries are far more powerful than the simple searches you might carry out within a table. While
a search would be able to help you find the name of one customer at your business, you could
run a query to find the name and phone number of every customer who's made a purchase within
the past week. A well-designed query can give information you might not be able to find just by
looking through the data in your tables.
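As a sketch of that last example in SQL (the CUSTOMER and PURCHASE tables and their
columns are invented for illustration; Access would build an equivalent query through its query
designer):

-- Name and phone of every customer with a purchase in the past week
-- (date arithmetic syntax varies by DBMS)
SELECT DISTINCT C.Name, C.Phone
FROM CUSTOMER C
JOIN PURCHASE P ON P.CustID = C.CustID
WHERE P.PurchaseDate >= CURRENT_DATE - INTERVAL '7' DAY;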

DBMS Languages

A DBMS is a software package that enables users to define, create, maintain, and control access
to the database. Its language facilities include:

Data Definition Language (DDL)

Data Manipulation Language (DML)

Data Control Language (DCL):

Security, integrity, concurrent access, recovery, support for data communication, etc.

Utility services:

File import/export, monitoring facilities, code generator, report writer, etc.

Support for ad hoc queries
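As a quick illustration of these language categories in generic SQL (the STUDENT table and the
user clerk_user are invented for this sketch):

-- DDL: defines database structure
CREATE TABLE STUDENT (
    RollNo INT PRIMARY KEY,
    Name   VARCHAR(50) NOT NULL
);

-- DML: inserts, updates, deletes, and queries data
INSERT INTO STUDENT (RollNo, Name) VALUES (1, 'Ali');
UPDATE STUDENT SET Name = 'Ali Khan' WHERE RollNo = 1;

-- DCL: grants or revokes access rights
GRANT SELECT ON STUDENT TO clerk_user;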

Roles/ Jobs in the Database Environment

 Data Administrator (DA)
 Database Administrator (DBA)
 Database Designers (Logical and Physical)
 Applications Programmer
 Database Developer
 Database Analyst
 End users (Naive and sophisticated)

The final step up the information ladder is the step from knowledge to wisdom. We
can say that someone has wisdom when they can combine their knowledge and
experience to produce a deeper understanding of a topic. It often takes many years to
develop wisdom on a particular topic and requires patience.

What is database?

A database is a collection of related data. By data, we mean known facts that can be
recorded and that have implicit meaning. For example, consider the names, telephone
numbers, and addresses of the people you know. You may have recorded this data in
an indexed address book or you may have stored it on a hard drive, using a personal
computer and software such as Microsoft Access or Excel. This collection of related
data with an implicit meaning is a database.

Interaction of user with database

Any user can interact with the database, for example application programmers and end users.

As the name shows, application programmers are the ones who write application programs that
use the database. These application programs are written in programming languages such as
COBOL, PL/1 (Programming Language 1), Java, or a fourth-generation language, and are built
to meet user requirements. Retrieving information, creating new information, and changing
existing information are all done by these application programs. They interact with the DBMS
through DML (Data Manipulation Language) calls, and all these functions are performed by
generating requests to the DBMS. Without application programmers, there would be no
creativity in the whole database team.

End users are those who access the database from the terminal end. They use the developed
applications, and they don’t need any knowledge about the design and working of the database.
These are the second class of users, and their main goal is simply to get their task done.

Form processing and report processing applications are built using VB, .NET, or PHP
programming. Reports can be developed using a tool such as Crystal Reports.

Query processing can be managed using the vendor’s SQL tool or third-party tools such as
TOAD or SQL Developer.

Multiuser DBMS Architecture

 Teleprocessing

One computer with a single CPU and a number of terminals.

Processing is performed within the same physical computer. User terminals are
typically “dumb”, incapable of functioning on their own, and cabled to the central
computer.

 File-Server

In a file-server environment, the processing is distributed about the network,
typically a local area network (LAN).

The file-server is connected to several workstations across a network.

The database resides on the file-server.

The DBMS and applications run on each workstation.

 Client-Server (2-tier)

In relational database management systems (RDBMSs), many of which started as
centralized systems, the system components that were first moved to the client
side were the user interface and application programs. Because SQL provided a
standard language for RDBMSs, this created a logical dividing point between
client and server. Hence, the query and transaction functionality related to SQL
processing remained on the server side. In such an architecture, the server is
often called a query server or transaction server because it provides these two
functionalities. In an RDBMS, the server is also often called an SQL server.

The user interface programs and application programs can run on the client side.
When DBMS access is required, the program establishes a connection to the
DBMS (which is on the server side); once the connection is created, the client
program can communicate with the DBMS. A standard called Open Database
Connectivity (ODBC) provides an application programming interface (API),
which allows client-side programs to call the DBMS, as long as both client and
server machines have the necessary software installed. Most DBMS vendors
provide ODBC drivers for their systems. A client program can actually connect to
several RDBMSs and send query and transaction requests using the ODBC API,
which are then processed at the server sites. Any query results are sent back to the
client program, which can process and display the results as needed. A related
standard for the Java programming language, called JDBC, has also been defined.
This allows Java client programs to access one or more DBMSs through a
standard interface.

A different approach to two-tier client/server architecture was taken by some
object-oriented DBMSs, where the software modules of the DBMS were divided
between client and server in a more integrated way. For example, the server level
may include the part of the DBMS software responsible for handling data storage
on disk pages, local concurrency control and recovery, buffering and caching of
disk pages, and other such functions. Meanwhile, the client level may handle the
user interface; data dictionary functions; DBMS interactions with programming
language compilers; global query optimization, concurrency control, and recovery
across multiple servers; structuring of complex objects from the data in the
buffers; and other such functions. In this approach, the client/server interaction is
more tightly coupled and is done internally by the DBMS modules, some of which
reside on the client and some on the server, rather than by the users/programmers.
The exact division of functionality can vary from system to system. In such a
client/server architecture, the server has been called a data server because it
provides data in disk pages to the client. This data can then be structured into
objects for the client programs by the client-side DBMS software.

The architectures described here are called two-tier architectures because the
software components are distributed over two systems: client and server. The
advantages of this architecture are its simplicity and seamless compatibility with
existing systems. The emergence of the Web changed the roles of clients and
servers, leading to the three-tier architecture.

 Client-Server (3-tier)

Many Web applications use an architecture called the three-tier architecture, which
adds an intermediate layer between the client and the database server. This
intermediate layer or middle tier is called the application server or the Web
server, depending on the application. This server plays an intermediary role by
running application programs and storing business rules (procedures or
constraints) that are used to access data from the database server. It can also
improve database security by checking a client’s credentials before forwarding a
request to the database server. Clients contain GUI interfaces and some additional
application-specific business rules. The intermediate server accepts requests from
the client, processes the request and sends database queries and commands to the
database server, and then acts as a conduit for passing (partially) processed data
from the database server to the clients, where it may be processed further and
filtered to be presented to users in GUI format. Thus, the user interface,
application rules, and data access act as the three tiers. Figure 2.7(b) shows
another architecture used by database and other application package vendors. The
presentation layer displays information to the user and allows data entry. The
business logic layer handles intermediate rules and constraints before data is
passed up to the user or down to the DBMS. The bottom layer includes all data
management services. The middle layer can also act as a Web server, which
retrieves query results from the database server and formats them into dynamic
Web pages that are viewed by the Web browser at the client side.

Other architectures have also been proposed. It is possible to divide the layers
between the user and the stored data further into finer components, thereby giving
rise to n-tier architectures, where n may be four or five tiers. Typically, the
business logic layer is divided into multiple layers. Besides distributing
programming and data throughout a network, n-tier applications afford the
advantage that any one tier can run on an appropriate processor or operating
system platform and can be handled independently. Vendors of ERP (enterprise
resource planning) and CRM (customer relationship management) packages often
use a middleware layer, which accounts for the front-end modules (clients)
communicating with a number of back-end databases (servers).

Paradigm of Database System

 Logical/Conceptual Database Design
o How should the transactional data be modeled (OLTP)?
o Designing the data warehouse schema (OLAP)
 Query Processing
o Analysis queries are hard to answer efficiently
o What techniques are available to the DBMS?
 Physical Database Design (DBA)
o How should the data be organized on disk?
o What data structures should be used?
 Data Mining
o What use is all this data?
o Which questions should we ask our data warehouse (OLAP)?
 Data Science/Big Data

A conceptual data model is a model that helps to identify the highest-level relationships between
the different entities, while a logical data model describes the data in as much detail as possible,
without regard to how it will be physically implemented in the database.

Query processing is the translation of high-level queries into low-level expressions. It is a
step-wise process covering translation to the physical level of the file system, query
optimization, and actual execution of the query to get the result. It requires the basic concepts of
relational algebra and file structure. It refers to the range of activities involved in extracting data
from the database, including translation of queries in high-level database languages into
expressions that can be implemented at the physical level of the file system. In query processing,
we will see how these queries are processed and how they are optimized.

Physical database design is the process of transforming logical data models into physical data
models. An experienced database designer will make a physical database design in parallel with
conceptual data modeling if they know the type of database technology that will be used.

DBMS Three Levels Architecture

In this architecture, schemas can be defined at the following three levels:

1. The internal level has an internal schema, which describes the physical storage structure of
the database. The internal schema uses a physical data model and describes the complete details
of data storage and access paths for the database.

2. The conceptual level has a conceptual schema, which describes the structure of the whole
database for a community of users. The conceptual schema hides the details of physical storage
structures and concentrates on describing entities, data types, relationships, user operations, and
constraints. Usually, a representational data model is used to describe the conceptual schema
when a database system is implemented. This implementation conceptual schema is often based
on a conceptual schema design in a high-level data model.

3. The external or view level includes a number of external schemas or user views. Each
external schema describes the part of the database that a particular user group is interested in and
hides the rest of the database from that user group. As in the previous level, each external
schema is typically implemented using a representational data model, possibly based on an
external schema design in a high-level data model.

The three-schema architecture is a convenient tool with which the user can visualize the schema
levels in a database system. Most DBMSs do not separate the three levels completely and
explicitly, but support the three-schema architecture to some extent. Some older DBMSs may
include physical-level details in the conceptual schema.

The three-level ANSI architecture has an important place in database technology development
because it clearly separates the users’ external level, the database’s conceptual level, and the
internal storage level for designing a database. It is very much applicable in the design of
DBMSs, even today. In most DBMSs that support user views, external schemas are specified in
the same data model that describes the conceptual-level information (for example, a relational
DBMS like Oracle uses SQL for this). Some DBMSs allow different data models to be used at
the conceptual and external levels. An example is Universal Data Base (UDB), a DBMS from
IBM, which uses the relational model to describe the conceptual schema, but may use an
object-oriented model to describe an external schema.

Notice that the three schemas are only descriptions of data; the stored data that actually exists is
at the physical level only. In a DBMS based on the three-schema architecture, each user group
refers to its own external schema. Hence, the DBMS must transform a request specified on an
external schema into a request against the conceptual schema, and then into a request on the
internal schema for processing over the stored database. If the request is a database retrieval, the
data extracted from the stored database must be reformatted to match the user’s external view.
The processes of transforming requests and results between levels are called mappings. These
mappings may be time-consuming, so some DBMSs, especially those that are meant to support
small databases, do not support external views. Even in such systems, however, a certain amount
of mapping is necessary to transform requests between the conceptual and internal levels.
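To make the levels concrete, here is a minimal SQL sketch of a conceptual-level relation and an
external schema (view) defined over it; the table and columns are illustrative assumptions:

-- Conceptual level: the full EMPLOYEE relation
CREATE TABLE EMPLOYEE (
    Ssn    CHAR(9) PRIMARY KEY,
    Name   VARCHAR(50),
    Dno    INT,
    Salary DECIMAL(10,2)
);

-- External level: a view for a user group that must not see salaries
CREATE VIEW EMP_PUBLIC AS
    SELECT Ssn, Name, Dno
    FROM EMPLOYEE;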

Data Independence

The three-schema architecture can be used to further explain the concept of data independence,
which can be defined as the capacity to change the schema at one level of a database system
without having to change the schema at the next higher level. We can define two types of data
independence:

Logical data independence is the capacity to change the conceptual schema without having to
change external schemas or application programs. We may change the conceptual schema to
expand the database (by adding a record type or data item), to change constraints, or to reduce
the database (by removing a record type or data item).

Physical data independence is the capacity to change the internal schema without having to
change the conceptual schema. Hence, the external schemas need not be changed as well.
Changes to the internal schema may be needed because some physical files were reorganized,
for example, by creating additional access structures, to improve the performance of retrieval or
update. If the same data as before remains in the database, we should not have to change the
conceptual schema.
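A small illustration of physical data independence, reusing the assumed EMPLOYEE table from
the sketch above:

-- Internal-level change only: add an access structure to speed up retrieval.
-- The conceptual schema (the EMPLOYEE definition) and all queries against it
-- remain unchanged.
CREATE INDEX emp_dno_idx ON EMPLOYEE (Dno);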

Generally, physical data independence exists in most databases and file environments where
physical details, such as the exact location of data on disk and hardware details of storage
encoding, placement, compression, splitting, and merging of records, are hidden from the user.
Applications remain unaware of these details. On the other hand, logical data independence is
harder to achieve because it allows structural and constraint changes without affecting
application programs, a much stricter requirement.

Whenever we have a multiple-level DBMS, its catalog must be expanded to include information
on how to map requests and data among the various levels. The DBMS uses additional software
to accomplish these mappings by referring to the mapping information in the catalog. Data
independence occurs because when the schema is changed at some level, the schema at the next
higher level remains unchanged; only the mapping between the two levels is changed. Hence,
application programs referring to the higher-level schema need not be changed.

The three-schema architecture can make it easier to achieve true data independence, both
physical and logical. However, the two levels of mappings create an overhead during
compilation or execution of a query or program, leading to inefficiencies in the DBMS. Because
of this, few DBMSs have implemented the full three-schema architecture.

Database schema

A database schema is a blueprint or architecture of how our data will look. It doesn’t hold data
itself, but instead describes the shape of the data and how it might relate to other tables or
models. An entry in our database will be an instance of the database schema. It will contain all of
the properties described in the schema.

Schema types

There are two main database schema types that define different parts of the schema: logical and
physical.
A logical database schema represents how the data is organized in terms of tables. It also
explains how attributes from tables are linked together. Different schemas use a different syntax
to define the logical architecture and constraints.

To create a logical database schema, we use tools to illustrate relationships between the
components of the data. This is called entity-relationship modeling (ER modeling). It specifies
what the relationships between entity types are.

The physical database schema represents how data is stored on disk storage. In other words, it
is the actual code that will be used to create the structure of your database. In MongoDB with
mongoose, for instance, this will take the form of a mongoose model. In MySQL, you will use
SQL to construct a database with tables.
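For example, a minimal physical schema in MySQL-style SQL might look like this (the table
and columns are invented for illustration):

-- Physical schema: the actual code that creates the structure
CREATE TABLE customer (
    cust_id INT AUTO_INCREMENT PRIMARY KEY,
    cname   VARCHAR(60) NOT NULL,
    city    VARCHAR(40),
    telno   VARCHAR(20)
);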

Schema objects

A schema is a collection of schema objects. Examples of schema objects include tables, views,
sequences, synonyms, indexes, clusters, database links, procedures, and packages. This chapter
explains tables, views, sequences, synonyms, indexes, and clusters.

Schema objects are logical data storage structures. Schema objects do not have a one-to-one
correspondence to physical files on disk that store their information. However, Oracle stores a
schema object logically within a tablespace of the database. The data of each object is physically
contained in one or more of the tablespace's datafiles. For some objects such as tables, indexes,
and clusters, you can specify how much disk space Oracle allocates for the object within the
tablespace's datafiles.

Who is DBA?

 A database administrator (DBA) is a person or group of persons who maintains a
successful database environment by directing or performing all related activities to keep
the data secure. The top responsibility of a DBA professional is to maintain data integrity.
This means the DBA will ensure that data is secure from unauthorized access but is
available to users.
 DBA job requires a high level of expertise by a person or group of persons.
 It is rare that a single person can manage all the database system activities, so companies
usually have a group of people who take care of the database system.

DBA & Databases

 Each database requires at least one database administrator (DBA).
 A database system can be large and can have many users.
 Database administration is sometimes not a one-person job, but a job for a group of
DBAs who share responsibility.

Network Administrator
 A network administrator is a person designated in an organization whose responsibility
includes maintaining computer infrastructures with emphasis on local area networks
(LANs) up to wide area networks (WANs). Responsibilities may vary between
organizations, but installing new hardware, on-site servers, enforcing licensing
agreements, software-network interactions, as well as network integrity/resilience, are
some of the key areas of focus.
 The network administrator coordinates with the DBA on database connections and other
issues such as storage, OS, and hardware.
 Some sites have one or more network administrators. A network administrator, for
example, administers Oracle networking products, such as Oracle Net Services.

Application Developers

 Designing and developing the database application
 Designing the database structure for an application
 Estimating storage requirements for an application
 Specifying modifications of the database structure for an application
 Relaying this information to a database administrator
 Tuning the application during development
 Establishing security measures for an application during development
 Database Server Programming using Oracle PL/SQL

DBA’s Tasks

 Task 1: Evaluate the Database Server Hardware
 Task 2: Install the Oracle Database Software
 Task 3: Plan the Database
 Task 4: Create and Open the Database
 Task 5: Back Up the Database
 Task 6: Enroll System Users
 Task 7: Implement the Database Design
 Task 8: Back Up the Fully Functional Database
 Task 9: Tune Database Performance
 Task 10: Download and Install Patches
 Task 11: Roll Out to Additional Hosts

DBA’s Responsibilities

 Installing and upgrading the Oracle Database server and application tools
 Allocating system storage and planning future storage requirements for the database
system
 Creating primary database storage structures (tablespaces) after application developers
have designed an application
 Creating primary objects (tables, views, indexes) once application developers have
designed an application
 Modifying the database structure, as necessary, from information given by application
developers
 Enrolling users and maintaining system security
 Ensuring compliance with Oracle license agreements
 Controlling and monitoring user access to the database
 Monitoring and optimizing the performance of the database
 Planning for backup and recovery of database information
 Maintaining archived data on tape
 Backing up and restoring the database
 Contacting Oracle for technical support
Physical Database Design

Physical database design is the process of transforming logical data models into physical data
models. An experienced database designer will make a physical database design in parallel with
conceptual data modeling if they know the type of database technology that will be used.

Purposes

Meeting the expectations of the database designer, the following are the two main purposes of
physical database design for a DBA:

 Managing the storage structure for the database or DBMS
 Performance & tuning

Factor (A): Analyzing the database queries and transactions

Before undertaking the physical database design, we must have a good idea of the intended use
of the database by defining in a high-level form the queries and transactions that are expected to
run on the database. For each retrieval query, the following information about the query would
be needed:

1. The files that will be accessed by the query.
2. The attributes on which any selection conditions for the query are specified.
3. Whether the selection condition is an equality, inequality, or a range condition.
4. The attributes on which any join conditions, or conditions to link multiple tables or objects
for the query, are specified.
5. The attributes whose values will be retrieved by the query.

The attributes listed in items 2 and 4 above are candidates for the definition of access structures,
such as indexes, hash keys, or sorting of the file.

For each update operation or update transaction, the following information would be needed:

1. The files that will be updated.
2. The type of operation on each file (insert, update, or delete).
3. The attributes on which selection conditions for a delete or update are specified.
4. The attributes whose values will be changed by an update operation.

Again, the attributes listed in item 3 are candidates for access structures on the files, because
they would be used to locate the records that will be updated or deleted. On the other hand, the
attributes listed in item 4 are candidates for avoiding an access structure, since modifying them
will require updating the access structures.

Factor (B): Frequency of queries and transactions

Besides identifying the characteristics of expected retrieval queries and update transactions, we
must consider their expected rates of invocation. This frequency information, along with the
attribute information collected on each query and transaction, is used to compile a cumulative
list of the expected frequency of use for all queries and transactions. This is expressed as the
expected frequency of using each attribute in each file as a selection attribute or a join attribute,
over all the queries and transactions. Generally, for large volumes of processing, the informal
80–20 rule can be used: approximately 80 percent of the processing is accounted for by only 20
percent of the queries and transactions. Therefore, in practical situations, it is rarely necessary to
collect exhaustive statistics and invocation rates on all the queries and transactions; it is
sufficient to determine the 20 percent or so most important ones.

Factor (C): Time constraints of queries and transactions

Some queries and transactions may have stringent performance constraints. For example, a
transaction may have the constraint that it should terminate within 5 seconds on 95 percent of
the occasions when it is invoked, and that it should never take more than 20 seconds. Such
timing constraints place further priorities on the attributes that are candidates for access paths.
The selection attributes used by queries and transactions with time constraints become
higher-priority candidates for primary access structures for the files, because the primary access
structures are generally the most efficient for locating records in a file.

Factor (D): Expected frequencies of update operations

A minimum number of access paths should be specified for a file that is frequently updated,
because updating the access paths themselves slows down the update operations. For example, if
a file that has frequent record insertions has 10 indexes on 10 different attributes, each of these
indexes must be updated whenever a new record is inserted. The overhead for updating 10
indexes can slow down the insert operations.

Factor (E): Uniqueness constraints on attributes

Access paths should be specified on all candidate key attributes, or sets of attributes, that are
either the primary key of a file or unique attributes. The existence of an index (or other access
path) makes it sufficient to only search the index when checking this uniqueness constraint,
since all values of the attribute will exist in the leaf nodes of the index. For example, when
inserting a new record, if a key attribute value of the new record already exists in the index, the
insertion of the new record should be rejected, since it would violate the uniqueness constraint
on the attribute.

Once the preceding information is compiled, it is possible to address the physical database
design decisions, which consist mainly of deciding on the storage structures and access paths for
the database files.

Design Decisions about Indexing

The attributes whose values are required in equality or range conditions (selection operation),
and those that are keys or that participate in join conditions (join operation), require access
paths, such as indexes.

The performance of queries largely depends upon what indexes or hashing schemes exist to
expedite the processing of selections and joins. On the other hand, during insert, delete, or
update operations, the existence of indexes adds to the overhead. This overhead must be
justified in terms of the gain in efficiency by expediting queries and transactions. The physical
design decisions for indexing fall into the following categories:

1. Whether to index an attribute. The general rules for creating an index on an attribute are
that the attribute must either be a key (unique), or there must be some query that uses that
attribute either in a selection condition (equality or range of values) or in a join condition. One
reason for creating multiple indexes is that some operations can be processed by just scanning
the indexes, without having to access the actual data file.

2. What attribute or attributes to index on. An index can be constructed on a single attribute,
or on more than one attribute if it is a composite index. If multiple attributes from one relation
are involved together in several queries (for example, (Garment_style_#, Color) in a garment
inventory database), a multiattribute (composite) index is warranted. The ordering of attributes
within a multiattribute index must correspond to the queries. For instance, the above index
assumes that queries would be based on an ordering of colors within a Garment_style_# rather
than vice versa.

3. Whether to set up a clustered index. At most one index per table can be a primary or
clustering index, because this implies that the file be physically ordered on that attribute. In most
RDBMSs, this is specified by the keyword CLUSTER. (If the attribute is a key, a primary index
is created, whereas a clustering index is created if the attribute is not a key.) If a table requires
several indexes, the decision about which one should be the primary or clustering index depends
upon whether keeping the table ordered on that attribute is needed. Range queries benefit a great
deal from clustering. If several attributes require range queries, relative benefits must be
evaluated before deciding which attribute to cluster on. If a query is to be answered by doing an
index search only (without retrieving data records), the corresponding index should not be
clustered, since the main benefit of clustering is achieved when retrieving the records
themselves. A clustering index may be set up as a multiattribute index if range retrieval by that
composite key is useful in report creation (for example, an index on Zip_code, Store_id, and
Product_id may be a clustering index for sales data).

4. Whether to use a hash index over a tree index. In general, RDBMSs use B+-trees for
indexing. However, ISAM and hash indexes are also provided in some systems (see Chapter 18).
B+-trees support both equality and range queries on the attribute used as the search key. Hash
indexes work well with equality conditions, particularly during joins to find a matching
record(s), but they do not support range queries.

5. Whether to use dynamic hashing for the file. Dynamic hashing is suited to files that are
very volatile, that is, those that grow and shrink continuously.
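A hedged sketch of decisions 2 and 3 in SQL (Garment_style_no stands in for the text's
Garment_style_#, since '#' is not portable in identifiers; clustering syntax varies by DBMS):

-- Decision 2: a composite index ordered to match the common query pattern
CREATE INDEX garment_style_color_idx
    ON GARMENT (Garment_style_no, Color);

-- Decision 3: a clustering index on the sales report key.
-- SQL Server spelling shown; other systems use CLUSTER or table options.
CREATE CLUSTERED INDEX sales_cluster_idx
    ON SALES (Zip_code, Store_id, Product_id);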

Database Tuning and Goal

The process of continuing to revise/adjust the physical database design by monitoring resource
utilization as well as internal DBMS processing to reveal bottlenecks such as contention for the
same data or devices.

The goals of tuning are as follows:

 To make applications run faster.
 To improve (lower) the response time of queries and transactions.
 To improve the overall throughput of transactions.
Tuning Indexes

The initial choice of indexes may have to be revised for the following reasons:

 Certain queries may take too long to run for lack of an index.
 Certain indexes may not get utilized at all.
 Certain indexes may undergo too much updating because the index is on an attribute that
undergoes frequent changes.

Most DBMSs have a command or trace facility, which can be used by the DBA to ask the
system to show how a query was executed: what operations were performed in what order and
what secondary access structures (indexes) were used. By analyzing these execution plans, it is
possible to diagnose the causes of the above problems. Some indexes may be dropped and some
new indexes may be created based on the tuning analysis.
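For instance, in MySQL or PostgreSQL the facility is spelled EXPLAIN (Oracle uses EXPLAIN
PLAN FOR); the CUSTOMER table here is the assumed one from the earlier sketch:

-- Ask the optimizer how it would execute a query, without running it
EXPLAIN
SELECT cname, telno
FROM customer
WHERE city = 'Lahore';
-- The plan shows whether an index on city is used or a full scan is performed.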

The goal of tuning is to dynamically evaluate the requirements, which sometimes fluctuate
seasonally or during different times of the month or week, and to reorganize the indexes and file
organizations to yield the best overall performance. Dropping and building new indexes is an
overhead that can be justified in terms of performance improvements. Updating of a table is
generally suspended while an index is dropped or created; this loss of service must be accounted
for. Besides dropping or creating indexes and changing from a nonclustered to a clustered index
and vice versa, rebuilding the index may improve performance. Most RDBMSs use B+-trees for
an index. If there are many deletions on the index key, index pages may contain wasted space,
which can be reclaimed during a rebuild operation. Similarly, too many insertions may cause
overflows in a clustered index that affect performance. Rebuilding a clustered index amounts to
reorganizing the entire table ordered on that key.

The available options for indexing, and the way they are defined, created, and reorganized, vary
from system to system. As an illustration, consider sparse and dense indexes. A sparse index,
such as a primary index, will have one index pointer for each page (disk block) in the data file; a
dense index, such as a unique secondary index, will have an index pointer for each record.
Sybase provides clustering indexes as sparse indexes in the form of B+-trees, whereas INGRES
provides sparse clustering indexes as ISAM files and dense clustering indexes as B+-trees. In
some versions of Oracle and DB2, the option of setting up a clustering index is limited to a dense
index (with many more index entries), and the DBA has to work with this limitation.

Overview of Database Tuning

Statistics obtained from monitoring:

Storage statistics: data about allocation of storage into tablespaces, index spaces, and buffer
pools.

I/O and device performance statistics: total read/write activity (paging) on disk extents and
disk hot spots.

Query/transaction processing statistics: execution times of queries and transactions, and
optimization times during query optimization.

Locking/logging related statistics: rates of issuing different types of locks, transaction
throughput rates, and log record activity.

Index statistics: number of levels in an index, number of noncontiguous leaf pages, and so on.

Statistics internally collected in DBMSs:

Size of individual tables

Number of distinct values in a column

The number of times a particular query or transaction is submitted/executed in an interval of
time

The times required for different phases of query and transaction processing

Problems in Tuning

Tuning a database involves dealing with the following types of problems:

 How to avoid excessive lock contention, thereby increasing concurrency among
transactions.
 How to minimize the overhead of logging and unnecessary dumping of data.
 How to optimize the buffer size and scheduling of processes.
 How to allocate resources such as disks, RAM, and processes for most efficient
utilization.

Most of the previously mentioned problems can be solved by the DBA by setting appropriate
physical DBMS parameters, changing configurations of devices, changing operating system
parameters, and other similar activities. The solutions tend to be closely tied to specific systems.
DBAs are typically trained to handle these tuning problems for the specific DBMS.

Changes in Database Design

■ Existing tables may be joined (denormalized) because certain attributes from two or more
tables are frequently needed together: this reduces the normalization level from BCNF to 3NF,
2NF, or 1NF.

■ For a given set of tables, there may be alternative design choices, all of which achieve 3NF or
BCNF. We illustrated alternative equivalent designs earlier. One normalized design may be
replaced by another.

■ A relation of the form R(K, A, B, C, D, ...), with K as a set of key attributes, that is in BCNF
can be stored in multiple tables that are also in BCNF, for example, R1(K, A, B), R2(K, C, D),
R3(K, ...), by replicating the key K in each table. Such a process is known as vertical
partitioning. Each table groups sets of attributes that are accessed together. For example, the
table EMPLOYEE(Ssn, Name, Phone, Grade, Salary) may be split into two tables: EMP1(Ssn,
Name, Phone) and EMP2(Ssn, Grade, Salary). If the original table has a large number of rows
(say 100,000) and queries about phone numbers and salary information are totally distinct and
occur with very different frequencies, then this separation of tables may work better. A sketch
follows this list.

■ Attribute(s) from one table may be repeated in another even though this creates redundancy
and a potential anomaly. For example, Part_name may be replicated in tables wherever the
Part# appears (as a foreign key), but there may be one master table called
PART_MASTER(Part#, Part_name, ...) where the Part_name is guaranteed to be up to date.

■ Just as vertical partitioning splits a table vertically into multiple tables, horizontal
partitioning takes horizontal slices of a table and stores them as distinct tables. For example,
product sales data may be separated into ten tables based on ten product lines. Each table has the
same set of columns (attributes) but contains a distinct set of products (tuples). If a query or
transaction applies to all product data, it may have to run against all the tables and the results
may have to be combined.
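The sketch promised above, for the vertical partitioning example (column types are assumed):

-- EMPLOYEE(Ssn, Name, Phone, Grade, Salary) split vertically,
-- replicating the key Ssn in each piece
CREATE TABLE EMP1 (
    Ssn   CHAR(9) PRIMARY KEY,
    Name  VARCHAR(50),
    Phone VARCHAR(20)
);

CREATE TABLE EMP2 (
    Ssn    CHAR(9) PRIMARY KEY,
    Grade  INT,
    Salary DECIMAL(10,2)
);

-- When both halves are needed, rejoin on the replicated key:
-- SELECT * FROM EMP1 JOIN EMP2 ON EMP1.Ssn = EMP2.Ssn;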

Tuning Queries

 Indications for tuning queries
o A query issues too many disk accesses.
o The query plan shows that relevant indexes are not being used.

Some typical instances of situations prompting query tuning include the following:

1. Many query optimizers do not use indexes in the presence of arithmetic expressions (such as
Salary/365 > 10.50), numerical comparisons of attributes of different sizes and precision (such
as Aqty = Bqty where Aqty is of type INTEGER and Bqty is of type SMALLINTEGER), NULL
comparisons (such as Bdate IS NULL), and substring comparisons (such as Lname LIKE
‘%mann’).

2. Indexes are often not used for nested queries using IN; for example, the following query:

SELECT Ssn FROM EMPLOYEE
WHERE Dno IN ( SELECT Dnumber FROM DEPARTMENT
               WHERE Mgr_ssn = ‘333445555’ );

may not use the index on Dno in EMPLOYEE, whereas using Dno = Dnumber in the
WHERE-clause of a single-block query may cause the index to be used.
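The unnested (single-block) form the text refers to would look roughly like this:

-- Rewritten as a single-block join, which lets the optimizer
-- use the index on EMPLOYEE.Dno
SELECT E.Ssn
FROM EMPLOYEE E, DEPARTMENT D
WHERE E.Dno = D.Dnumber
  AND D.Mgr_ssn = '333445555';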

3. Some DISTINCTs may be redundant and can be avoided without changing the result. A
DISTINCT often causes a sort operation and must be avoided as much as possible.

4. Unnecessary use of temporary result tables can be avoided by collapsing multiple queries
into a single query, unless the temporary relation is needed for some intermediate processing.

5. In some situations involving the use of correlated queries, temporaries are useful. Consider
the following query, which retrieves the highest-paid employee in each department:

SELECT Ssn
FROM EMPLOYEE E
WHERE Salary = ( SELECT MAX (Salary)
                 FROM EMPLOYEE AS M
                 WHERE M.Dno = E.Dno );

This has the potential danger of searching all of the inner EMPLOYEE table M for each tuple
from the outer EMPLOYEE table E. To make the execution more efficient, the process can be
broken into two queries, where the first query just computes the maximum salary in each
department, as follows:

SELECT MAX (Salary) AS High_salary, Dno INTO TEMP
FROM EMPLOYEE
GROUP BY Dno;

SELECT EMPLOYEE.Ssn
FROM EMPLOYEE, TEMP
WHERE EMPLOYEE.Salary = TEMP.High_salary
AND EMPLOYEE.Dno = TEMP.Dno;

6. If multiple options for a join condition are possible, choose one that uses a clustering index
and avoid those that contain string comparisons. For example, assuming that the Name attribute
is a candidate key in EMPLOYEE and STUDENT, it is better to use EMPLOYEE.Ssn =
STUDENT.Ssn as a join condition rather than EMPLOYEE.Name = STUDENT.Name if Ssn
has a clustering index in one or both tables.

7. One idiosyncrasy with some query optimizers is that the order of tables in the FROM-clause
may affect the join processing. If that is the case, one may have to switch this order so that the
smaller of the two relations is scanned and the larger relation is used with an appropriate index.

8. Some query optimizers perform worse on nested queries compared to their equivalent
unnested counterparts. There are four types of nested queries:

 Uncorrelated subqueries with aggregates in an inner query.
 Uncorrelated subqueries without aggregates.
 Correlated subqueries with aggregates in an inner query.
 Correlated subqueries without aggregates.

Of the four types above, the first one typically presents no problem, since most query optimizers
evaluate the inner query once. However, for a query of the second type, such as the example in
item 2, most query optimizers may not use an index on Dno in EMPLOYEE. However, the same
optimizers may do so if the query is written as an unnested query. Transformation of correlated
subqueries may involve setting up temporary tables. Detailed examples are outside our scope
here.

9. Finally, many applications are based on views that define the data of interest to those
applications. Sometimes, these views become overkill, because a query may be posed directly
against a base table, rather than going through a view that is defined by a JOIN.

Concepts of Keys

 A key is a combination of one or more columns that is used to identify rows in a relation
 A key in DBMS is an attribute or a set of attributes that help to uniquely identify a tuple
(or row) in a relation (or table). Keys are also used to establish relationships between the
different tables and columns of a relational database. Individual values in a key are called
key values.
 A composite key is a key that consists of two or more columns

Super Keys

 A super key is a combination of columns that uniquely identifies any row within a
relational database management system (RDBMS) table.

Super Keys Example - 1

Given a relation with attributes Roll#, FirstName, LastName, Address, City, NIC#, and Deptno,
the following are all super keys:

{Roll#}, {Roll#, NIC#}, {NIC#}, {Roll#, FirstName, LastName}

{Roll#, FirstName, LastName, Address, City, NIC#, Deptno}
Super Keys Example - 2

The following table consists of four columns:

EmployeeID, Name, Job, DeptID

 Examples of super keys in this table would be {EmployeeID, Name}, {EmployeeID, Name,
Job}, and {EmployeeID, Name, Job, DeptID}.

In a real database we don’t need values for all of those columns to identify a row.

 We only need, per our example, the set {EmployeeID}.
o This is a minimal super key.
o So, EmployeeID is a candidate key.
o EmployeeID can also uniquely identify the tuples.

Candidate Key Definition:

 A candidate key is a key that determines all of the other columns in a relation
 Candidate key columns help in searching, because their values are unique or rarely duplicated.

Examples

A product can be anything, like biscuits, stationery, books, etc.

In PRODUCT relation

Prod# is a candidate key

Prod_Name is also a candidate key

In ORDER_PROD

(OrderNumber, Prod#) is a candidate key

(A candidate key that is not selected as the primary key is called an alternate key.)

Candidate Key Example Invoice (Buying Items)

 Items are sold to customers
 Customers buy items against an invoice
 The same item can be sold on many invoices
 Invoice# and Item# together identify the exact record; what other columns are required?
 If someone tells you only Inv# and Qty, can you find the exact product?

Therefore, different candidate keys are used in different organizations, e.g.,

For a Bank as an Enterprise: AccountHolder (or Customer)

ACC#, Fname, Lname, DOB, CNIC#, Addr, City, TelNo, Mobile#, DriveLic#

For PTCL: Customer (single telno holders)

Consumer#, Fname, Lname, DOB, CNIC#, Addr, City, TelNo, Mobile#

For NADRA: Citizen (CNIC#, Fname, Lname, FatherName, DOB, OldCNIC#, PAddr, PCity,
TAddr, TCity, TelNo, Mobile#)
Reading Content:

Candidate key is used for the searching purposes in the logical and conceptual database system.

Candidate Key Examples

Example-1: Branch (branch-name, assets, branch-city)

Candidate Key: {branch-name}

Example-2: Customer (cname, natid, address, city, telno)

cname, address, city can be duplicated individually and cannot determine a record.

The following combinations distinguish customer records or tuples.

{cname, telno}, {natid}, {natid, cname}

Since {natid} ⊂ {natid, cname}, the superset {natid, cname} is not minimal and therefore not a
candidate key; {natid} is a candidate key.

Example-3: Employee(empno, name, birth_date, address, city, telno, citizenship_id)

empno, telno, citizenship_id are possible candidate keys.

Exercises

Course (cid, cname, deptno)

Semester (sid, syear, startdate)

ClassAllocation (sid, cid, sec#, building#, room#)

Identify candidate keys in each of the above relations

Answer:

Candidate key from Course table: {cid}, {cname}

Candidate key from Semester table: {sid}

Candidate key from ClassAllocation table: { sid, cid, sec#}

Candidate Keys for Search Purposes

The DBA can create an index on searching attributes (candidate keys).

Primary Key Definition:

A primary key is a candidate key selected as the primary means of identifying rows in a
relation:

Characteristics of Primary Key:

 There is one and only one primary key per relation
 The primary key is NOT NULL and UNIQUE
 The ideal primary key is short, numeric (or alphanumeric), indexed, fixed-length, and never
changes
 It is a key owned by the enterprise
 The primary key may be a composite key

Alternate Definition of Primary Key:

A primary key is a minimal identifier (meaning it may take 2 or 3 columns together to give
uniqueness) that is used to identify tuples uniquely.

This means that no subset of the primary key is sufficient to provide unique identification of
tuples.

NULL values are not allowed in primary key attributes.
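In SQL these characteristics are declared directly; a minimal sketch using the STUDENT
relation from Example-1 below (column types are assumptions):

CREATE TABLE STUDENT (
    StuID      INT NOT NULL,   -- short, numeric, never changes
    FirstName  VARCHAR(30),
    FamilyName VARCHAR(30),
    DOB        DATE,
    PRIMARY KEY (StuID)        -- one and only one primary key per relation
);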

Primary key - Example & Issues:

We will now discuss how to identify the primary key in different examples.

Primary key - Student

Example-1:

STUDENT(StuID, FirstName, FamilyName, DOB, …)

Primary Key - Building

Example-2: Building (B#, BName, Location, Region),

B# is the primary key. Although BName is unique and not null, it is not short.

Primary Key - Customer

Example-3: Customer (cname, citizenid, address, city, telno)

This relation holds personal details. There is a chance that cname is duplicated, and some
records may have citizenid and telno as null. This forces us to introduce a new attribute, such as
cust#, that would be a primary key.

Customer (cust#, cname, citizenid, address, city, telno)

Primary Key - Bank

Example-4:

BankBranch(Branch-Name, City, TotalAsset)

 What PK will you suggest?
o BranchNo
 What is a candidate key?
o BranchNo
o Branch-Name
 Which attribute is unique?
o BranchNo
o Branch-Name
 There may be many branches in one city, so finalize this relation with possible
constraints

In this topic we are going to discuss more examples, because we need to analyze further
real-life issues in deciding on well-formatted and well-organized primary keys.

Primary key Instances:

As we all know, different organizations use various primary key formats to express their
conventions. For example:

 512235665: Social Security Number
 610112654: ID by birth (Asian, European, homeland)
 200338900: Student ID (with registration year)
 LHR051: Product ID (with manufacturing location)
 31266885: Bank Acc# (followed by branch code)
 005384: branch code 005, since a company has many branches
 007004: invoices issued from branches

Primary Key - Fixed Length:

[Figure: another example of a fixed-length primary key]

Primary key – Indexing:

Indexing is a way to optimize the performance of a database by minimizing the number of disk
accesses required when a query is processed. It is a data structure technique used to quickly
locate and access the data in a database. The DBMS automatically creates an index on the
primary key.

In the previous topic we discussed different formats and styles of the primary key.

Basics of Indexing:

 Paging a key’s data for fast access.
 It may include sorting from A to Z or Z to A.
 Indexing is also applied on non-key columns where searches are frequent, as in the sketch
below.
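A minimal sketch of that last point (the STUDENT table and City column are assumptions for
illustration):

-- Secondary index on a non-key column that is searched often
CREATE INDEX student_city_idx ON STUDENT (City);

-- Queries filtering on City can now avoid a full table scan:
-- SELECT * FROM STUDENT WHERE City = 'Lahore';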

Primary key – Indexing on DBMS:

This figure indicates that when indexing is used, data is accessed more quickly than when
indexing is not used.

Primary key – Who will decide?

Roll# is issued by the university or college in which the students are studying. The issuing
organization decides the format or instance (the shape of Roll#) and chooses a meaningful
column for the PK.

Surrogate Key Definition:

A surrogate key is an artificial column added to a relation to serve as its primary key:

 DBMS supplied
 Short, numeric and never changes – an ideal primary key!
 Has artificial values that are meaningless to users
 Normally hidden in forms and reports

Example
 RENTAL_PROPERTY without surrogate key:

RENTAL_PROPERTY (Street, City, State/Province, Zip/PostalCode, Country, Rental_Rate)

 RENTAL_PROPERTY with surrogate key:

RENTAL_PROPERTY (PropertyID, Street, City, State/Province, Zip/PostalCode, Country,
Rental_Rate)
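In SQL the surrogate is typically supplied by the DBMS as an auto-increment or identity
column; a sketch in MySQL style (some address columns are renamed or omitted, since
identifiers cannot contain '/'):

CREATE TABLE RENTAL_PROPERTY (
    PropertyID  INT AUTO_INCREMENT PRIMARY KEY, -- surrogate, DBMS-supplied
    Street      VARCHAR(60),
    City        VARCHAR(40),
    Country     VARCHAR(40),
    Rental_Rate DECIMAL(8,2)
);
-- Oracle/SQL Server equivalents: GENERATED AS IDENTITY / IDENTITY(1,1).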

Exercises

Let’s solve the following examples.

 When a complaint is lodged by phone, how is it identified?

A ComplaintNo will be a surrogate key.

 A ComplaintRef# is normally generated.

 What attributes are needed?

Surrogate Key Examples:

Example 1: ATMTransaction

(Card#, Serial#, Amount, DrawDate)

 Need more attributes

Machine#, Type, Status, LimitBalance

Example 2: InsurancePaid

(Policy#, PaidYear, PaidDate, Amount)

 Need more attributes

DueDate, DueAmount, Balance, Charges

Normally insurance is paid every year in advance. Therefore, PaidYear is a well-defined
artificial key, because there is a standard criterion: insurance is managed on a yearly basis.

Solving Example-1

Now we must choose whether Invoice# should be handled as a surrogate key or not (when we
are buying something from a cash-and-carry store).

Let us choose the invoice, when buying some items:

 A surrogate key is called a fact-less key, as it is added just for ease of identification of
unique values and contains no relevant fact (or information) that is useful for the table.
 It contains a unique value for every record of the table.

Solving Example-2

Let us discuss

 We visit a wedding hall to book an event.

 How will the hall staff store the event’s details?

o Collect all the necessary information about the event (booked by, client name,
payment method, etc.)
 Which columns are required? One possible set is sketched below.
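One possible answer, as a hedged sketch (every column name here is an assumption, chosen
to match the information listed above), using a surrogate BookingID:

CREATE TABLE EventBooking (
    BookingID     INT PRIMARY KEY,   -- surrogate key for the event
    ClientName    VARCHAR(50),
    BookedBy      VARCHAR(50),
    EventDate     DATE,
    HallNo        VARCHAR(10),
    Guests        INT,
    PaymentMethod VARCHAR(20)
);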

Reading Material:

In this topic we learned how to analyze the surrogate key.

Solving Example-1 (Saving blogs)

Consider Facebook, blogs, and Twitter. If we want to store such posts in a database, how
do we decide which ID should be assigned?

Columns: [Date, Location, ID, Desc]

In this example, ID is a surrogate key.
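A minimal sketch of such a table (data types are assumptions; the description column is
named Descr because DESC is a reserved word in SQL):

CREATE TABLE BlogPost (
    ID       INT PRIMARY KEY,   -- surrogate key for the post
    PostDate DATETIME,
    Location VARCHAR(50),
    Descr    VARCHAR(500)
);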

Solving Example-2

 Let us assume a loan of Rs. 50,000.

 How do we decide which columns are required to keep track of the installments?

o Columns that we need: paid amount, due amount, balance, paid date, due date,
penalty, and status (see the sketch below).
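A hedged sketch of an installment-tracking table under these assumptions (LoanID and
InstallmentNo are assumed identifying columns, not from the original):

CREATE TABLE LoanInstallment (
    LoanID        INT            NOT NULL,
    InstallmentNo INT            NOT NULL,
    DueDate       DATE,
    DueAmount     DECIMAL(10, 2),
    PaidDate      DATE,
    PaidAmount    DECIMAL(10, 2),
    Balance       DECIMAL(10, 2),
    Penalty       DECIMAL(10, 2),
    Status        VARCHAR(10),          -- e.g., 'paid', 'due', 'overdue'
    PRIMARY KEY (LoanID, InstallmentNo)
);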

Comparisons of keys

Let us discuss

What is the difference between a PK and a surrogate key?

The main difference between a surrogate key and a primary key is that a surrogate
key is a type of primary key that identifies each record uniquely through an artificial
value, while a primary key is a minimal set of columns that identifies each record uniquely.

What is the difference between a PK and a unique key?

The primary key is the minimal set of attributes which uniquely identifies any row of a
table. The primary key cannot have a NULL value. It cannot have duplicate values.

Unique Key is a key which has a unique value and is used to prevent duplicate values in a
column. A unique key can have a NULL value which is not allowed in a primary key.
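A short sketch of the difference (table and column names are assumed for illustration):

CREATE TABLE Person (
    PersonID INT PRIMARY KEY,     -- primary key: no NULLs, no duplicates
    Email    VARCHAR(100) UNIQUE, -- unique key: no duplicates, but NULL is allowed
    Name     VARCHAR(50)
);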

Give three more examples of surrogate keys from real life.

1. System date
2. Timestamp
3. Random alphanumeric string

Foreign Key Definition:

A foreign key is an attribute that refers to a primary key of the same or a different
relation to form a link (constraint) between the relations:

 A foreign key can be a single column or a composite key.
 The term refers to the fact that the key values are foreign to the relation in which
they appear as foreign key values.
 Ideally, the data type and length/size of the FK and the referenced PK must be the same.

Example-1

DEPARTMENT (DeptID, DepartmentName, BudgetCode, ManagerName)

EMPLOYEE (EmpNumber, EmpName, DeptID)


In this example, DeptID is a foreign key in the EMPLOYEE table that refers to the
primary key of the DEPARTMENT table, as sketched below.
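A minimal sketch of this pair of tables (data types are assumptions):

CREATE TABLE DEPARTMENT (
    DeptID         INT PRIMARY KEY,
    DepartmentName VARCHAR(50),
    BudgetCode     VARCHAR(10),
    ManagerName    VARCHAR(50)
);

CREATE TABLE EMPLOYEE (
    EmpNumber INT PRIMARY KEY,
    EmpName   VARCHAR(50),
    DeptID    INT,
    -- The FK links each employee to an existing department
    FOREIGN KEY (DeptID) REFERENCES DEPARTMENT (DeptID)
);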

The name of the FK column may differ from the name of the referenced PK, as the next example shows.

Example-2

CUSTOMER(CustId, Name, Address, Telno, Email)

ORDERS(Invoice#, InvType, InvDate, CustomerId)

In this example, CustomerId is a foreign key in the ORDERS table that refers to the
primary key CustId in the CUSTOMER table (note the differing column names).

Does Invoice# always contain serial numbers?

In this topic, we discuss more examples of foreign keys. In the last topic, we covered
the characteristics of a foreign key:

1. A foreign key can be NULL.

2. A foreign key value can repeat.
3. A foreign key creates a relationship between two tables.

Relationship details:

The figure represents the basic meaning of the relationship details.

FK & Recursive Relationships:

A relationship between two entities of the same entity type is called a recursive
relationship. Here the same entity type participates more than once in a relationship
type, with a different role for each instance. In other words, a relationship is
ordinarily between occurrences in two different entities; however, the same entity can
participate on both sides of the relationship. This is termed a recursive relationship.

The figure represents the significance of the recursive relationship.

Logical link between tables

Foreign Keys (FKs) Examples:


1. FK as part of a composite key

NOTE: PK column is underlined

ITEM (Item#, Item_Name, Department, Buyer)

ORDER_ITEM (OrderNumber, Item#, Quantity, Price, ExtendedPrice)

Where ORDER_ITEM.ITEM# must exist in ITEM.ITEM#

OrderNumber is an alternate name for InvoiceNumber. A sketch of these tables follows.
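A hedged sketch (data types are assumptions; Item# is renamed ItemNo because '#' is not
legal in an SQL identifier). ItemNo is both part of ORDER_ITEM's composite primary key
and a foreign key into ITEM:

CREATE TABLE ITEM (
    ItemNo     INT PRIMARY KEY,
    Item_Name  VARCHAR(50),
    Department VARCHAR(30),
    Buyer      VARCHAR(50)
);

CREATE TABLE ORDER_ITEM (
    OrderNumber   INT NOT NULL,
    ItemNo        INT NOT NULL,
    Quantity      INT,
    Price         DECIMAL(10, 2),
    ExtendedPrice DECIMAL(10, 2),
    PRIMARY KEY (OrderNumber, ItemNo),
    -- ORDER_ITEM.ItemNo must exist in ITEM.ItemNo
    FOREIGN KEY (ItemNo) REFERENCES ITEM (ItemNo)
);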

2. Referential Integrity Constraint

A referential integrity constraint is a statement that limits the values of the foreign key to those
already existing as primary key values in the corresponding relation.

3. Recursive Relationship

Part (Part#, PName, Weight, MakeYear, Price, PType, RefPart#)

PType: assembled product (e.g., a motorcycle built from parts) or a single part. A sketch follows.
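A minimal sketch of this recursive relationship (data types are assumptions; Part# and
RefPart# are renamed PartNo and RefPartNo). RefPartNo refers back to the primary key of
the same table:

CREATE TABLE Part (
    PartNo    INT PRIMARY KEY,
    PName     VARCHAR(50),
    Weight    DECIMAL(8, 2),
    MakeYear  INT,
    Price     DECIMAL(10, 2),
    PType     VARCHAR(20),  -- 'assembled' or 'single'
    RefPartNo INT,          -- NULL when the part belongs to no assembly
    -- Recursive FK: a part may reference the assembly it belongs to
    FOREIGN KEY (RefPartNo) REFERENCES Part (PartNo)
);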

Foreign Keys Examples

 Integrity rules
 Referential with cascade
 Integrity Example

Integrity Rules

Entity integrity

It specifies that:

 No two rows with the same primary key value
 No null values in a primary key

Detail:

The entity integrity constraint states that no primary key value can be NULL. This is because the primary
key value is used to identify individual tuples in a relation. Having NULL values for the primary key
implies that we cannot identify some tuples. For example, if two or more tuples had NULL for their
primary keys, we may not be able to distinguish them if we try to reference them from other relations.
Key constraints and entity integrity constraints are specified on individual relations.
(Reference: Fundamentals of Database Systems (FDS), by Ramez Elmasri and Shamkant Navathe,
Addison-Wesley, 6th Edition.)

Referential integrity

Foreign key values must match a candidate key value of the referenced table.

Foreign keys in some cases can be null.

The database must not contain any unmatched foreign key values.

If B references A, the referenced row in A must exist.

Detail:

The referential integrity constraint is specified between two relations and is used to maintain the
consistency among tuples in the two relations. Informally, the referential integrity constraint states that a
tuple in one relation that refers to another relation must refer to an existing tuple in that relation.
(Reference: Fundamentals of Database Systems (FDS), by Ramez Elmasri and Shamkant Navathe,
Addison-Wesley, 6th Edition.)

Referential with cascade

When we update or delete a tuple in a table, say S, that is referenced by another table,
say SP, we must choose a referential action.

There are similar choices of referential actions for update and delete:

ON UPDATE CASCADE

ON UPDATE RESTRICT

There are other choices besides these, e.g.:

ON UPDATE SET DEFAULT

ON UPDATE SET NULL

A sketch follows.
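A hedged sketch using the S/SP naming from above (data types are assumptions; note that
the exact set of supported actions varies by DBMS, e.g., SQL Server has no RESTRICT):

CREATE TABLE S (
    SNo   CHAR(5) PRIMARY KEY,
    SName VARCHAR(50)
);

CREATE TABLE SP (
    SNo CHAR(5) NOT NULL,
    PNo CHAR(5) NOT NULL,
    Qty INT,
    PRIMARY KEY (SNo, PNo),
    FOREIGN KEY (SNo) REFERENCES S (SNo)
        ON UPDATE CASCADE    -- changing S.SNo propagates to matching SP rows
        ON DELETE RESTRICT   -- a supplier with shipments cannot be deleted
);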

 Integrity Example
 Definition of Composite Key
 A combination of selected attributes in a relation that together serve as its primary
key gives the concept of a composite key. A composite key is an extended form of a
primary key; all the characteristics of a PK apply to a composite key comprising more
than one column.
 Basic Example of Composite Keys
 During an ATM transaction, amounts can be drawn several times on one ATM card.
 We need to address the following issues:
 How much amount will be withdrawn?
 When was the amount drawn?
 Which machine has been used?
 What is the type of the transaction?
 Card#, Amount, DrawDate, Machine#, TransType are the attributes for a transaction.
 Other Examples of Composite Keys
 Can we say Card# will be a primary key? NO
 Can DrawDate be a key together with Card#? NO
 Then what to do? Add a surrogate key, which is TransNo.
 ATMTransaction(Card#, TransNo, Amount, DrawDate, Machine#, TransType)
 A date includes time in seconds as well.
 Many transactions are managed on the same date. A sketch of this design follows.
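A sketch of the composite-key design (data types are assumptions; '#' is dropped from
identifiers): the PK of ATMTransaction combines CardNo with the per-card TransNo:

CREATE TABLE Card (
    CardNo         CHAR(16) PRIMARY KEY,
    CardHolderName VARCHAR(50),
    ExpiryDate     DATE,
    IssueDate      DATE,
    AccNo          VARCHAR(20)
);

CREATE TABLE ATMTransaction (
    CardNo    CHAR(16)       NOT NULL,
    TransNo   INT            NOT NULL,
    Amount    DECIMAL(10, 2),
    DrawDate  DATETIME,                  -- includes time, down to seconds
    MachineNo VARCHAR(10),
    TransType VARCHAR(15),
    PRIMARY KEY (CardNo, TransNo),       -- composite primary key
    FOREIGN KEY (CardNo) REFERENCES Card (CardNo)
);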

 Composite Keys Examples
 Preparing Dataset for Composite Key
 Card(Card#, CardHolderName, ExpiryDate, IssueDate, Acc#)
 ATMTransaction(Card#, TransNo, Amount, DrawDate, Machine#, TransType)
 Answer the following questions.
 Do we need more closely related tables?
 What are the PK and FK?
 Hospital Related Example
 PatientVisit(PatID, VisitSNO, Vdate, DocID, Diagnosis, …)
 LabTest(PatID, VisitSNO, TestID, Tdate, …)
 LabTest has the FKs PatID and VisitSNO, referring to the corresponding composite key in
PatientVisit (see the sketch after this list).
 Answer the following question.
 Which lab table needs to be considered?
 What about Excluding the Composite Key?
 Card(Card#, CardHolderName, ExpiryDate, IssueDate, Acc#)
 ATMTransaction(ID, Card#, TransNo, Amount, DrawDate, Machine#, TransType)
 Answer the following questions.
 Do we need more closely related tables?
 What is the drawback of using ID as a surrogate key in ATMTransaction?
 What are the PK and FK?
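A hedged sketch of the hospital example (data types are assumptions, and the PK of
LabTest is assumed to be PatID + VisitSNO + TestID), showing a composite FK that
references a composite PK:

CREATE TABLE PatientVisit (
    PatID     INT NOT NULL,
    VisitSNO  INT NOT NULL,
    Vdate     DATE,
    DocID     INT,
    Diagnosis VARCHAR(200),
    PRIMARY KEY (PatID, VisitSNO)
);

CREATE TABLE LabTest (
    PatID    INT NOT NULL,
    VisitSNO INT NOT NULL,
    TestID   INT NOT NULL,
    Tdate    DATE,
    PRIMARY KEY (PatID, VisitSNO, TestID),
    -- Composite FK referring to the composite PK of PatientVisit
    FOREIGN KEY (PatID, VisitSNO) REFERENCES PatientVisit (PatID, VisitSNO)
);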
 Composite Keys Examples
 Course Offering Example
 Let us assume there are 1000 courses in a table or list.
 Students want to register in courses.
 Can they register by looking through all 1000 courses? NO
 Answer the following questions.
 What to do?
 Offer selected courses?

 Course Offering Example
 The offered-courses table:
 CrsOffer(SemID, CrsID, Sec, InstrName, B#, R#)
 What to do for keys – a PK or a composite key?
 CrsOffer(SemID, CrsID, Sec, InstrName, B#, R#)
 CrsReg(SemID, Roll#, CrsID, Sec, TotMarks, Grade)

 Preparing Dataset for Composite Keys
 CrsOffer(SemID, CrsID, Sec, InstrName, Building#, Room#)
 CrsReg(SemID, Roll#, CrsID, Sec, TotMarks, Grade)
 Answer the following questions.
 Do we need more closely related tables?
 What are the PK and FK? (A sketch follows below.)
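A hedged sketch of the course-offering design (data types are assumptions; Roll#, B#,
and R# are renamed, and the PK of CrsReg is assumed to be SemID + RollNo + CrsID, i.e.,
one section of a course per student per semester). CrsReg carries a composite FK into
CrsOffer:

CREATE TABLE CrsOffer (
    SemID      CHAR(5) NOT NULL,
    CrsID      CHAR(6) NOT NULL,
    Sec        CHAR(2) NOT NULL,
    InstrName  VARCHAR(50),
    BuildingNo VARCHAR(10),
    RoomNo     VARCHAR(10),
    PRIMARY KEY (SemID, CrsID, Sec)
);

CREATE TABLE CrsReg (
    SemID    CHAR(5)  NOT NULL,
    RollNo   CHAR(10) NOT NULL,
    CrsID    CHAR(6)  NOT NULL,
    Sec      CHAR(2)  NOT NULL,
    TotMarks INT,
    Grade    CHAR(2),
    PRIMARY KEY (SemID, RollNo, CrsID),
    -- Composite FK: a registration must match an offered course section
    FOREIGN KEY (SemID, CrsID, Sec) REFERENCES CrsOffer (SemID, CrsID, Sec)
);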
