Database For Final

The document discusses various types of relationships in database modeling, including binary, ternary, and n-ary relationships, highlighting the complexities and constraints associated with higher-degree relationships. It also covers Enhanced Entity-Relationship (EER) modeling concepts such as subclasses, superclasses, specialization, and generalization, emphasizing their importance in accurately representing data. Additionally, it outlines guidelines for relational database design, focusing on functional dependencies, normalization, and the avoidance of anomalies in relational schemas.

Uploaded by

techfena

Relationships of Higher Degree

• Relationship types of degree 2 are called binary

• Relationship types of degree 3 are called ternary, and those of degree n are called n-ary

• In general, an n-ary relationship is not equivalent to n binary relationships

• Constraints are harder to specify for higher-degree relationships (n > 2) than for binary relationships

Discussion of n-ary relationships (n > 2)

• In general, 3 binary relationships can represent different information than a single ternary relationship

• If needed, the binary and n-ary relationships can all be included in the schema design

• In some cases, a ternary relationship can be represented as a weak entity if the data model allows a weak entity type to have multiple identifying relationships (and hence multiple owner entity types)
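
The point that three binary relationships can carry different information than one ternary relationship can be illustrated with a small sketch. The SUPPLY relationship and every supplier, part, and project identifier below are hypothetical, chosen only for illustration. The code projects a ternary relationship onto three binary ones and then recombines them.

```python
# Hypothetical ternary relationship SUPPLY(supplier, part, project).
supply = {
    ("s1", "p1", "j2"),
    ("s1", "p2", "j1"),
    ("s2", "p1", "j1"),
}

# Three binary relationships obtained by projecting the ternary one.
can_supply = {(s, p) for s, p, j in supply}  # supplier-part pairs
uses = {(j, p) for s, p, j in supply}        # project-part pairs
supplies = {(s, j) for s, p, j in supply}    # supplier-project pairs

# Recombine: every (s, p, j) that is consistent with all three binary facts.
rebuilt = {
    (s, p, j)
    for s, p in can_supply
    for s2, j in supplies
    if s == s2 and (j, p) in uses
}

# Combinations implied by the binary facts but never stated by SUPPLY.
spurious = rebuilt - supply
print(sorted(spurious))
```

With this data the binary relationships admit the combination (s1, p1, j1), which the ternary relationship never recorded, so the two designs are not interchangeable.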

Example of a ternary relationship

Discussion of n-ary relationships (n > 2)

• If a particular binary relationship can be derived from a higher-degree relationship at all times, then it is redundant.

• For example, the TAUGHT_DURING binary relationship in Figure 3.18 (see next slide) can be derived from the ternary relationship OFFERS (based on the meaning of the relationships)

Another example of a ternary relationship

Displaying constraints on higher-degree relationships

• The (min, max) constraints can be displayed on the edges; however, they do not fully describe the constraints

– Displaying a 1, M, or N indicates additional constraints

– An M or N indicates no constraint

– A 1 indicates that an entity can participate in at most one relationship instance that has a particular combination of the other participating entities

• In general, both (min, max) and 1, M, or N are needed to fully describe the constraints

Enhanced Entity-Relationship (EER) Modelling

• EER Model Concepts

– Includes all modelling concepts of basic ER; EER diagrams are basically a more expansive version of ER diagrams.

– Additional concepts:

• subclasses/superclasses

• specialization/generalization

• categories (UNION types)

• attribute and relationship inheritance

– These are fundamental to conceptual modelling

• The additional EER concepts are used to model applications more completely and more accurately

– EER includes some object-oriented concepts, such as inheritance

Subclasses and Superclasses (1)

• An entity type may have additional meaningful subgroupings of its entities

• Example: EMPLOYEE may be further grouped into:

– SECRETARY, ENGINEER, TECHNICIAN, …

• Based on the EMPLOYEE’s job

– MANAGER

• EMPLOYEEs who are managers

– SALARIED_EMPLOYEE, HOURLY_EMPLOYEE

• Based on the EMPLOYEE’s method of pay

• EER diagrams extend ER diagrams to represent these additional subgroupings, called subclasses or subtypes

Subclasses and Superclasses

Subclasses and Superclasses (2)

• Each of these subgroupings is a subset of EMPLOYEE entities

• Each is called a subclass of EMPLOYEE

• EMPLOYEE is the superclass for each of these subclasses

• These are called superclass/subclass relationships:

– EMPLOYEE/SECRETARY

– EMPLOYEE/TECHNICIAN

– EMPLOYEE/MANAGER

– …

Subclasses and Superclasses (3)

• These are also called IS-A relationships

– SECRETARY IS-A EMPLOYEE, TECHNICIAN IS-A EMPLOYEE, …

• Note: An entity that is a member of a subclass represents the same real-world entity as some member of the superclass:

– The subclass member is the same entity in a distinct specific role

– An entity cannot exist in the database merely by being a member of a subclass; it must also be a member of the superclass

– A member of the superclass can be optionally included as a member of any number of its subclasses.

Attribute Inheritance in Superclass / Subclass Relationships

• An entity that is a member of a subclass inherits

– All attributes of the entity as a member of the superclass

– All relationships of the entity as a member of the superclass

• Example:

– In the previous slide, SECRETARY (as well as TECHNICIAN and ENGINEER) inherits the attributes Name, SSN, …, from EMPLOYEE

– Every SECRETARY entity will have values for the inherited attributes

Specialization (1)

• Specialization is the process of defining a set of subclasses of a superclass

• The set of subclasses is based upon some distinguishing characteristics of the entities in the superclass

– Example: {SECRETARY, ENGINEER, TECHNICIAN} is a specialization of EMPLOYEE based upon job type.

– May have several specializations of the same superclass

Specialization (2)

• Example: Another specialization of EMPLOYEE based on method of pay is {SALARIED_EMPLOYEE, HOURLY_EMPLOYEE}.

– Superclass/subclass relationships and specialization can be diagrammatically represented in EER diagrams

– Attributes of a subclass are called specific or local attributes.

• For example, the attribute TypingSpeed of SECRETARY

– The subclass can also participate in specific relationship types.

• For example, a relationship BELONGS_TO of HOURLY_EMPLOYEE

Specialization (3)

Generalization

• Generalization is the reverse of the specialization process

• Several classes with common features are generalized into a superclass;

– original classes become its subclasses

• Example: CAR, TRUCK generalized into VEHICLE;

– both CAR, TRUCK become subclasses of the superclass VEHICLE.

– We can view {CAR, TRUCK} as a specialization of VEHICLE

– Alternatively, we can view VEHICLE as a generalization of CAR and TRUCK

Generalization (2)

Generalization and Specialization (1)

• Diagrammatic notations are sometimes used to distinguish between generalization and specialization

– Arrow pointing to the generalized superclass represents a generalization

– Arrows pointing to the specialized subclasses represent a specialization

– We do not use this notation because it is often subjective as to which process is more appropriate for a particular situation

– We advocate not drawing any arrows

Generalization and Specialization (2)

• Data Modelling with Specialization and Generalization

– A superclass or subclass represents a collection (or set or grouping) of entities

– It also represents a particular type of entity

– Shown in rectangles in EER diagrams (as are entity types)

– We can call all entity types (and their corresponding collections) classes, whether they are entity types, superclasses, or subclasses.

Displaying an attribute-defined specialization in EER diagrams

Constraints on Specialization and Generalization (3)

• Two basic constraints can apply to a specialization/generalization:

– Disjointness Constraint:

– Completeness Constraint:

• Disjointness Constraint:

– Specifies that the subclasses of the specialization must be disjoint:

• an entity can be a member of at most one of the subclasses of the specialization

– Specified by d in EER diagram

– If not disjoint, specialization is overlapping:

• that is, the same entity may be a member of more than one subclass of the specialization

– Specified by o in EER diagram

Constraints on Specialization and Generalization (4)

• Completeness Constraint:

– Total specifies that every entity in the superclass must be a member of some subclass in the specialization/generalization

• Shown in EER diagrams by a double line

– Partial allows an entity not to belong to any of the subclasses

• Shown in EER diagrams by a single line

Constraints on Specialization and Generalization (5)

• Hence, we have four types of specialization/generalization:

– Disjoint, total

– Disjoint, partial

– Overlapping, total

– Overlapping, partial

• Note: Generalization usually is total because the superclass is derived from the subclasses.
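
The four combinations above can be checked mechanically for a given database state. The following is a minimal sketch, assuming entities are represented by plain identifiers and each class by a Python set; the function name `classify` and the sample data are hypothetical. Note that the constraints themselves apply to every legal state, whereas this check only inspects one state.

```python
def classify(superclass, subclasses):
    """Return (disjointness, completeness) for one state of a specialization.

    superclass: set of entity identifiers
    subclasses: list of sets of entity identifiers
    """
    members = [set(s) for s in subclasses]
    union = set().union(*members)
    # Disjoint: no entity appears in more than one subclass.
    disjoint = sum(len(m) for m in members) == len(union)
    # Total: every superclass entity appears in at least one subclass.
    total = set(superclass) <= union
    return ("disjoint" if disjoint else "overlapping",
            "total" if total else "partial")


# Hypothetical EMPLOYEE state specialized by job type.
employees = {"e1", "e2", "e3", "e4"}
secretary, engineer, technician = {"e1"}, {"e2"}, {"e3"}
print(classify(employees, [secretary, engineer, technician]))
```

Here employee e4 belongs to no subclass and no employee belongs to two, so this state is consistent with a disjoint, partial specialization.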

Example of disjoint partial Specialization

Example of overlapping total Specialization

Categories (UNION TYPES) (1)

• All of the superclass/subclass relationships we have seen thus far have a single superclass

• A shared subclass is a subclass in:

– more than one distinct superclass/subclass relationship

– each relationship has a single superclass

– shared subclass leads to multiple inheritance

• In some cases, we need to model a single superclass/subclass relationship with more than one superclass

– Superclasses can represent different entity types

– Such a subclass is called a category or UNION TYPE

Categories (UNION TYPES) (2)


• Example: In a database for vehicle registration, a vehicle owner can be a PERSON, a BANK (holding a license on a vehicle) or a COMPANY.

– A category (UNION type) called OWNER is created to represent a subset of the union of the three superclasses COMPANY, BANK, and PERSON

– A category member must exist in at least one of its superclasses

• Difference from shared subclass, which is a:

– subset of the intersection of its superclasses

– shared subclass member must exist in all of its superclasses

Two categories (UNION types): OWNER, REGISTERED_VEHICLE

The Relational Database Model

• The relational database model takes a logical view of data

• The relational model’s basic components are entities, attributes, and relationships among entities

• A Logical View of Data

– Relational Model

• Enables us to view data logically rather than physically

• Reminds us of the simpler file concept of data storage

– Table

• Has advantages of structural and data independence

• Resembles a file from a conceptual point of view

• Easier to understand than its hierarchical and network database predecessors

Tables and Their Characteristics

• Table:

– Two-dimensional structure composed of rows and columns

– Contains a group of related entities (an entity set)

– Terms entity set and table are often used interchangeably

• Table also called a relation because the relational model’s creator, Codd, used the term relation as a synonym for table

• Think of a table as a persistent relation:

– A relation whose contents can be permanently saved for future use

Keys

• A key consists of one or more attributes that determine other attributes

• Primary key (PK) is an attribute (or a combination of attributes) that uniquely identifies any given entity (row)

• Key’s role is based on determination

– If you know the value of attribute A, you can look up (determine) the value of attribute B

• Composite key

– Composed of more than one attribute

• Key attribute

– Any attribute that is part of a key

• Superkey

– Any key that uniquely identifies each entity

• Candidate key

– A superkey without redundancies

Keys (continued)

• Foreign key (FK)

– An attribute whose values match primary key values in the related table

• Referential integrity

– FK contains a value that refers to an existing valid tuple (row) in another relation

• Secondary key

– Key used strictly for data retrieval purposes
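
As a minimal sketch of what referential integrity demands, the check below scans a child table for foreign key values that match no primary key value in the parent table. The table contents, the column names, and the helper `fk_violations` are hypothetical. The sketch also assumes that a NULL foreign key (Python `None`) is acceptable, which is a common convention but is not stated on the slide.

```python
def fk_violations(child_rows, fk, parent_rows, pk):
    """Return child rows whose foreign key refers to no existing parent row."""
    parent_keys = {row[pk] for row in parent_rows}
    return [
        row for row in child_rows
        # Assumption: a NULL (None) foreign key does not violate integrity.
        if row[fk] is not None and row[fk] not in parent_keys
    ]


# Hypothetical DEPARTMENT (parent) and EMPLOYEE (child) tables.
department = [{"DNUMBER": 4}, {"DNUMBER": 5}]
employee = [
    {"SSN": "111", "DNO": 5},
    {"SSN": "222", "DNO": 7},     # no department 7: dangling reference
    {"SSN": "333", "DNO": None},  # not assigned to any department
]
print(fk_violations(employee, "DNO", department, "DNUMBER"))
```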

Null Values

• No data entry

• Not permitted in primary key

• Should be avoided in other attributes

• Can represent

– An unknown attribute value

– A known, but missing, attribute value

– A “not applicable” condition

• Can create problems in logic and when using formulas

Chapter 4

Functional Dependency and Normalization for Relational Databases

Chapter Outline

1. Informal Design Guidelines for Relational Databases

1.1 Semantics of the Relation Attributes

1.2 Redundant Information in Tuples and Update Anomalies

1.3 Null Values in Tuples

1.4 Spurious Tuples

2. Functional Dependencies (FDs)

2.1 Definition of FD

2.2 Inference Rules for FDs

2.3 Equivalence of Sets of FDs

2.4 Minimal Sets of FDs

Chapter Outline

3. Normal Forms Based on Primary Keys


3.1 Normalization of Relations

3.2 Practical Use of Normal Forms

3.3 Definitions of Keys and Attributes Participating in Keys

3.4 First Normal Form

3.5 Second Normal Form

3.6 Third Normal Form

4. General Normal Form Definitions (For Multiple Keys)

5. BCNF (Boyce-Codd Normal Form)

1 Informal Design Guidelines for Relational Databases

• What is relational database design?

– The grouping of attributes to form "good" relation schemas

• Two levels of relation schemas

– The logical "user view" level

– The storage "base relation" level

• Design is concerned mainly with base relations

– What are the criteria for "good" base relations?

1.1 Semantics of the Relation Attributes

GUIDELINE 1: Informally, each tuple in a relation should represent one entity or relationship instance. (Applies to individual relations and their attributes).

– Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in the same relation

– Only foreign keys should be used to refer to other entities

– Entity and relationship attributes should be kept apart as much as possible.

• Bottom Line:

– Design a schema that can be explained easily relation by relation.

– The semantics of attributes should be easy to interpret.

Figure: A simplified COMPANY relational database schema

1.2 Redundant Information in Tuples and Update Anomalies

• Information is stored redundantly

– Wastes storage

– Causes problems with update anomalies:

• Insertion anomalies

• Deletion anomalies

• Modification anomalies

• Anomaly: something different, abnormal, peculiar, or not easily classified


EXAMPLE OF AN UPDATE ANOMALY

• Consider the relation:

– EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

• Update Anomaly:

– Changing the name of project number P1 from “Billing” to “Customer-Accounting” may cause this update to be made for all 100 employees working on project P1.

EXAMPLE OF AN INSERT ANOMALY

• Consider the relation:

– EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

• Insert Anomaly:

– Cannot insert a project unless an employee is assigned to it.

• Conversely

– Cannot insert an employee unless he/she is assigned to a project.

EXAMPLE OF A DELETE ANOMALY

• Consider the relation:

– EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

• Delete Anomaly:

– When a project is deleted, it will result in deleting all the employees who work on that project. Alternately, if an employee is the sole employee on a project, deleting that employee would result in deleting the corresponding project.

Figure: Two relation schemas suffering from update anomalies

Guideline to Redundant Information in Tuples and Update Anomalies

• GUIDELINE 2:

– Design a schema that does not suffer from the insertion, deletion and update anomalies.

– If there are any anomalies present, then note them so that applications can be made to take them into account.

1.3 Null Values in Tuples

• GUIDELINE 3:

– Relations should be designed such that their tuples will have as few NULL values as possible

– Attributes that are NULL frequently could be placed in separate relations (with the primary key)

• Reasons for nulls:

– Attribute not applicable or invalid

– Attribute value unknown (may exist)

– Value known to exist, but unavailable

1.4 Spurious Tuples

• Bad designs for a relational database may result in erroneous results for certain JOIN operations

• The "lossless join" property is used to guarantee meaningful results for join operations

• GUIDELINE 4:

– The relations should be designed to satisfy the lossless join condition.

– No spurious tuples should be generated by doing a natural-join of any relations.
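
To make spurious tuples concrete, the sketch below decomposes a small relation on an attribute that is not a key and then joins the pieces back together. The relation WORKS, its attribute names, and the helper functions `project` and `natural_join` are hypothetical and exist only for this illustration.

```python
def project(rows, attrs):
    """Keep only the listed attributes and drop duplicate tuples."""
    out = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in out:
            out.append(t)
    return out


def natural_join(r, s):
    """Join two lists of dict rows on all attribute names they share."""
    out = []
    for a in r:
        for b in s:
            if all(a[c] == b[c] for c in a.keys() & b.keys()):
                merged = {**a, **b}
                if merged not in out:
                    out.append(merged)
    return out


# Hypothetical relation: which employee works on which project, and where.
works = [
    {"emp": "Ann", "proj": "P1", "loc": "Houston"},
    {"emp": "Bob", "proj": "P2", "loc": "Houston"},
]

# Bad decomposition: the only shared attribute, loc, is not a key of either part.
emp_locs = project(works, ["emp", "loc"])
proj_locs = project(works, ["proj", "loc"])

joined = natural_join(emp_locs, proj_locs)
spurious = [t for t in joined if t not in works]
print(spurious)
```

The join returns four tuples although WORKS had two; the extra pairs (Ann on P2, Bob on P1) are spurious, so this decomposition does not satisfy the lossless join condition.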

2.1 Functional Dependencies (1)

• Functional dependencies (FDs)

– A functional dependency is a constraint that specifies the relationship between two sets of attributes where one set can accurately determine the value of the other set.

– Are constraints that are derived from the meaning and interrelationships of the data attributes

– Functional dependency helps to maintain the quality of data in the database.

– It plays a vital role in finding the difference between good and bad database design.

2.1 Functional Dependencies (2)

• A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y.

• A functional dependency is denoted by an arrow “→”.

• The functional dependency of Y on X is represented by X → Y, where X is a set of attributes that is capable of determining the value of Y.

• The attribute set on the left side of the arrow, X, is called the Determinant, while the set on the right side, Y, is called the Dependent.

Functional Dependencies (3)

• X -> Y holds if whenever two tuples have the same value for X, they must have the same value for Y

– For any two tuples t1 and t2 in any relation instance r(R): If t1[X]=t2[X], then t1[Y]=t2[Y]

• X -> Y in R specifies a constraint on all relation instances r(R)

• FDs are derived from the real-world constraints on the attributes
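
The tuple-based definition above translates directly into a check on one relation instance. This is a minimal sketch with hypothetical data and a hypothetical helper name, `fd_holds`. Since an FD constrains all instances, one instance can refute an FD but can never prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the FD lhs -> rhs is satisfied by this relation instance."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two tuples that agree on X must also agree on Y.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical instance with attributes SSN, PNUMBER, ENAME, HOURS.
rows = [
    {"SSN": "1", "PNUMBER": "P1", "ENAME": "Ann", "HOURS": 10},
    {"SSN": "1", "PNUMBER": "P2", "ENAME": "Ann", "HOURS": 5},
    {"SSN": "2", "PNUMBER": "P1", "ENAME": "Bob", "HOURS": 20},
]
print(fd_holds(rows, ["SSN"], ["ENAME"]))             # no counterexample here
print(fd_holds(rows, ["PNUMBER"], ["HOURS"]))         # refuted by the P1 rows
print(fd_holds(rows, ["SSN", "PNUMBER"], ["HOURS"]))  # no counterexample here
```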

Examples of FD constraints (1)

• Social security number determines employee name

– SSN -> ENAME

• Project number determines project name and location

– PNUMBER -> {PNAME, PLOCATION}

• Employee SSN and PROJECT NUMBER determine the HOURS PER WEEK that the employee works on the project

– {SSN, PNUMBER} -> HOURS

2.2 Inference Rules for FDs (1)

Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold

• Armstrong's inference rules:

– IR1. (Reflexive) If Y subset-of X, then X -> Y

– IR2. (Augmentation) If X -> Y, then XZ -> YZ

– (Notation: XZ stands for X U Z)

– IR3. (Transitive) If X -> Y and Y -> Z, then X -> Z

• IR1, IR2, IR3 form a sound and complete set of inference rules

– These rules hold, and all other rules that hold can be deduced from these

Inference Rules for FDs (2)

• Some additional inference rules that are useful:

– Decomposition: If X -> YZ, then X -> Y and X -> Z

– Union: If X -> Y and X -> Z, then X -> YZ

– Pseudotransitivity: If X -> Y and WY -> Z, then WX -> Z

• The last three inference rules, as well as any other inference rules, can be deduced from IR1, IR2, and IR3 (completeness property)

Inference Rules for FDs (3)

• Closure of a set F of FDs is the set F+ of all FDs that can be inferred from F

• Closure of a set of attributes X with respect to F is the set X+ of all attributes that are functionally determined by X

• X+ can be calculated by repeatedly applying IR1, IR2, IR3 using the FDs in F
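
The computation of X+ is usually written as a fixed-point loop: start from X and keep adding the right-hand side of any FD whose left-hand side is already contained in the result. The sketch below follows that idea; the function name `closure` and the representation of an FD as a (left set, right set) pair are assumptions made for illustration. The FDs are the examples given earlier in the slides.

```python
def closure(attrs, fds):
    """Return the closure X+ of the attribute set attrs under the FDs in fds.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If X+ already contains the left side, it also determines the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]
print(sorted(closure({"SSN"}, F)))
print(sorted(closure({"SSN", "PNUMBER"}, F)))
```

The closure of {SSN} contains only SSN and ENAME, while the closure of {SSN, PNUMBER} contains all six attributes.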

2.3 Equivalence of Sets of FDs

• Two sets of FDs F and G are equivalent if:

– Every FD in F can be inferred from G, and

– Every FD in G can be inferred from F

– Hence, F and G are equivalent if F+ = G+

• Definition (Covers):

– F covers G if every FD in G can be inferred from F

– (i.e., if G+ subset-of F+)

• F and G are equivalent if F covers G and G covers F
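
Testing whether F covers G does not require building F+ or G+ in full: for each FD X -> Y in G it is enough to check that Y is contained in the closure of X under F. The sketch below assumes single-letter attribute names, so that a string such as "AC" stands for the set {A, C}; the sets F and G and all function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """F covers G if every FD in G can be inferred from F."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    return covers(f, g) and covers(g, f)


# Illustrative FD sets over single-letter attributes.
F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]
print(equivalent(F, G))
```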

2.4 Minimal Sets of FDs (1)

• A set of FDs F is minimal if it satisfies the following conditions:

1. Every dependency in F has a single attribute for its RHS.

2. We cannot remove any dependency from F and have a set of dependencies that is equivalent to F.

3. We cannot replace any dependency X -> A in F with a dependency Y -> A, where Y is a proper subset of X, and still have a set of dependencies that is equivalent to F.
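
The three conditions suggest a procedure for reducing a set of FDs to a minimal one: split right-hand sides, drop left-hand-side attributes that are not needed, then drop FDs that follow from the rest. The sketch below is one way to write this, assuming single-letter attribute names (so "AB" means {A, B}); the function names and the sample FDs are illustrative. A set of FDs can have several minimal covers, and the result depends on processing order.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Condition 1: give every dependency a single attribute on its RHS.
    g = [(frozenset(lhs), a) for lhs, rhs in fds for a in sorted(rhs)]
    as_pairs = lambda deps: [(lhs, {a}) for lhs, a in deps]

    # Condition 3: drop LHS attributes that are not needed to determine the RHS.
    reduced = []
    for lhs, a in g:
        lhs = set(lhs)
        for b in sorted(lhs):
            smaller = lhs - {b}
            if smaller and a in closure(smaller, as_pairs(g)):
                lhs = smaller
        reduced.append((frozenset(lhs), a))

    # Condition 2: drop dependencies that can be inferred from the others.
    result = list(dict.fromkeys(reduced))  # remove exact duplicates, keep order
    for fd in list(result):
        rest = [f for f in result if f != fd]
        if fd[1] in closure(fd[0], as_pairs(rest)):
            result = rest
    return result


F = [("B", "A"), ("D", "A"), ("AB", "D")]
for lhs, a in minimal_cover(F):
    print("".join(sorted(lhs)), "->", a)
```

For this input the sketch reduces AB -> D to B -> D and then discards B -> A, which follows from B -> D and D -> A.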

3 Normal Forms Based on Primary Keys

3.1 Normalization of Relations

3.2 Practical Use of Normal Forms

3.3 Definitions of Keys and Attributes Participating in Keys

3.4 First Normal Form

3.5 Second Normal Form

3.6 Third Normal Form

3.1 Normalization of Relations (1)

• Normalization:

– The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations

• Reduces data redundancy and eliminates undesirable characteristics like Insertion, Update and Deletion Anomalies.

• Normal form:

– Condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form

Normalization of Relations (2)

• There are three stages of normal forms known as first normal form (or 1NF), second normal form (or 2NF), and third normal form (or 3NF).

• 2NF, 3NF, BCNF

– based on keys and FDs of a relation schema

• 4NF

– based on keys, multi-valued dependencies (MVDs)

• 5NF

– based on keys, join dependencies (JDs)

• Additional properties may be needed to ensure a good relational design (lossless join, dependency preservation)

3.2 Practical Use of Normal Forms

• Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties

• The practical utility of these normal forms becomes questionable when the constraints on which they are based are hard to understand or to detect

• The database designers need not normalize to the highest possible normal form

– (usually up to 3NF, BCNF or 4NF)

• Denormalization:

– The process of storing the join of higher normal form relations as a base relation, which is in a lower normal form.

3.3 Definitions of Keys and Attributes Participating in Keys (1)

• A superkey of a relation schema R = {A1, A2, ...., An} is a set of attributes S subset-of R with the property that no two tuples t1 and t2 in any legal relation state r of R will have t1[S] = t2[S]

• A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey any more.

Definitions of Keys and Attributes Participating in Keys (2)

• If a relation schema has more than one key, each is called a candidate key.

– One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys.

• A Prime attribute must be a member of some candidate key

• A Nonprime attribute is not a prime attribute, that is, it is not a member of any candidate key.
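
When the FDs of a schema are known, superkeys and candidate keys can be found with attribute closures: S is a superkey when its closure is all of R, and a key when no proper subset is also a superkey. The brute-force sketch below tries attribute sets in order of size; it is only practical for small schemas. The function names are hypothetical, and the schema and FDs reuse the SSN / PNUMBER examples from the earlier slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Return all minimal superkeys of the schema attrs under fds."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue  # contains a smaller key: a superkey, but not minimal
            if closure(s, fds) == attrs:
                keys.append(s)
    return keys


R = {"SSN", "PNUMBER", "ENAME", "PNAME", "PLOCATION", "HOURS"}
F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]
keys = candidate_keys(R, F)
prime = set().union(*keys)
print(keys, sorted(R - prime))
```

Under these FDs the only candidate key is {SSN, PNUMBER}, so SSN and PNUMBER are prime and the remaining four attributes are nonprime.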

3.4 First Normal Form

• Disallows

– composite attributes

– multivalued attributes

– nested relations; attributes whose values for an individual tuple are non-atomic

• Considered to be part of the definition of relation

Figure: Normalization into 1NF

Figure: Normalization of nested relations into 1NF


3.5 Second Normal Form (1)

• Uses the concepts of FDs, primary key

• Definitions

– Prime attribute: An attribute that is a member of the primary key K

– Full functional dependency: an FD Y -> Z where removal of any attribute from Y means the FD does not hold any more

• Examples:

– {SSN, PNUMBER} -> HOURS is a full FD since neither SSN -> HOURS nor PNUMBER -> HOURS holds

– {SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial dependency) since SSN -> ENAME also holds

3.5 Second Normal Form (2)

• A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on the primary key.

• R can be decomposed into 2NF relations via the process of 2NF normalization

Figure: Normalizing into 2NF and 3NF

Figure: Normalization into 2NF and 3NF

3.6 Third Normal Form (1)

• Definition:

– Transitive functional dependency: an FD X -> Z that can be derived from two FDs X -> Y and Y -> Z

• Examples:

– SSN -> DMGRSSN is a transitive FD

• Since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold

– SSN -> ENAME is non-transitive

• Since there is no set of attributes X where SSN -> X and X -> ENAME

Third Normal Form (2)

• A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key

• R can be decomposed into 3NF relations via the process of 3NF normalization

• NOTE:

– In X -> Y and Y -> Z, with X as the primary key, we consider this a problem only if Y is not a candidate key.

– When Y is a candidate key, there is no problem with the transitive dependency.

– E.g., consider EMP (SSN, Emp#, Salary).

• Here, SSN -> Emp# -> Salary and Emp# is a candidate key.

Normal Forms Defined Informally

• 1st normal form

– All attributes depend on the key

• 2nd normal form

– All attributes depend on the whole key

• 3rd normal form

– All attributes depend on nothing but the key

4 General Normal Form Definitions (For Multiple Keys) (1)

• The above definitions consider the primary key only

• The following more general definitions take into account relations with multiple candidate keys

• A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on every key of R

General Normal Form Definitions (2)

• Definition:

– Superkey of relation schema R: a set of attributes S of R that contains a key of R

– A relation schema R is in third normal form (3NF) if whenever an FD X -> A holds in R, then either:

• (a) X is a superkey of R, or

• (b) A is a prime attribute of R

• NOTE: Boyce-Codd normal form disallows condition (b) above

5 BCNF (Boyce-Codd Normal Form)

• A relation schema R is in Boyce-Codd Normal Form (BCNF) if whenever an FD X -> A holds in R, then X is a superkey of R

• Each normal form is strictly stronger than the previous one

– Every 2NF relation is in 1NF

– Every 3NF relation is in 2NF

– Every BCNF relation is in 3NF

• There exist relations that are in 3NF but not in BCNF

• The goal is to have each relation in BCNF (or 3NF)

Figure: Boyce-Codd normal form

Figure: a relation TEACH that is in 3NF but not in BCNF

Achieving the BCNF by Decomposition (1)

• Two FDs exist in the relation TEACH:

– d1: { student, course} -> instructor

– d2: instructor -> course

• {student, course} is a candidate key for this relation, and the dependencies shown follow the pattern in Figure 10.12 (b).

– So this relation is in 3NF but not in BCNF

• A relation NOT in BCNF should be decomposed so as to meet this property, while possibly forgoing the preservation of all functional dependencies in the decomposed relations.

– (See Algorithm 11.3)
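
The BCNF condition can be tested with attribute closures: a nontrivial FD X -> A violates BCNF when the closure of X is not the whole schema, i.e. X is not a superkey. The sketch below applies this to TEACH with the two FDs listed on this slide; the function names are hypothetical, and only the given FDs are examined.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the given FDs whose left-hand side is not a superkey."""
    attrs = set(attrs)
    return [
        (lhs, rhs) for lhs, rhs in fds
        if not set(rhs) <= set(lhs)      # ignore trivial FDs
        and closure(lhs, fds) != attrs   # LHS does not determine every attribute
    ]


TEACH = {"student", "course", "instructor"}
fds = [
    ({"student", "course"}, {"instructor"}),
    ({"instructor"}, {"course"}),
]
print(bcnf_violations(TEACH, fds))
```

Only instructor -> course is reported: instructor alone does not determine student, so it is not a superkey, while course is a prime attribute, which is why TEACH still satisfies 3NF.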

Achieving the BCNF by Decomposition (2)

• Three possible decompositions for relation TEACH

– {student, instructor} and {student, course}

– {course, instructor} and {course, student}

– {instructor, course} and {instructor, student}

• All three decompositions will lose d1.

– We have to settle for sacrificing the functional dependency preservation. But we cannot sacrifice the nonadditivity property after decomposition.

• Out of the above three, only the 3rd decomposition will not generate spurious tuples after join (and hence has the non-additivity property).

• A test to determine whether a binary decomposition (decomposition into two relations) is nonadditive (lossless) is discussed in section 11.1.4 under Property LJ1. Verify that the third decomposition above meets the property.
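
The slide does not restate the test, so the sketch below assumes the standard textbook form for a binary decomposition: {R1, R2} is nonadditive if the common attributes R1 ∩ R2 functionally determine either R1 − R2 or R2 − R1. The function names are hypothetical; the FDs are the two given for TEACH.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary nonadditive-join test (assumed form of the property named above)."""
    r1, r2 = set(r1), set(r2)
    common = closure(r1 & r2, fds)
    return (r1 - r2) <= common or (r2 - r1) <= common


fds = [
    ({"student", "course"}, {"instructor"}),
    ({"instructor"}, {"course"}),
]
decompositions = [
    ({"student", "instructor"}, {"student", "course"}),
    ({"course", "instructor"}, {"course", "student"}),
    ({"instructor", "course"}, {"instructor", "student"}),
]
results = [is_lossless(r1, r2, fds) for r1, r2 in decompositions]
print(results)
```

Only the third decomposition passes: its common attribute, instructor, determines course, which is exactly the attribute that the first relation adds.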

Chapter 6

The Relational Algebra and Calculus

Chapter Outline

• Relational Algebra

– Unary Relational Operations

– Relational Algebra Operations From Set Theory

– Binary Relational Operations

– Additional Relational Operations

– Examples of Queries in Relational Algebra

• Relational Calculus

– Tuple Relational Calculus

– Domain Relational Calculus

• Example Database Application (COMPANY)


• Overview of the QBE language (appendix D)

Relational Algebra Overview

• Relational algebra is a procedural query language, which takes instances of relations as input and yields instances of relations as output.

• Relational algebra is the basic set of operations for the relational model

• These operations enable a user to specify basic retrieval requests (or queries)

• The result of an operation is a new relation, which may have been formed from one or more input relations

– This property makes the algebra “closed” (all objects in relational algebra are relations)

Relational Algebra Overview (cont.…)

• The algebra operations thus produce new relations

– These can be further manipulated using operations of the same algebra

• A sequence of relational algebra operations forms a relational algebra expression

– The result of a relational algebra expression is also a relation that represents the result of a database query (or retrieval request)

Brief History of Origins of Algebra

• Muhammad ibn Musa al-Khwarizmi (800-847 CE) wrote a book titled al-jabr about arithmetic of variables

– Book was translated into Latin.

– Its title (al-jabr) gave Algebra its name.

• Al-Khwarizmi called variables “shay”

– “Shay” is Arabic for “thing”.

– Spanish transliterated “shay” as “xay” (“x” was “sh” in Spain).

– In time this word was abbreviated as x.

• Where does the word Algorithm come from?

– Algorithm originates from “al-Khwarizmi”


– Reference: PBS (http://www.pbs.org/empires/islam/innoalgebra.html)

Relational Algebra Operations

• Relational Algebra consists of several groups of operations

– Unary Relational Operations

• SELECT (symbol: σ (sigma))

• PROJECT (symbol: π (pi))

• RENAME (symbol: ρ (rho))

– Relational Algebra Operations From Set Theory

• UNION ( ∪ ), INTERSECTION ( ∩ ), DIFFERENCE (or MINUS, – )

• CARTESIAN PRODUCT ( × )

Relational Algebra Operations

• Relational Algebra consists of several groups of operations

– Binary Relational Operations

• JOIN (several variations of JOIN exist)

• DIVISION

– Additional Relational Operations

• OUTER JOINS, OUTER UNION

• AGGREGATE FUNCTIONS (These compute a summary of information: for example, SUM, COUNT, AVG, MIN, MAX)

Unary Relational Operations: SELECT

• The SELECT operation (denoted by σ (sigma)) is used to select a subset of the tuples from a relation based on a selection condition.

– The selection condition acts as a filter

– Keeps only those tuples that satisfy the qualifying condition

– Tuples satisfying the condition are selected whereas the other tuples are discarded (filtered out)

• Examples:

– Select the EMPLOYEE tuples whose department number is 4:

• σ DNO = 4 (EMPLOYEE)

– Select the EMPLOYEE tuples whose salary is greater than $30,000:

• σ SALARY > 30,000 (EMPLOYEE)

Unary Relational Operations: SELECT

• In general, the select operation is denoted by σ<selection condition>(R) where

– the symbol σ (sigma) is used to denote the select operator

– the selection condition is a Boolean (conditional) expression specified on the attributes of relation R

– tuples that make the condition true are selected

• appear in the result of the operation

– tuples that make the condition false are filtered out

• discarded from the result of the operation


Unary Relational Operations: SELECT (cont...)

• SELECT Operation Properties

– The SELECT operation σ <selection condition>(R) produces a relation S that has the same schema (same attributes) as R

– SELECT is commutative:

• σ <condition1>(σ <condition2>(R)) = σ <condition2>(σ <condition1>(R))

– Because of the commutativity property, a cascade (sequence) of SELECT operations may be applied in any order:

• σ <cond1>(σ <cond2>(σ <cond3>(R))) = σ <cond2>(σ <cond3>(σ <cond1>(R)))

– A cascade of SELECT operations may be replaced by a single selection with a conjunction of all the conditions:

• σ <cond1>(σ <cond2>(σ <cond3>(R))) = σ <cond1> AND <cond2> AND <cond3>(R)

– The number of tuples in the result of a SELECT is less than (or equal to) the number of tuples in the input relation R
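
The SELECT properties can be checked on a small example. The following is a minimal sketch, assuming a relation is modeled as a Python set of tuples; the sample rows and the helper name `select` are illustrative and not part of the slides.

```python
# Hypothetical sample data: each tuple is (SSN, DNO, SALARY).
EMPLOYEE = {
    ("111", 4, 25000),
    ("222", 4, 43000),
    ("333", 5, 38000),
    ("444", 5, 30000),
}

def select(relation, condition):
    """sigma: keep only the tuples for which condition(t) is true."""
    return {t for t in relation if condition(t)}

def in_dept_4(t):
    return t[1] == 4

def high_paid(t):
    return t[2] > 30000

# Commutativity: the order of two selections does not matter.
a = select(select(EMPLOYEE, in_dept_4), high_paid)
b = select(select(EMPLOYEE, high_paid), in_dept_4)

# Cascade: two selections equal one selection with the conditions ANDed.
c = select(EMPLOYEE, lambda t: in_dept_4(t) and high_paid(t))

print(a == b == c)              # True
print(len(a) <= len(EMPLOYEE))  # True
```

Because the result is built from the input tuples unchanged, it has the same schema as the input and can never contain more tuples than the input.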

Unary Relational Operations: PROJECT


The PROJECT operation is denoted by π (pi)

• This operation keeps certain columns (attributes) from a relation and discards the other columns.

– PROJECT creates a vertical partitioning

• The list of specified columns (attributes) is kept in each tuple

• The other attributes in each tuple are discarded

• Example: To list each employee’s first and last name and salary, the following is used:

– π LNAME, FNAME, SALARY (EMPLOYEE)

Unary Relational Operations: PROJECT (cont.)

• The general form of the project operation is:

– π <attribute list>(R)

– π (pi) is the symbol used to represent the project operation

– <attribute list> is the desired list of attributes from relation R.

• The project operation removes any duplicate tuples

– This is because the result of the project operation must be a set of tuples

• Mathematical sets do not allow duplicate elements.

PROJECT Operation Properties

– The number of tuples in the result of the projection π <list>(R) is always less than or equal to the number of tuples in R

• If the list of attributes includes a key of R, then the number of tuples in the result of PROJECT is equal to the number of tuples in R

– PROJECT is not commutative

• π <list1>(π <list2>(R)) = π <list1>(R) as long as <list2> contains the attributes in <list1>
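
Duplicate elimination and the role of a key can be seen in a short example. This is a sketch under the assumption that a relation is a Python set of tuples with a fixed attribute order; the rows and the helper `project` are made up for illustration.

```python
# Hypothetical sample data; attribute order is (SSN, FNAME, LNAME, SALARY).
ATTRS = ("SSN", "FNAME", "LNAME", "SALARY")
EMPLOYEE = {
    ("111", "John", "Smith", 30000),
    ("222", "Alicia", "Zelaya", 25000),
    ("333", "Ahmad", "Jabbar", 25000),
}

def project(relation, attrs, keep):
    """pi: keep only the listed attributes; the set removes duplicates."""
    positions = [attrs.index(name) for name in keep]
    return {tuple(t[i] for i in positions) for t in relation}

salaries = project(EMPLOYEE, ATTRS, ["SALARY"])
with_key = project(EMPLOYEE, ATTRS, ["SSN", "SALARY"])

print(len(salaries))  # 2: the two 25000 rows collapse into one tuple
print(len(with_key))  # 3: SSN is a key, so no tuples are lost
```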

Examples of applying SELECT and PROJECT operations

Relational Algebra Expressions – Single and intermediate result relations

• We may want to apply several relational algebra operations one after the other

– Either we can write the operations as a single relational algebra expression by nesting the operations, or

– We can apply one operation at a time and create intermediate result relations.

• In the latter case, we must give names to the relations that hold the intermediate results.

Single expression versus sequence of relational operations (Example)

• To retrieve the first name, last name, and salary of all employees who work in department number 5, we must apply a select and a project operation

• We can write a single relational algebra expression as follows:

– π FNAME, LNAME, SALARY (σ DNO=5 (EMPLOYEE))

• OR we can explicitly show the sequence of operations, giving a name to each intermediate relation:

– DEP5_EMPS ← σ DNO=5 (EMPLOYEE)

– RESULT ← π FNAME, LNAME, SALARY (DEP5_EMPS)

Unary Relational Operations: RENAME


• The RENAME operator is denoted by ρ (rho)

• In some cases, we may want to rename the attributes of a relation or the relation name or both

– Useful when a query requires multiple operations

– Necessary in some cases (see JOIN operation later)

Unary Relational Operations: RENAME (cont...)

• The general RENAME operation ρ can be expressed by any of the following forms:

– ρ S (B1, B2, …, Bn)(R) changes both:

• the relation name to S, and

• the column (attribute) names to B1, B2, …, Bn

– ρ S(R) changes:

• the relation name only to S

– ρ (B1, B2, …, Bn)(R) changes:

• the column (attribute) names only to B1, B2, …, Bn

Unary Relational Operations: RENAME (contd.)

• For convenience, we also use a shorthand for renaming attributes in an intermediate relation:

– If we write:

• RESULT ← π FNAME, LNAME, SALARY (DEP5_EMPS)

• RESULT keeps the attribute names FNAME, LNAME, and SALARY as they appear in DEP5_EMPS (which inherits them from EMPLOYEE)

• If we write:

• RESULT (F, M, L, S, B, A, SX, SAL, SU, DNO) ← DEP5_EMPS

• The 10 attributes of DEP5_EMPS are renamed to F, M, L, S, B, A, SX, SAL, SU, DNO, respectively

Example of applying multiple operations and RENAME

Relational Algebra Operations from Set Theory: UNION

• UNION Operation

– Binary operation, denoted by ∪

– The result of R ∪ S is a relation that includes all tuples that are either in R or in S or in both R and S

– Duplicate tuples are eliminated

– The two operand relations R and S must be “type compatible” (or UNION compatible)

• R and S must have the same number of attributes

• Each pair of corresponding attributes must be type compatible (have the same or compatible domains)

Relational Algebra Operations from Set Theory: UNION

• Example:

– To retrieve the social security numbers of all employees who either work in department 5 (RESULT1 below) or directly supervise an employee who works in department 5 (RESULT2 below)

– We can use the UNION operation as follows:

DEP5_EMPS ← σ DNO=5 (EMPLOYEE)

RESULT1 ← π SSN (DEP5_EMPS)

RESULT2(SSN) ← π SUPERSSN (DEP5_EMPS)

RESULT ← RESULT1 ∪ RESULT2

– The union operation produces the tuples that are in either RESULT1 or RESULT2 or both

Example of the result of a UNION operation

Relational Algebra Operations from Set Theory

• Type compatibility of operands is required for the binary set operation UNION ∪ (also for INTERSECTION ∩ and SET DIFFERENCE –, see next slides)

• R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) are type compatible if:

– they have the same number of attributes, and

– the domains of corresponding attributes are type compatible (i.e. dom(Ai) = dom(Bi) for i = 1, 2, ..., n).

• The resulting relation for R1 ∪ R2 (also for R1 ∩ R2, or R1 – R2, see next slides) has the same attribute names as the first operand relation R1 (by convention)

Relational Algebra Operations from Set Theory: INTERSECTION

• INTERSECTION is denoted by ∩

• The result of the operation R ∩ S is a relation that includes all tuples that are in both R and S

– The attribute names in the result will be the same as the attribute names in R

• The two operand relations R and S must be “type compatible”

Relational Algebra Operations from Set Theory: SET DIFFERENCE

• SET DIFFERENCE (also called MINUS or EXCEPT) is denoted by –

• The result of R – S is a relation that includes all tuples that are in R but not in S

– The attribute names in the result will be the same as the attribute names in R

• The two operand relations R and S must be “type compatible”

Example to illustrate the result of UNION, INTERSECT, and DIFFERENCE (figure: STUDENT and INSTRUCTOR relations)
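
The three set operations map directly onto Python's built-in set operators. The sketch below uses invented STUDENT and INSTRUCTOR rows, since the figure's data is not reproduced in the text; both relations share one (FNAME, LNAME) schema, so they are type compatible.

```python
# Hypothetical sample data: both relations have one (FNAME, LNAME) schema,
# so they are type compatible.
STUDENT = {("Susan", "Yao"), ("Ramesh", "Shah"), ("Amy", "Ford")}
INSTRUCTOR = {("John", "Smith"), ("Susan", "Yao"), ("Ramesh", "Shah")}

union = STUDENT | INSTRUCTOR         # in STUDENT or INSTRUCTOR or both
intersection = STUDENT & INSTRUCTOR  # in both
difference = STUDENT - INSTRUCTOR    # in STUDENT but not in INSTRUCTOR

print(len(union))            # 4: duplicates are eliminated
print(sorted(intersection))  # [('Ramesh', 'Shah'), ('Susan', 'Yao')]
print(sorted(difference))    # [('Amy', 'Ford')]
```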

Relational Algebra Operations from Set Theory: CARTESIAN PRODUCT

• CARTESIAN (or CROSS) PRODUCT Operation

– This operation is used to combine tuples from two relations in a combinatorial fashion.

– Denoted by R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm)

– Result is a relation Q with degree n + m attributes:

• Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order.

– The resulting relation state has one tuple for each combination of tuples: one from R and one from S.

– Hence, if R has nR tuples (denoted as |R| = nR), and S has nS tuples, then R x S will have nR * nS tuples.

– The two operands do NOT have to be “type compatible”


Relational Algebra Operations from Set Theory: CARTESIAN PRODUCT (cont.)

• Generally, CROSS PRODUCT is not a meaningful operation on its own

– Can become meaningful when followed by other operations

• Example (not meaningful):

– FEMALE_EMPS ← σ SEX=’F’ (EMPLOYEE)

– EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS)

– EMP_DEPENDENTS ← EMPNAMES x DEPENDENT

• EMP_DEPENDENTS will contain every combination of EMPNAMES and DEPENDENT

– whether or not they are actually related

Relational Algebra Operations from Set Theory: CARTESIAN PRODUCT (cont.) - Example

Binary Relational Operations: JOIN


• JOIN Operation (denoted by ⋈)

– The sequence of CARTESIAN PRODUCT followed by SELECT is used quite commonly to identify and select related tuples from two relations

– A special operation, called JOIN, combines this sequence into a single operation

– This operation is very important for any relational database with more than a single relation, because it allows us to combine related tuples from various relations

– The general form of a join operation on two relations R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is:

R ⋈ <join condition> S

– where R and S can be any relations that result from general relational algebra expressions.

Binary Relational Operations: JOIN (cont...)

• Example: Suppose that we want to retrieve the name of the manager of each department.

– To get the manager’s name, we need to combine each DEPARTMENT tuple with the EMPLOYEE tuple whose SSN value matches the MGRSSN value in the department tuple.

– We do this by using the join operation.

– DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE

• MGRSSN=SSN is the join condition

– Combines each department record with the employee who manages the department

– The join condition can also be specified as DEPARTMENT.MGRSSN = EMPLOYEE.SSN

Example of applying the JOIN operation
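
A JOIN is equivalent to a CARTESIAN PRODUCT followed by a SELECT on the join condition. The following sketch spells out those two steps for the DEPT_MGR example; the rows are invented, and each relation is reduced to three attributes to keep the example short.

```python
from itertools import product

# Hypothetical sample data.
# DEPARTMENT tuples: (DNAME, DNUMBER, MGRSSN)
DEPARTMENT = {("Research", 5, "333"), ("Administration", 4, "987")}
# EMPLOYEE tuples: (FNAME, LNAME, SSN)
EMPLOYEE = {
    ("Franklin", "Wong", "333"),
    ("Jennifer", "Wallace", "987"),
    ("John", "Smith", "123"),
}

# CARTESIAN PRODUCT: every DEPARTMENT tuple paired with every EMPLOYEE tuple.
cross = {d + e for d, e in product(DEPARTMENT, EMPLOYEE)}

# SELECT on the join condition MGRSSN = SSN (positions 2 and 5).
dept_mgr = {t for t in cross if t[2] == t[5]}

print(len(cross))     # 6 = 2 * 3
print(len(dept_mgr))  # 2: one manager per department
```

A real DBMS does not materialize the full product; the sketch only illustrates the definition.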

Binary Relational Operations: EQUIJOIN

EQUIJOIN Operation

• The most common use of join involves join conditions with equality comparisons only

• Such a join, where the only comparison operator used is =, is called an EQUIJOIN.

– In the result of an EQUIJOIN we always have one or more pairs of attributes (whose names need not be identical) that have identical values in every tuple.

– The JOIN seen in the previous example was an EQUIJOIN.

Binary Relational Operations: NATURAL JOIN Operation

NATURAL JOIN Operation

– Another variation of JOIN, called NATURAL JOIN and denoted by *, was created to get rid of the second (superfluous) attribute in an EQUIJOIN condition.

• because one of each pair of attributes with identical values is superfluous

– The standard definition of natural join requires that the two join attributes, or each pair of corresponding join attributes, have the same name in both relations

– If this is not the case, a renaming operation is applied first.

Binary Relational Operations: NATURAL JOIN Operation (cont…)

• Example: To apply a natural join on the DNUMBER attributes of DEPARTMENT and DEPT_LOCATIONS, it is sufficient to write:

– DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS

• The only attribute with the same name is DNUMBER

• An implicit join condition is created based on this attribute:

– DEPARTMENT.DNUMBER = DEPT_LOCATIONS.DNUMBER

• Another example: Q ← R(A,B,C,D) * S(C,D,E)

– The implicit join condition includes each pair of attributes with the same name, “AND”ed together:

• R.C = S.C AND R.D = S.D

– Result keeps only one attribute of each such pair:

• Q(A,B,C,D,E)

Example of NATURAL JOIN operation
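
The R(A,B,C,D) * S(C,D,E) example can be traced with a small sketch. It assumes each tuple is a Python dict keyed by attribute name, so the shared attributes can be found by name; the values and the helper `natural_join` are illustrative only.

```python
# Hypothetical sample data; each tuple is a dict from attribute name to value.
R = [
    {"A": 1, "B": 2, "C": 3, "D": 4},
    {"A": 5, "B": 6, "C": 7, "D": 8},
]
S = [
    {"C": 3, "D": 4, "E": 9},
    {"C": 3, "D": 0, "E": 10},
]

def natural_join(r, s):
    """R * S: match on every attribute name the two relations share."""
    result = []
    for tr in r:
        for ts in s:
            shared = tr.keys() & ts.keys()
            if all(tr[name] == ts[name] for name in shared):
                merged = {**tr, **ts}  # shared attributes appear only once
                if merged not in result:
                    result.append(merged)
    return result

Q = natural_join(R, S)
print(Q)  # [{'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 9}]
```

Only the pair that agrees on both C and D survives, and the result has the five attributes A, B, C, D, E.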

Binary Relational Operations: DIVISION

• DIVISION Operation

– The division operation is applied to two relations

– R(Z) ÷ S(X), where X ⊆ Z. Let Y = Z – X (and hence Z = X ∪ Y); that is, let Y be the set of attributes of R that are not attributes of S.

– The result of DIVISION is a relation T(Y) that includes a tuple t if tuples tR appear in R with tR[Y] = t, and with

• tR[X] = tS for every tuple tS in S.

– For a tuple t to appear in the result T of the DIVISION, the values in t must appear in R in combination with every tuple in S.

Example of DIVISION
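
DIVISION answers "for all" questions, such as which employees work on every project in a given list. The sketch below assumes R has schema (ESSN, PNO) and S has schema (PNO); these schemas, the rows, and the helper `divide` are assumptions made for illustration.

```python
# Hypothetical sample data.
# R(ESSN, PNO): which employee works on which project.
R = {("111", 1), ("111", 2), ("222", 1), ("333", 1), ("333", 2), ("333", 3)}
# S(PNO): the projects of interest.
S = {(1,), (2,)}

def divide(r, s):
    """R(Y, X) / S(X): Y-values paired in R with every tuple of S."""
    candidates = {t[0] for t in r}  # all Y-values that occur in R
    return {
        (y,)
        for y in candidates
        if all((y,) + x in r for x in s)
    }

T = divide(R, S)
print(sorted(T))  # [('111',), ('333',)]
```

Employee 222 is excluded because the pair ("222", 2) does not appear in R.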

Recap of Relational Algebra Operations

Additional Relational Operations: Aggregate Functions and Grouping

• A type of request that cannot be expressed in the basic relational algebra is to specify mathematical aggregate functions on collections of values from the database.

• Examples of such functions include retrieving the average or total salary of all employees or the total number of employee tuples.

– These functions are used in simple statistical queries that summarize information from the database tuples.

• Common functions applied to collections of numeric values include

– SUM, AVERAGE, MAXIMUM, and MINIMUM.

• The COUNT function is used for counting tuples or values.

Aggregate Function Operation

• Use of the Aggregate Function operation ℱ

– ℱ MAX Salary (EMPLOYEE) retrieves the maximum Salary value from the EMPLOYEE relation

– ℱ MIN Salary (EMPLOYEE) retrieves the minimum Salary value from the EMPLOYEE relation

– ℱ SUM Salary (EMPLOYEE) retrieves the sum of the Salary values from the EMPLOYEE relation

– ℱ COUNT SSN, AVERAGE Salary (EMPLOYEE) computes the count (number) of employees and their average salary

• Note: COUNT just counts the number of rows, without removing duplicates

Examples of Queries in Relational Algebra


• Q1: Retrieve the name and address of all employees who work for the ‘Research’ department.

RESEARCH_DEPT ← σ DNAME=’Research’ (DEPARTMENT)

RESEARCH_EMPS ← (RESEARCH_DEPT ⋈ DNUMBER=DNO EMPLOYEE)

RESULT ← π FNAME, LNAME, ADDRESS (RESEARCH_EMPS)

• Q6: Retrieve the names of employees who have no dependents.

ALL_EMPS ← π SSN (EMPLOYEE)

EMPS_WITH_DEPS(SSN) ← π ESSN (DEPENDENT)

EMPS_WITHOUT_DEPS ← (ALL_EMPS – EMPS_WITH_DEPS)

RESULT ← π LNAME, FNAME (EMPS_WITHOUT_DEPS * EMPLOYEE)

Relational Calculus

• A relational calculus expression creates a new relation, which is specified in terms of variables that range over rows of the stored database relations (in tuple calculus) or over columns of the stored relations (in domain calculus).

• In a calculus expression, there is no order of operations to specify how to retrieve the query result; a calculus expression specifies only what information the result should contain.

– This is the main distinguishing feature between relational algebra and relational calculus.

– Relational calculus is considered to be a nonprocedural language.

Relational Calculus

• This differs from relational algebra, where we must write a sequence of operations to specify a retrieval request; hence relational algebra can be considered a procedural way of stating a query.

Tuple Relational Calculus

• The tuple relational calculus is based on specifying a number of tuple variables.

• Each tuple variable usually ranges over a particular database relation, meaning that the variable may take as its value any individual tuple from that relation.

• A simple tuple relational calculus query is of the form

{t | COND(t)}

– where t is a tuple variable and COND(t) is a conditional expression involving t.

– The result of such a query is the set of all tuples t that satisfy COND(t).

Tuple Relational Calculus

• Example: To find the first and last names of all employees whose salary is above $50,000, we can write the following tuple calculus expression:

{t.FNAME, t.LNAME | EMPLOYEE(t) AND t.SALARY>50000}

• The condition EMPLOYEE(t) specifies that the range relation of tuple variable t is EMPLOYEE.

• The first and last name (PROJECTION π FNAME, LNAME) of each EMPLOYEE tuple t that satisfies the condition t.SALARY>50000 (SELECTION σ SALARY>50000) will be retrieved.
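
A tuple calculus expression reads almost like a set comprehension: it states which tuples belong to the result without saying how to find them. The sketch below mirrors the salary example on invented rows; the `Employee` named tuple is an assumption used only to allow the `t.FNAME` notation.

```python
from collections import namedtuple

# Hypothetical sample data.
Employee = namedtuple("Employee", ["FNAME", "LNAME", "SALARY"])
EMPLOYEE = {
    Employee("James", "Borg", 55000),
    Employee("John", "Smith", 30000),
    Employee("Franklin", "Wong", 60000),
}

# {t.FNAME, t.LNAME | EMPLOYEE(t) AND t.SALARY > 50000}
result = {(t.FNAME, t.LNAME) for t in EMPLOYEE if t.SALARY > 50000}

print(sorted(result))  # [('Franklin', 'Wong'), ('James', 'Borg')]
```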

Relational Algebra vs Relational Calculus

Relational Algebra

Relational Algebra is a procedural language. In Relational Algebra, the order in which the operations have to be performed is specified. In Relational Algebra, frameworks are created to implement the queries. The basic operations included in relational algebra are:

1. Select (σ)

2. Project (Π)

3. Union (∪)

4. Set Difference (-)

5. Cartesian product (X)

6. Rename (ρ)

Relational Algebra vs Relational Calculus

Relational Calculus

Relational Calculus is a formal query language. It is also known as a declarative language.

In Relational Calculus, the order in which the operations have to be performed is not specified.

Relational Calculus states what result we have to obtain.

• Relational Calculus is denoted as: { t | P(t) }

Where t is a tuple variable and P is the condition that must be true for each tuple in the result.

• Relational Calculus has two variations:

1. Tuple Relational Calculus (TRC)

2. Domain Relational Calculus (DRC)

Structured Query Language – The Basics

What We’re Going to Cover

Overview of SQL (This may be review for some of you)

◼ Data Definition Language

Creating tables (we’ll just talk about this)

◼ Data Manipulation Language

Inserting/Updating/Deleting data

Retrieving data

▪ Single table queries

▪ Where

▪ Joins

▪ Grouping

SQL

SQL is a data manipulation language.

SQL is not a programming language.

SQL commands are interpreted by the DBMS engine.

SQL commands can be used interactively as a query language within the DBMS.

SQL commands can be embedded within programming languages.

3 Types of SQL Commands

Data Definition Language (DDL):

◼ Commands that define a database - Create, Alter, Drop

Data Manipulation Language (DML)

◼ Commands that maintain and query a database.

Data Control Language (DCL)

◼ Commands that control a database, including administering privileges and committing data.

What are the DDL commands in DBMS?

Data definition language (DDL) is a language that allows the user to define the data and their relationship to other types of data.

Data definition language statements work with the structure of the database table.

◼ Various data types used in defining columns in a database table

◼ Integrity and value constraints

◼ Viewing, modifying and removing a table structure

DDL Commands

The Data Definition Language (DDL) commands are as follows:

◼ Create − It is used to create a new table or a new database.

◼ Alter − It is used to alter or change the structure of the database table.

◼ Drop − It is used to delete a table, index, or view from the database.

◼ Truncate − It is used to delete the records or data from the table, but its structure remains as it is.

◼ Rename − It is used to rename an object in the database.

DDL Commands

When you create a table, you have to specify the following:

◼ Table name.

◼ Name of each column.

◼ Data type of each column.

◼ Size of each column.

Data types

When a table is created, each column in the table is assigned a data type.

Some of the important data types are as follows:

◼ Varchar2

◼ Char

◼ Number

DDL Commands: Syntax

CREATE

Create Database Syntax –

Create database database_name

Create table Syntax –

CREATE TABLE table_name

( column_1 datatype, column_2 datatype, column_3 datatype,

....

);

DDL Commands: Syntax

ALTER
This command is used to add, delete or change columns in the existing table.

Syntax – Syntax to add a column to an existing table.

ALTER TABLE table_name

ADD column_name datatype;

DROP

This command is used to remove an existing table along with its structure from the Database.

Syntax: Syntax to drop an existing table.

DROP TABLE table_name;

DDL Commands with example: Create

It is used to create a new table or a new database.

An example of the create database command

◼ create database cs2;

An example of the create table command

◼ create table student(stdname varchar(20), branch varchar(20), college varchar(20), age int, telephone int, address varchar(20));

DDL Commands with example: Alter and Drop

Alter

It is used to alter or change the structure of the database table

◼ ALTER TABLE student ADD birthdate DATETIME;

Drop

It is used to delete a table, index, or view from the database.

◼ DROP TABLE student;

Data Manipulation Language (DML)

Four basic commands:

INSERT

UPDATE

DELETE

SELECT

Create table course

Create table course(course_code varchar(20), course_name varchar(30), credit_hours int);

Inserting Data into a Table

INSERT INTO tablename (column-list)

VALUES (value-list)

PUTS ONE ROW INTO A TABLE

INSERT INTO COURSE

(COURSE_CODE, COURSE_NAME, CREDIT_HOURS)

VALUES ('MIS499', 'ADVANCED ORACLE', 4);

More on Inserting Data

INSERT INTO COURSE

VALUES ('MIS499', 'ADVANCED ORACLE', 4);

COLUMN LIST IS OPTIONAL IF YOU PLAN TO INSERT A VALUE IN EVERY COLUMN AND IN THE SAME ORDER AS IN THE TABLE

INSERT INTO COURSE

(COURSE_NAME, COURSE_CODE, CREDIT_HOURS)

VALUES ('ADVANCED ORACLE', 'MIS499', 4);

COLUMN LIST IS NEEDED TO CHANGE THE ORDER - MUST MATCH VALUE LIST

NOTE - TABLE STILL HAS THE ORIGINAL COLUMN ORDER

Inserting Null Data

INSERT INTO COURSE

(COURSE_CODE, CREDIT_HOURS)

VALUES ('MIS499', 4);

COLUMN LIST IS NEEDED IF YOU PLAN TO LEAVE OUT A VALUE IN THE VALUE LIST

INSERT INTO COURSE

VALUES ('MIS499', '', 4);

COLUMN LIST CAN BE OMITTED IF YOU PUT IN A BLANK VALUE

INSERT INTO COURSE

VALUES ('MIS499', NULL, 4);

THE NULL KEYWORD CAN BE USED TO CREATE A BLANK COLUMN

ALL OF THESE ASSUME THAT THE DATABASE ALLOWS THE COLUMN TO BE NULL. YOU CANNOT LEAVE PRIMARY KEY COLUMNS BLANK; FOREIGN KEY COLUMNS CAN BE BLANK ONLY IF THEY ARE NOT DECLARED NOT NULL

Deleting Data

DELETE FROM COURSE; deletes all rows

Be careful!! This deletes ALL of the rows in your table. If you use this command in error, you can use ROLLBACK to undo the changes, as long as they have not been committed.

DELETE FROM COURSE WHERE COURSE_CODE = 'MIS220'; deletes specific rows (more typical)

DELETE FROM COURSE WHERE CREDIT_HOURS = 4; deletes a group of rows

DELETE FROM COURSE WHERE CREDIT_HOURS < 4;

Updating Data

UPDATE COURSE SET CREDIT_HOURS = 5; CHANGES EVERY ROW

UPDATE COURSE SET CREDIT_HOURS = 5 WHERE COURSE_CODE = 'MIS220'; CHANGES ONE ROW (MORE TYPICAL)

UPDATE COURSE SET CREDIT_HOURS = 3 WHERE COURSE_CODE LIKE 'MIS%'; CHANGES A GROUP OF ROWS


Rollback and Commit

• Changes to data are temporary during your SQL session until you COMMIT them; ROLLBACK undoes uncommitted changes

• Does not apply to changes in database structure - alter...

Applies to

• Inserts,

• Updates, and

• Deletes
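
The effect of ROLLBACK on an uncommitted DELETE can be tried out with Python's standard `sqlite3` module. This is only a sketch: SQLite stands in for whatever DBMS the course uses, and transaction details (for example, whether DDL can be rolled back) differ between systems.

```python
import sqlite3

# In-memory database; sqlite3 is only a stand-in for the course's DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (course_code VARCHAR(20), "
    "course_name VARCHAR(30), credit_hours INT)"
)
conn.execute("INSERT INTO course VALUES ('MIS499', 'ADVANCED ORACLE', 4)")
conn.commit()  # make the inserted row permanent

conn.execute("DELETE FROM course")  # deletes ALL rows, not yet committed
after_delete = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]

conn.rollback()  # undo the uncommitted DELETE
after_rollback = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]

print(after_delete, after_rollback)  # 0 1
```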

SQL for Retrieving Data from One Table

SELECT column_name, column_name, …

FROM table_name

WHERE condition/criteria;

This statement will retrieve the specified field values for all rows in the specified table that meet the specified conditions.

Every SELECT statement returns a recordset.

Conceptual Evaluation Strategy

The semantics of an SQL query are defined in terms of the following conceptual evaluation strategy:

◼ Compute the cross-product of relation-list.

◼ Discard resulting tuples if they fail qualifications.

◼ Delete attributes that are not in target-list.

◼ If DISTINCT is specified, eliminate duplicate rows.

This strategy is probably the least efficient way to compute a query! An optimizer will find more efficient strategies to compute the same answers.

WHERE Conditions

SELECT * FROM COURSE

WHERE COURSE_CODE LIKE 'MIS%';

USE % TO SUBSTITUTE FOR ANY STRING

SELECT * FROM COURSE

WHERE CREDIT_HOURS BETWEEN 3 AND 5;

3 AND 5 ARE INCLUDED

SELECT * FROM CUSTOMER

WHERE BALANCE < CREDIT_LIMIT;

YOU CAN COMPARE TWO COLUMNS

Customer Table

Create table customer(

Customer_name varchar(20),

Balance float,

Credit_limit float,

State_in varchar(10)

);

insert into customer

values('Ayele Gobeze', 200.50, 50, 'OH');

More WHERE Conditions

SELECT * FROM CUSTOMER

WHERE STATE_IN IN ('OH', 'WV', 'KY');

LIST OF SPECIFIC VALUES TO LOOK FOR

SELECT * FROM CUSTOMER

WHERE (CREDIT_LIMIT - BALANCE) < 1000;

CAN MANIPULATE NUMBER VALUES MATHEMATICALLY

AND/OR/NOT Conditions

SELECT * FROM CUSTOMER

WHERE BALANCE >= 500 AND BALANCE <= 1000;

TWO COMPARISONS ON THE SAME COLUMN

SELECT * FROM CUSTOMER

WHERE STATE_IN = 'OH'

OR CREDIT_LIMIT > 1000;

TWO COMPARISONS ON DIFFERENT COLUMNS

SELECT * FROM CUSTOMER WHERE NOT (STATE_IN = 'OH');

SAME AS STATE_IN <> 'OH'

More on AND/OR/NOT

SELECT * FROM CUSTOMER

WHERE STATE_IN = 'OH'

OR (CREDIT_LIMIT = 1000 AND BALANCE < 500);

CUST STATE LIMIT BAL

A OH 1000 600

B WV 1000 200

C OH 500 300

D OH 1000 200

E KY 1300 800

F KY 1000 700

G MA 200 100

H NB 1000 100

Use parentheses to make complex logic more understandable.

Who will be selected??
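
One way to check the answer is to load the eight rows into a scratch database and run the query. The sketch below uses Python's standard `sqlite3` module; the column names `cust`, `state`, `credit_limit`, and `balance` are shorthand taken from the table header above and are assumptions, not the schema of the CUSTOMER table defined earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names are assumed from the slide's table header.
conn.execute(
    "CREATE TABLE customer (cust TEXT, state TEXT, "
    "credit_limit REAL, balance REAL)"
)
rows = [
    ("A", "OH", 1000, 600), ("B", "WV", 1000, 200),
    ("C", "OH", 500, 300), ("D", "OH", 1000, 200),
    ("E", "KY", 1300, 800), ("F", "KY", 1000, 700),
    ("G", "MA", 200, 100), ("H", "NB", 1000, 100),
]
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", rows)

selected = [
    r[0]
    for r in conn.execute(
        "SELECT cust FROM customer "
        "WHERE state = 'OH' "
        "OR (credit_limit = 1000 AND balance < 500) "
        "ORDER BY cust"
    )
]
print(selected)  # ['A', 'B', 'C', 'D', 'H']
```

A, C, and D qualify because they are in OH; B and H qualify through the parenthesized condition (limit of 1000 and balance under 500).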


SQL - Primary Key

• A primary key is a field in a table which uniquely identifies each row/record in a database table. Primary keys must contain unique values.

• A primary key column cannot have NULL values.

• A table can have only one primary key, which may consist of single or multiple fields.

• When multiple fields are used as a primary key, they are called a composite key.

• If a table has a primary key defined on any field(s), then you cannot have two records having the same value of that field(s).

Create Primary Key

• Here is the syntax to define the ID attribute as a primary key in a CUSTOMERS table

• To create a PRIMARY KEY constraint on the "ID" column when the CUSTOMERS table already exists, use the following SQL syntax −

ALTER TABLE CUSTOMERS ADD PRIMARY KEY (ID);

Create Primary Key

• For defining a PRIMARY KEY constraint on multiple columns, use the SQL syntax given below.

• To create a PRIMARY KEY constraint on the "ID" and "NAME" columns when the CUSTOMERS table already exists, use the following SQL syntax.

ALTER TABLE CUSTOMERS

ADD CONSTRAINT PK_CUSTID PRIMARY KEY (ID, NAME);

Delete Primary Key

• You can clear the primary key constraints from the table with the syntax given below.

ALTER TABLE CUSTOMERS DROP PRIMARY KEY;

Foreign Key

• A foreign key is a key used to link two tables together. This is sometimes also called a referencing key.

• A Foreign Key is a column or a combination of columns whose values match a Primary Key in a different table.

• The relationship between 2 tables matches the Primary Key in one of the tables with a Foreign Key in the second table.

• If a table has a primary key defined on any field(s), then you cannot have two records having the same value of that field(s).

Foreign Key - Example

ALTER TABLE ORDERS

ADD FOREIGN KEY (Customer_ID) REFERENCES CUSTOMERS (ID);

DROP a FOREIGN KEY Constraint

• To drop a FOREIGN KEY constraint, use the following SQL syntax, where constraint_name is the name the constraint was given when it was created.

ALTER TABLE ORDERS

DROP FOREIGN KEY constraint_name;

SQL - Other Features

* - All columns in a table

Aliases

◼ SELECT EmployeeID, LastName, FirstName, BirthDate AS DOB FROM Employee;

◼ SELECT EmployeeID, LastName, FirstName FROM Employee AS E;

Dot Notation - ambiguous attribute names

◼ SELECT Customer.LName, E.LName FROM Customer, Employee AS E WHERE ...

SQL - Other Features

DISTINCT

Arithmetic operators: +, -, *, /
Comparison operators: =, >, >=, <, <=, <>

Concatenation operator: ||

Substring comparisons: %, _

BETWEEN

AND, OR

SQL - Other Features

ORDER BY Clause

UNION, EXCEPT, INTERSECT

IN

SQL for Retrieving Data from Two or More Tables

SQL provides two ways to retrieve data from related tables:

Join - When two or more tables are joined by a common field.

Subqueries - When one Select command is nested within another command.

SQL - Joins

Joins:

The WHERE clause is used to specify the common field.

For every relationship among the tables in the FROM clause, you need one WHERE condition (2 tables - 1 join, 3 tables - 2 joins…)

SELECT C.Cust_ID, Comp_Name, Country, OrderID

FROM Customer AS C, Orders AS O

WHERE C.Cust_ID = O.Cust_ID

AND Country = 'USA';


SQL concepts to Cover

SQL

◼ Order by

◼ Group by

◼ Distinct keyword

Advanced SQL

◼ Constraints

◼ Using Joins

◼ Using Views

◼ Indexes

◼ Union Clause

Customers Table
Orders Table

SQL Order by

The SQL ORDER BY clause is used to sort the data in ascending or descending order, based on one or more columns. Some databases sort the query results in ascending order by default.

Syntax

SELECT column-list

FROM table_name

[WHERE condition]

[ORDER BY column1, column2, .. columnN] [ASC | DESC];

You can use more than one column in the ORDER BY clause. Make sure that whatever column you are using to sort is in the column-list.

SQL Order by

Example: The following code block has an example, which would sort the result in ascending order by NAME and SALARY

SELECT * FROM CUSTOMERS

ORDER BY NAME, SALARY;

Example 2: The following code block has an example, which would sort the result in descending order by NAME.

SELECT * FROM CUSTOMERS

ORDER BY NAME DESC;

SQL Group by

The SQL GROUP BY clause is used in collaboration with the SELECT statement to arrange identical data into groups.

The GROUP BY clause follows the WHERE clause in a SELECT statement and precedes the ORDER BY clause.

Syntax

SELECT column1, column2

FROM table_name

WHERE [ conditions ]

GROUP BY column1, column2

ORDER BY column1, column2

SQL Group by

In SQL, we use the GROUP BY clause to group rows based on the value of columns.

Example:

◼ If you want to know the total amount of the salary of each customer, then the GROUP BY query would be as follows.

SELECT NAME, SUM(SALARY) FROM CUSTOMERS

GROUP BY NAME;
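
The grouping query can be run against a few sample rows to see how rows with the same NAME are combined. This sketch uses Python's standard `sqlite3` module and invented data (the contents of the slides' CUSTOMERS table are not reproduced in the text); an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INT, name TEXT, salary REAL)")
# Hypothetical rows: two customers share the name 'Ramesh'.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ramesh", 2000), (2, "Ramesh", 1500), (3, "Kaushik", 2000)],
)

totals = conn.execute(
    "SELECT name, SUM(salary) FROM customers "
    "GROUP BY name ORDER BY name"
).fetchall()
print(totals)  # [('Kaushik', 2000.0), ('Ramesh', 3500.0)]
```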

SQL Distinct Keyword

The SQL DISTINCT keyword is used in conjunction with the SELECT statement to eliminate all the duplicate records and fetch only unique records.

Syntax:

SELECT DISTINCT column1, column2,.....columnN

FROM table_name

WHERE [condition]

SQL Distinct Keyword

Example:

SELECT DISTINCT SALARY FROM CUSTOMERS

ORDER BY SALARY;

SQL – Using Joins

The SQL JOIN clause is used to combine records from two or more tables in a database.

A JOIN is a means for combining fields from two tables by using values common to each.

Several operators can be used to join tables, such as =, <, >, <>, <=, >=, !=, BETWEEN, LIKE, and NOT. However, the most common operator is the equal to symbol.

SQL – Using Joins

There are different types of joins available in SQL −

◼ INNER JOIN − returns rows when there is a match in both tables.

◼ LEFT JOIN − returns all rows from the left table, even if there are no matches in the right table.

◼ RIGHT JOIN − returns all rows from the right table, even if there are no matches in the left table.

◼ FULL JOIN − returns all rows from both tables, matching them where possible and filling in NULLs where there is no match.

◼ SELF JOIN − is used to join a table to itself as if the table were two tables, temporarily renaming at least one table in the SQL statement.

◼ CARTESIAN JOIN − returns the Cartesian product of the sets of records from the two or more joined tables.
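
The difference between INNER JOIN and LEFT JOIN shows up as soon as one customer has no orders. The following sketch uses Python's standard `sqlite3` module with invented tables and rows; the column names `customer_id` and `order_id` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INT, name TEXT);
    CREATE TABLE orders (order_id INT, customer_id INT);
    INSERT INTO customers VALUES (1, 'Ramesh'), (2, 'Khilan'), (3, 'Kaushik');
    INSERT INTO orders VALUES (100, 3), (101, 3), (102, 2);
    """
)

# INNER JOIN: only customers that have at least one matching order.
inner = conn.execute(
    "SELECT c.name, o.order_id FROM customers AS c "
    "INNER JOIN orders AS o ON c.id = o.customer_id "
    "ORDER BY c.id, o.order_id"
).fetchall()

# LEFT JOIN: every customer, with NULL (None) where no order matches.
left = conn.execute(
    "SELECT c.name, o.order_id FROM customers AS c "
    "LEFT JOIN orders AS o ON c.id = o.customer_id "
    "ORDER BY c.id, o.order_id"
).fetchall()

print(inner)  # [('Khilan', 102), ('Kaushik', 100), ('Kaushik', 101)]
print(left)   # [('Ramesh', None), ('Khilan', 102), ('Kaushik', 100), ('Kaushik', 101)]
```

Ramesh has no orders, so he is missing from the inner join but appears in the left join with a NULL order_id.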

SQL – Using Joins

Example:

SELECT customers.id, customers.name, customers.age, customers.salary

FROM CUSTOMERS, ORDERS

WHERE CUSTOMERS.ID = ORDERS.CUSTOMER_ID;

SQL Joins – Inner Join

Syntax:

SELECT column_name(s)

FROM table1

INNER JOIN table2

ON table1.column_name = table2.column_name;

Example: INNER JOIN

SELECT Orders.ID, Customers.Name

FROM Orders

INNER JOIN Customers ON Orders.Customer_ID = Customers.ID;

Inner Join

The INNER JOIN keyword selects records that have matching values in both tables.

Inner Join - Example

SELECT ProductID, ProductName, CategoryName

FROM Products

INNER JOIN Categories ON Products.CategoryID = Categories.CategoryID;

SQL - Using Views

A view is nothing more than a SQL statement that is stored in the database with an associated name.

A view is actually a composition of a table in the form of a predefined SQL query.

A view can contain all rows of a table or select rows from a table.

A view can be created from one or many tables, depending on the SQL query written to create the view.

Views are a type of virtual table. They allow users to:

◼ Structure data in a way that users or classes of users find natural or intuitive.

◼ Restrict access to the data in such a way that a user can see and (sometimes) modify exactly what they need and no more.

◼ Summarize data from various tables which can be used to generate reports.

SQL - Creating Views

Database views are created using the CREATE VIEW statement.

Views can be created from a single table, multiple tables or another view. You can also drop views, insert rows into views, and delete rows from views.

The basic CREATE VIEW syntax is as follows

CREATE VIEW view_name AS

SELECT column1, column2.....

FROM table_name

WHERE [condition];

SQL - Creating Views

CREATE VIEW CUSTOMERS_VIEW AS

SELECT name, age

FROM CUSTOMERS;

Now, you can query CUSTOMERS_VIEW in a similar way as you query an actual table. Following is an example for the same.

SELECT * FROM CUSTOMERS_VIEW;

You can also update views

UPDATE CUSTOMERS_VIEW

SET AGE = 35

WHERE name = 'Ramesh';

SQL - Indexes
Indexes are special lookup tables that the database search engine can use to
speed up data retrieval.

Simply put, an index is a pointer to data in a table.

An index helps to speed up SELECT queries and WHERE clauses, but it slows
down data input, with the UPDATE and the INSERT statements.

Indexes can be created or dropped with no e ect on the data. Creating an index
involves the CREATE INDEX statement,

◼ which allows you to name the index,

◼ to speci y the table and which column or columns to index, and

◼ to indicate whether the index is in an ascending or descending order.

SQL - The CREATE INDEX Command

The basic syntax o a CREATE INDEX:

CREATE INDEX index_name ON table_name;

Single-Column Indexes: A single-column index is created based on only one table


column.
CREATE INDEX index_name

ON table_name (column_name);

Composite Indexes: A composite index is an index on two or more columns o a


table.

CREATE INDEX index_name on table_name (column1, column2);

SQL - Indexes

Implicit Indexes: are indexes that are automatically created by the database
server when an ob ect is created. Indexes are automatically created or primary key
constraints and unique constraints.

Unique Indexes: Used not only for performance, but also for data integrity. A unique index does not allow duplicate values to be inserted into the indexed column or columns.

CREATE UNIQUE INDEX index_name ON table_name (column_name);

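The DROP INDEX Command: An index can be dropped using the SQL DROP INDEX command. The exact syntax varies by database system (MySQL, for example, also requires ON table_name).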

DROP INDEX index_name;
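
The integrity role of a unique index, and the effect of dropping it, can be sketched as follows. The example uses Python's built-in sqlite3 module; the members table and the index name are hypothetical. In SQLite a violated unique index raises sqlite3.IntegrityError; other database drivers report the error differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_members_email ON members (email)")
conn.execute("INSERT INTO members (email) VALUES ('a@example.com')")

# A second row with the same email violates the unique index.
try:
    conn.execute("INSERT INTO members (email) VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Dropping the index leaves the data untouched but removes the constraint.
conn.execute("DROP INDEX idx_members_email")
conn.execute("INSERT INTO members (email) VALUES ('a@example.com')")
row_count = conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]
print(duplicate_rejected, row_count)
```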

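SQL – Create Index

An index can also be defined inline when the table is created (MySQL syntax; inline INDEX clauses are not part of standard SQL):

CREATE TABLE Bookstore2 (
    ISBN_NO VARCHAR(15) NOT NULL PRIMARY KEY,
    SHORT_DESC VARCHAR(100),
    AUTHOR VARCHAR(40),
    PUBLISHER VARCHAR(40),
    PRICE FLOAT,
    INDEX SHORT_DESC_IND (SHORT_DESC, PUBLISHER)
);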

SQL – Indexing strategy guidelines

Poorly designed SQL indexes and a lack of them are primary sources of database and
application performance issues. Here are a few indexing strategies that should be
considered when indexing tables:

Avoid indexing heavily modified tables/columns – The more indexes a table has, the
bigger the effect on the performance of INSERT, UPDATE, DELETE, and MERGE
statements, because all indexes must be modified accordingly. This means that SQL
Server has to split pages and move data around, and it has to do that for every
index affected by those DML statements.

Use narrow index keys whenever possible – Keep indexes narrow, that is, with as
few columns as possible. Exact numeric keys (e.g. integers) are the most efficient SQL
index keys. These keys require less disk space and maintenance overhead.

Use clustered indexes on unique columns – Consider columns that are unique or
contain many distinct values, and avoid clustered indexes on columns that undergo frequent changes.

Nonclustered indexes on columns that are frequently searched and/or joined on –
Ensure that nonclustered indexes are put on foreign keys and on columns frequently used
in search conditions, such as a WHERE clause that returns exact matches.

Use covering indexes for big performance gains – Improvements are attained when
the index holds all columns referenced by the query, so the query can be answered from the index alone.
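
The covering-index guideline can be observed directly in a query plan. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, so it only illustrates the general idea: the orders table and index name are hypothetical, and the wording of the plan text is SQLite-specific and may vary between versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL, note TEXT)"
)
conn.execute("CREATE INDEX idx_orders_customer_total ON orders (customer_id, total)")


def plan(sql):
    """Return SQLite's query plan for sql as one line of text."""
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)


# Both referenced columns are in the index, so the table itself is not read.
covered = plan("SELECT total FROM orders WHERE customer_id = 7")
# note is not in the index, so SQLite must also visit the table rows.
not_covered = plan("SELECT note FROM orders WHERE customer_id = 7")

print(covered)
print(not_covered)
```

In the first plan SQLite reports a covering index; in the second it still uses the index to find matching rows but has to fetch note from the table.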

SQL - Union Clause

The SQL UNION clause/operator is used to combine the results of two or more
SELECT statements without returning any duplicate rows.

To use UNION, each SELECT statement must have:

◼ The same number of columns selected

◼ The same number of column expressions

◼ Compatible data types in corresponding columns

◼ The columns in the same order

The columns need not be of the same length, however.


Syntax: The basic syntax of a UNION clause is as follows:

SELECT column1 [, column2 ]
FROM table1 [, table2 ]
[WHERE condition]
UNION
SELECT column1 [, column2 ]
FROM table1 [, table2 ]
[WHERE condition]
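
A small worked example may make the duplicate-removal behavior concrete. The sketch below uses Python's built-in sqlite3 module; the two customer tables and their rows are made up for illustration. UNION ALL, which keeps duplicates, is included only as a contrast and is not covered in the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two hypothetical tables with the same column layout.
conn.execute("CREATE TABLE customers_2023 (name TEXT, city TEXT)")
conn.execute("CREATE TABLE customers_2024 (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers_2023 VALUES (?, ?)",
    [("Ramesh", "Delhi"), ("Khilan", "Mumbai")],
)
conn.executemany(
    "INSERT INTO customers_2024 VALUES (?, ?)",
    [("Ramesh", "Delhi"), ("Hardik", "Bhopal")],
)

# UNION removes the duplicate ('Ramesh', 'Delhi') row.
union_rows = conn.execute(
    "SELECT name, city FROM customers_2023 "
    "UNION "
    "SELECT name, city FROM customers_2024 "
    "ORDER BY name"
).fetchall()

# UNION ALL keeps every row from both SELECT statements.
union_all_rows = conn.execute(
    "SELECT name, city FROM customers_2023 "
    "UNION ALL "
    "SELECT name, city FROM customers_2024 "
    "ORDER BY name"
).fetchall()

print(union_rows)
print(len(union_all_rows))
```

The ORDER BY applies to the combined result; without it, the order of rows returned by UNION is not guaranteed.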
