0% found this document useful (0 votes)
43 views18 pages

W2 DBMS

The document discusses the relational model and SQL. It covers topics like attributes, schemas, keys, relational operators, and the history and basics of SQL. It provides examples and explanations of concepts like relations, tuples, domains, selection, projection, joins, and aggregate functions.

Uploaded by

tarakhmehta874
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
43 views18 pages

W2 DBMS

The document discusses the relational model and SQL. It covers topics like attributes, schemas, keys, relational operators, and the history and basics of SQL. It provides examples and explanations of concepts like relations, tuples, domains, selection, projection, joins, and aggregate functions.

Uploaded by

tarakhmehta874
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 18

DBMS

WEEK 2
LEC 1: Introduction to Relational Model (Pt.1)

• Attribute
o Attribute Types:
▪ Consider
Students = Roll #, First Name, Last Name, DoB, Passport #, Aadhaar #,
Department relation
▪ The set of allowed values for each attribute is called the domain of
the attribute
• Roll #: Alphanumeric string
• First Name, Last Name: Alpha String
• DoB: Date
• Passport #: String (Letter followed by 7 digits) – nullable
(optional)
• Aadhaar #: 12-digit number
• Department: Alpha String
▪ Attribute values are (normally) required to be atomic; that is
indivisible
▪ The special value null is a member of every domain. This indicates
that the value is unknown
▪ The null value may cause complications in the definition of many
operations
• Schema and instance
o Relation Schema and Instance
▪ A1, A2, …, An are attributes
▪ R = (A1, A2, …, An) is a relation schema. Example: instructor = (ID, name,
dept_name, salary)
Formally, given sets D1, D2, …Dn a relation r is a subset of D1 x D2 x … x Dn . Thus, a relation is a
set of n-tuples (a1, a2, …, an) where each ai ∈ Di
▪ The current values (relation instance) of a relation are specified by a
table
▪ An element t of r is a tuple, represented by a row in a table
Example: instructor ≡ (String (5) x String x String x Number+), where ID ∈ String, name ∈
String, dept_name ∈ String, and salary ∈ Number+
▪ Relations are unordered with unique tuples
• Order of tuples/rows is irrelevant (tuples may be stored in
arbitrary order)
• No two tuples/rows may be identical
• Keys
o Let K ⊆ R, where R is the set of attributes in the relation
o K is the superkey of R if values for K are sufficient to identify a unique tuple
of each possible relation r(R)
▪ Example: {ID} and {ID, name} are both superkeys of instructor
o Superkey K is a candidate key if K is minimal
▪ Example: {ID} is a candidate key for instructor
o One of the candidate keys is selected to be the primary key
o A surrogate key (or synthetic key) is a database is a unique identifier for
either an entity in modeled world or an object in the database
▪ Not derived from application data, unlike a natural (or business) key
which is derived from application data
o Example:
▪ Students = Roll #, First Name, Last Name, DoB, Passport #, Aadhaar #,
Department
▪ Super Key: Roll #, {Roll #, DoB}
▪ Candidate Keys: Roll #, {First Name, Last Name}, Aadhaar #
• Passport # cannot be a key as it is nullable
▪ Primary Key: Roll #
• Can Aadhaar # be a primary key? It may suffice for unique
identification. But Roll # may have additional useful info
▪ Secondary Key/ Alternate Key: {First Name, Last Name}, Aadhar #
▪ Simple Key: Consists of a single attribute.
▪ Composite Key: {First Name, Last Name}
• Consists of more than one attribute to uniquely identify an
entity occurrence
• One or more of the attributes, which make up the key are not
simple keys in their own right
▪ Foreign Key constraint: Value in one relation must appear in another
• Referencing relation
o Enrolment: Foreign keys – Roll #, Course #
• Referenced relation
o Students, Courses
▪ A compound key consists of more than one attribute to uniquely
identify an entity occurrence
• Each attribute, which makes up the key, is a simple key in its
own right
• {Roll #, Course #}

• Schema Diagram for university database

• Relational Query Languages


o Procedural viz-a-viz Non-procedural or Declarative paradigms
▪ Procedural programming requires that the programmer tell the
computer what to do
• That is, how to get the output for the range of required inputs
• The programmer must know an appropriate algorithm
▪ Declarative programming requires a more descriptive style
• The programmer must know what relationships hold between
various entities
o “Pure” languages:
▪ Relational algebra
▪ Tuple relational calculus
▪ Domain relational calculus
o All 3 pure languages are equivalent in computing power

LEC 2: Introduction to Relational Model (Pt.2)


• Relational Operators
o Basic Properties of Relations
▪ A relation is set. Hence,
• Ordering of rows/tuples is inconsequential

• All rows/tuples must be distinct

o Select Operation – selection of rows (tuples)


▪ Relation r

• σ A=B ∧ D>5 (r)


o Project Operation – selection of columns (Attributes)
▪ Relation r

▪ π A, C (r)

o Union of two relations


▪ Relation r, s

▪ r∪s
o Set difference of two relations
▪ Relation r, s

▪ r–s

o Set intersection of two relations


▪ Relation r, s

▪ r∩s
▪ r ∩ s = r – (r – s)
o Joining 2 relations – Cartesian Product
▪ Relation r, s

▪ rxs

▪ r = πR (r x s) ; s = πS (r x s) where R = (A, B) and S = (C, D, E)


▪ Naming issue: Between the 2 relations, there are one or more
columns with the same name
• Name them r.B and s.B (B is the repeated column)

• rxs
o Renaming a table
▪ Allows us to refer to a relation (say E) by more than one name ρX (E)
returns the expression E under the name X
▪ Relation r

▪ r x ρs (r)

o Composition of operations
▪ Can build expressions using multiple operations
▪ E.g.: σA=C (r x s)
• rxs

• σA=C (r x s)

o Joining two relations – Natural join


▪ Let r and s be relations on schemas R and S respectively. Then, the
natural join of relations R and S is a relation on schema R U S
obtained as follows:
• Consider each pair of tuples tr from r and ts from s
• If tr and ts have the same value on each of the attributes in R
∩ S, add a tuple t to the result, where
o t has the same value as tr on r
o t has the same value as ts on s
▪ Example:
• Relations r, s:

• Natural Join: r ⋈ s

ΠA,r.B,C,r.D,E (σr.B=s.B ∧ r.D=s.D (r x s))

• Aggregate Operators
o Can we compute:
▪ SUM
• E.g.: SUMB (σB > 2(r))
▪ AVG
▪ MAX
▪ MIN
• Notes about Relational Languages
o Each query input is a table (or set of tables)
o Each query output is a table
o All data in the output table appears in one of the input tables
o Relational algebra is not Turing complete (a system is called Turing complete
if this system is able to recognize or decide other data-manipulation rule
sets)
• Summary of Relational Operators
LEC 3: Introduction to SQL (Pt.1)
• History of Query Language
o IBM developed Structured English Query Language (SEQUEL) as a part of
System R project. Renamed Structured Query Language (SQL)
o There aren’t any alternatives to SQL for speaking to relational databases
(that is, SQL as a protocol), but there are many alternatives to writing SQL in
the applications
o These alternatives have been implemented in the form of frontends for
working with relational databases. Examples:
▪ SchemeQL and CLSQL, which are probably the most flexible, owing to
their Lisp heritage, they look a lot more like SQL than other frontends
▪ LINQ (in .Net)
▪ ScalaQL and ScalaQuery (in Scala)
▪ SqlStatement, ActiveRecord and many others in Ruby
▪ HaskellDB
▪ …
o There are several query languages that are derived or inspired from SQL. Of
them most effective is SPARQL (SPARQL Protocol and RDF Query Language)
▪ RDL – Resource Description Language
▪ Standardized by W3C Consortium as key technology of semantic web
▪ Versions:
• SPARQL 1.0
• SPARQL 1.1
▪ Used as query languages for several NoSQL systems – particularly the
Graph Databases that use RDF as store
• Data Definition Language (DDL)
o SQL DDL allows the specification of info about relations, including:
▪ The Schema for each relation
▪ The Domain of values associated with each attribute
▪ Integrity Constraints
▪ Also,
• The set of Indices to be maintained for each relation
• Security and Authorization information for each relation
• The Physical Storage Structure of each relation on disk
o Domain Types in SQL
▪ char(n) – Fixed length character string, with user-specified length n
▪ varchar(n) – Variable length character strings, with user-specified
maximum length n
▪ int – Integer (a finite subset of the integers that is machine-
independent)
▪ smallint(n) – Small integer (a machine independent subset of the
integer domain type)
▪ numeric(p,d) – Fixed point number, with user-specified precision of p
digits, with d digits to the right of decimal point. (ex., numeric (3,1)
allows 44.5 to be stored exactly, but not 444.5 or 0.32)
▪ real, double precision – Floating point and double-precision floating
point numbers, with machine-dependent precision
▪ float(n) – Floating point number, with user-specified precision of at
least n digits
o Create Table Construct
▪ An SQL relation is defined using the create table command:
create table r (A1D1, A2D2, …, AnDn),
(integrity-constraint1),

(integrity-constraintk));

• r is the name of the relation


• each Ai is an attribute name in the schema of relation r
• Di is the data type of values in the domain of attribute Ai
create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),

salary numeric(8,2));

• Create table construct – Integrity constraints


o not null
o primary key (A1, …, Am)
o foreign key (Am, …, An) references r

create table instructor (

ID char(5),

name varchar(20) not null,

dept_name varchar(20),

salary numeric(8,2),

primary key (ID),

foreign key (dept_name) references department));

o primary key declaration on an attribute automatically ensures not


null
o Update tables
▪ Insert (DML command)
• insert into instructor values (‘10211’, ‘Smith’, ‘Biology’,
66000);
▪ Delete (DML command)
• Remove all tuples from the student relation
o delete from student
▪ Drop Table (DDL command)
• drop table r
▪ Alter (DDL command)
• alter table r add A D
o Where A is the name of the attribute to be added to
relation r and D is the domain of A
o All existing tuples in the relation are assigned null as
the value for the new attribute
• alter table r drop A
o Where A is the name of an attribute of relation r
o Dropping of attributes not supported by many
databases
• Data Manipulation Language (DML): Query Structure
o Basic Query Structure
▪ A typical SQL query has the form:

select A1, A2, …, An ,


from r1, r2, …, rm
where P
• Ai represents an attribute from ri ‘s
• ri represents a relation
• P is a predicate

The result of an SQL query is a relation

o Select Clause
• The select clause lists the attributes desired in the result of a
query
▪ Corresponds to the projection operation of the relational
algebra
• Example: find the names of all instructors:

select name,

from instructor

• NOTE: SQL names are case insensitive


o Name ≡ NAME ≡ name
• SQL allows duplicates in relations as well as in query results
• To force the elimination of duplicates, insert the keyword distinct
after select
o To find the dept names of all instructors and remove
duplicates:
▪ select distinct dept_name
from instructor
• The keyword all specifies that duplicates should not be removed
o select all dept_name
from instructor
• An asterisk in the select cause denotes all attributes
o select *
from instructor
• An attribute can be a literal with no from clause
o select ‘437’
o Results is a table with one column and a single row with
value ‘437’
o Can give the column a name using:
▪ select ‘A’ as FOO
• An attribute can be a literal with from clause
▪ select ‘A’
from instructor

o Result is a table with one column and N rows (number of


tuples in the instructor table
• The select clause can also contain arithmetic expressions involving
the operation, +, -, * and / and operating on constants and
attributes of tuples
o The query:
▪ select ID, name, salary/12
from instructor
o Would return a relation that is the same as the instructor
relation, except that the value of the attribute salary is
divided by 12 (will be replaced)
o Can rename “salary/12” using the as clause:
▪ select ID, name, salary/12 as monthly_salary
o Where clause
• Specifies conditions that the result must satisfy
▪ Corresponds to the selection predicate of the relational
algebra
• To find all instructors in Comp. Sci. dept
▪ select name
from instructor
where dept_name = ‘Comp. Sci.’
• Comparison results can be combined using the logical connectives
and, or and not
• Comparisons can be applied to results of arithmetic expressions
o From clause
• The from clause lists the relations involved in the query
▪ Corresponds to the Cartesian product operation of the
relational algebra
• Find the Cartesian product instructor X teaches
▪ select *
from instructor, teaches
▪ Generates every possible instructor-teaches pair, with all
attributes from both relations
▪ For common attributes (for example, ID), the attributes in
the resulting table are renamed using the relation’s name
(for example, instructor.ID)
• Cartesian product not very useful directly, but useful combined
with where-clause condition (selection operation in relational
algebra)
LEC 4: Introduction to SQL (Pt.2)
• Additional Basic Operations
o Cartesian Product
o Rename AS Operation
▪ The SQL allows renaming relations and attributes using the as clause:
old_name as new_name
▪ Keyword as is optional and may be omitted
o String Operations
▪ SQL includes a string-matching operator for comparisons on
character strings. The operator like uses patterns that are described
using two special characters:
• percent (%) The % character matches any substring
• underscore ( _ ) The _ character matches any character
▪ Find the names of all instructors whose name includes the substring
“dar”
select name
from instructor
where name like ‘%dar%’
▪ Match the string “100%”
like ‘100%’ escape ‘\’
in that above we use backlash (\) as the escape character
▪ Patterns are case sensitive
▪ Examples:
o ‘Intro%’ matches any string beginning with “Intro”
o '%Comp%’ matches any string containing “Comp” as a
substring
o ‘_ _ _’ matches any string of exactly 3 characters
o ‘_ _ _ %’ matches any string of at least 3 characters
▪ SQL supports a variety of string operations such as
o Concatenation (using “||”)
o Converting from upper to lower case (and vice versa)
o Finding string length, extracting substrings, etc
o Ordering the Display of Tuples
▪ We may specify desc for descending order or asc for ascending order,
for each attribute; ascending order is the default
▪ Can sort on multiple attributes
o Selecting Number of tuples in output
▪ The Select Top clause is used to specify the number of records to
return
▪ The Select Top clause is useful on large tables with thousands of
records. Returning a large number of records can impact
performance
select top 10 distinct name
from instructor
▪ Not all database systems support the SELECT TOP clause
• SQL Server & MS Access support select top
• MySQL supports the limit clause
• Oracle uses fetch first n rows only and rownum
select distinct name
from instructor
order by name
fetch first 10 rows only
o Where Clause Predicates
▪ SQL includes a between comparison operator
▪ Tuple comparison
▪ In Operator
• The in operator allows user to specify multiple values in a
where clause
• The in operator is a shorthand for multiple or conditions
o Duplicates
▪ In relations with duplicates, SQL can define how many copies of
tuples appear in the result
▪ Multiset versions of some of the relational algebra operators – given
multiset relations r1 and r2:
• σθ(r1): If there are c1 copies of tuple t1 in r1 and t1 satisfies
selections σθ, then there are copies of t1 in σθ(r1)
• πA (r): For each copy of tuple t1 in r1, there is a copy of tuple
πA (t1) in πA (r1) where πA (t1) denotes the projection of the
single tuple t1
• r1 x r2: If there are c1 copies of tuple t1 in r1 and c2 copies of
tuple t2 in r2, there are c1 x c2 copies of tuple t1.t2 in r1 x r2

LEC 5: Introduction to SQL (Pt.3)


• Set Operations
o Set operations union, intersect and except
▪ Each of the above operations automatically eliminates duplicates
o To retain all duplicates use the corresponding multiset versions union all,
intersect all and except all
o Suppose a tuple occurs m times in r and n times in s, then it occurs:
▪ m + n times in r union all s
▪ min (m, n) times in r intersect all s
▪ max (0, m - n) times in r except all s
• Null Values
o It is possible for tuples to have a null value, denoted by null, for some of
their attributes
o null signifies an unknown value or that a value doesn’t exist
o The result of any arithmetic expressions involving null is null
o The predicate is null can be used to check for null values
o It is not possible to test for null values with comparison operators, such as =,
<, or <>. We need to use the is null and is not null operators instead
• Three Valued Logic
o Three values – true, false, unknown
o Any comparison with null returns unknown
▪ E.g: 5 < null or null <> null or null = null return unknown
o Three-valued logic using the value unknown:
▪ OR: (unknown or true) = true
(unknown or false) = unknown
(unknown or unknown) = unknown
▪ AND: (true and unknown) = unknown
(false and unknown) = false
(unknown and unknown) = unknown
▪ NOT: (not unknown) = unknown
▪ “P is unknown” evaluates to true if predicate P evaluates to unknown
o Result of where clause predicate is treated as false if it evaluates to
unknown
• Aggregate Functions
o These functions operate on the multiset of values of a column of a relation
and return a value
▪ avg
▪ min
▪ max
▪ sum
▪ count
o Group By
▪ Attributes in select clause outside of aggregate functions must
appear in group by list
• /* erroneous query */
select dept_name, ID, avg(salary)
from instructor
group by dept_name;
o Having clause
▪ Properties are specified using having clause (eg: having avg(salary) >
42000)
▪ Predicates in the having clause are applied after the formation of
groups whereas predicates in the where clause are applied before
forming groups
o Null Values and Aggregates
▪ Total all salaries
select sum(salary)
from instructor;
• Above statement ignores null amounts
• Result in null if there is no non-null amount
▪ All aggregate operations except count(*) ignore tuples with null
values on the aggregated attributes
▪ What if collection has only null values?
• Count returns 0
• All other aggregates return null

You might also like