Chapter 1 Query Processing

Chapter 1 discusses query processing and optimization in database management systems, detailing the transformation of high-level SQL queries into efficient execution plans through various phases including syntax checking, query decomposition, optimization, and execution. It emphasizes the importance of minimizing resource usage during query execution and introduces techniques for query optimization, such as heuristic rules and semantic query optimization. The chapter also outlines the roles of query analyzers, semantic analyzers, and the overall aim of achieving efficient data retrieval.

Uploaded by

shumawakjira26
Copyright
© © All Rights Reserved

Advanced Database system 2024

Chapter 1
Query Processing and Optimization
Introduction
In this chapter we discuss the techniques used by a DBMS to process, optimize, and
execute high-level queries.
We also discuss the techniques used to split complex queries into multiple simple
operations and the methods of implementing these low-level operations.
Query optimization techniques are used to choose an efficient execution plan that
minimizes the runtime as well as the use of other resources, such as the number of
disk I/Os and CPU time.
What is Query Processing?
Query processing is the procedure of transforming a high-level SQL query into a correct
and efficient execution plan expressed in a low-level language.
When a database system receives a query for update or retrieval of information, the
query goes through a series of compilation steps that produce an execution plan.
These steps are organized into the following phases.
1. The first phase is the syntax checking phase: the system parses the query and
checks whether it follows the syntax rules. It then matches the objects in the
query against the views, tables, and columns listed in the system tables. This
phase is divided into three steps: scanning, parsing, and validating.
A. Scanner: The scanner identifies the language tokens, such as SQL
keywords, attribute names, and relation names, in the text of the query.
B. Parser: The parser checks the query syntax to determine whether it is
formulated according to the syntax rules of the query language.
C. Validation: The query must be validated by checking that all attribute
and relation names are valid and semantically meaningful names in the
schema of the particular database being queried.

Information Technology Department Page 1


2. In the second phase, the SQL query is translated into an algebraic expression
using various rules. This process of transforming a high-level SQL query
into a relational algebra form is called Query Decomposition. The relational
algebra expression is then passed to the query optimizer.
3. In the third phase, optimization is performed by substituting equivalent
expressions, depending on factors such as the existence of certain database
structures, whether or not a given file is sorted, the presence of different
indexes, and so on. The query optimization module works in tandem with the join
manager module to improve the order in which joins are performed.
At this stage the cost model and several other estimation formulas
are used to rewrite the query.
The modified query is written to utilize system resources so as to achieve
the best performance.
The query optimizer estimates the cost of each candidate action plan and
chooses the best one for execution. The chosen action plan is also called the
execution plan. It is converted into query code that is finally executed by
the runtime database processor.
Query Code Generator: It generates the code to execute the plan.
The runtime database processor has the task of running the query code,
whether in compiled or interpreted mode. If a runtime error results, an
error message is generated by the runtime database processor.

Figure 1: Steps in Processing a High-Level Query

What is the aim of query processing?


To transform a query written in a high-level language, typically SQL, into a correct
and efficient execution strategy expressed in a low-level language (implementing the
relational algebra), and to execute the strategy to retrieve the required data.
Query Analyzer
The syntax analyzer takes the query from the user, parses it into tokens, and analyzes the
tokens (symbols) and their order to make sure they follow the rules of the language grammar.
If an error is found in the query submitted by the user, the query is rejected and an error
code, together with an explanation of why the query was rejected, is returned to the user.
Query Decomposition
In query decomposition, the aims are to transform the high-level query
into a relational algebra query and to check that the query is syntactically and
semantically correct. Query decomposition thus starts with a high-level query
and transforms it into a query graph of low-level operations that satisfy the query.

The SQL query is decomposed into query blocks, which form the basic units that are
translated into low-level operations. Nested queries within a query are therefore
identified as separate query blocks.
The query decomposer goes through five stages of processing for decomposition
into low-level operations and translation into algebraic expressions.

Query Analysis

During the query analysis phase, the query is syntactically analyzed using
programming language compiler techniques (a parser). A syntactically legal query is then
validated, using the system catalog, to ensure that all data objects (relations and
attributes) referred to by the query are defined in the database.
The type specification of the query qualifiers and result is also checked at this stage.
Example: SELECT emp_nm FROM EMPLOYEE WHERE emp_desg > 100;
This query will be rejected because the comparison "> 100" is
incompatible with the data type of emp_desg, which is a variable-length character
string.
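
The validation step described above can be sketched in a few lines of Python. This is only an illustration, not how any particular DBMS implements it: the catalog is a plain dictionary, the column types (and the emp_id column) are assumptions, and the query is passed in already parsed rather than as SQL text.

```python
# Toy system catalog: relation name -> {attribute name: declared type}.
# The column types and the emp_id column are assumptions made for this illustration.
CATALOG = {
    "EMPLOYEE": {"emp_id": int, "emp_nm": str, "emp_desg": str},
}

def validate(relation, projected, predicate):
    """Return a list of error messages; an empty list means the query is accepted.

    predicate is an (attribute, operator, literal) triple, e.g. ("emp_desg", ">", 100).
    """
    if relation not in CATALOG:
        return [f"unknown relation {relation}"]
    attrs = CATALOG[relation]
    errors = [f"unknown attribute {name}" for name in projected if name not in attrs]
    attr, _op, literal = predicate
    if attr not in attrs:
        errors.append(f"unknown attribute {attr}")
    elif not isinstance(literal, attrs[attr]):
        # The literal's type does not match the declared type of the attribute.
        errors.append(
            f"type mismatch: {attr} is {attrs[attr].__name__}, "
            f"literal is {type(literal).__name__}"
        )
    return errors

# The example query compares a character attribute with a number, so it is rejected.
rejected = validate("EMPLOYEE", ["emp_nm"], ("emp_desg", ">", 100))
# Comparing a numeric attribute with a number passes validation.
accepted = validate("EMPLOYEE", ["emp_nm"], ("emp_id", ">", 100))
print(rejected)
print(accepted)
```

A real analyzer would also apply type coercion rules, which this sketch deliberately leaves out.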

At the end of the query analysis phase, the high-level query (SQL) is transformed into
an internal representation that is more suitable for processing. This internal
representation is typically a kind of query tree.
A query tree is a tree data structure that corresponds to a relational algebra expression.
A query tree is also called a relational algebra tree.
Leaf nodes of the tree represent the base input relations of the query.
Internal nodes represent the result of applying an operation of the algebra.
The root of the tree represents the result of the query.
SELECT P.proj_no, P.dept_no, E.name, E.add, E.dob
FROM PROJECT P, DEPARTMENT D, EMPLOYEE E
WHERE P.dept_no = D.d_no AND D.mgr_id = E.emp_id AND
P.proj_loc = 'Mumbai';

The three relations PROJECT, DEPARTMENT, and EMPLOYEE are represented as the
leaf nodes P, D, and E, while the relational algebra operations are represented
by internal tree nodes.
The same SQL query can have many different relational algebra expressions and
hence many different query trees.
The query parser typically generates a standard initial (canonical) query tree.
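
As a rough sketch, the canonical query tree for the PROJECT/DEPARTMENT/EMPLOYEE query can be modelled with a small node class. The class name Node and the helper functions are hypothetical, and the canonical shape assumed here is the usual one: Cartesian products of the leaf relations, then a single selection, then a single projection at the root.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    op: str                       # "relation", "product", "select", or "project"
    arg: str = ""                 # relation name, predicate, or attribute list
    children: List["Node"] = field(default_factory=list)

def leaves(node):
    """Collect the base relations (leaf nodes) from left to right."""
    if not node.children:
        return [node.arg]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found

def render(node, depth=0):
    """Return an indented, one-node-per-line text view of the tree."""
    lines = ["  " * depth + f"{node.op} {node.arg}".rstrip()]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines

# Leaf nodes: the three base relations.
p = Node("relation", "PROJECT")
d = Node("relation", "DEPARTMENT")
e = Node("relation", "EMPLOYEE")
# Internal nodes: products, then the whole WHERE clause as one selection.
product = Node("product", children=[Node("product", children=[p, d]), e])
selection = Node(
    "select",
    "P.dept_no = D.d_no AND D.mgr_id = E.emp_id AND P.proj_loc = 'Mumbai'",
    [product],
)
# Root node: the projection that produces the query result.
root = Node("project", "P.proj_no, P.dept_no, E.name, E.add, E.dob", [selection])

print("\n".join(render(root)))
```

An optimizer would later rewrite this tree, for example by pushing the proj_loc condition down to the PROJECT leaf.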

Query Normalization

The primary goal of normalization is to avoid redundancy. The normalization
phase converts the query into a normalized form that can be more easily manipulated.
In the normalization phase, a set of equivalence rules is applied so that the projection
and selection operations included in the query are simplified to avoid redundancy.
The projection operation corresponds to the SELECT clause of the SQL query, and the
selection operation corresponds to the predicate found in the WHERE clause.

Semantic Analyzer

The objective of this phase of query processing is to reduce the number of predicates.
The semantic analyzer rejects normalized queries that are incorrectly formulated or
contradictory.
A query is incorrectly formulated if some of its components do not contribute to the
generation of the result. This happens, for example, when a join specification is
missing. A query is contradictory if its predicate cannot be satisfied by any tuple in
the relation.
The semantic analyzer examines the relational calculus query (SQL) to make sure it
contains only data objects (tables, columns, views, indexes) that are defined in the
database catalog. It also makes sure that each object in the query is referenced
correctly according to its data type.

Query Simplifier

The objectives of this phase are:
To detect redundant qualifications,
To eliminate common sub-expressions, and
To transform the query into a semantically equivalent but more easily and efficiently
computed form.
Why simplify?
Integrity constraints, view definitions, and access restrictions are commonly introduced
into the graph at this stage of analysis, so the query must be simplified as much as possible.

Integrity constraints define conditions that must hold for all states of the database, so any
query that contradicts an integrity constraint must be void and can be rejected without
accessing the database.
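
One way to picture this check is to treat each comparison on a numeric attribute as an interval and test whether the intervals overlap. The sketch below is a simplification under stated assumptions: it handles a conjunction of comparisons on a single attribute, treats values as real numbers, and the salary constraint in the usage lines is hypothetical.

```python
import math

def bounds(op, value):
    """Translate one comparison into (low, high, low_open, high_open)."""
    if op == ">":
        return (value, math.inf, True, False)
    if op == ">=":
        return (value, math.inf, False, False)
    if op == "<":
        return (-math.inf, value, False, True)
    if op == "<=":
        return (-math.inf, value, False, False)
    if op == "=":
        return (value, value, False, False)
    raise ValueError(f"unsupported operator {op}")

def satisfiable(comparisons):
    """True if some real number satisfies every (op, value) pair in the conjunction."""
    low, high = -math.inf, math.inf
    low_open = high_open = False
    for op, value in comparisons:
        lo, hi, lo_open, hi_open = bounds(op, value)
        # Keep the tightest lower bound seen so far.
        if lo > low or (lo == low and lo_open):
            low, low_open = lo, lo_open
        # Keep the tightest upper bound seen so far.
        if hi < high or (hi == high and hi_open):
            high, high_open = hi, hi_open
    if low < high:
        return True
    # A single point is acceptable only if both ends are closed.
    return low == high and not (low_open or high_open)

constraint = ("<=", 100000)        # hypothetical integrity constraint: salary <= 100000
query_predicate = (">", 150000)    # query asks for salary > 150000
contradictory = not satisfiable([constraint, query_predicate])
print(contradictory)
```

When the combined predicate is unsatisfiable, the result is known to be empty, so no data needs to be read.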

Query Restructuring

In the final stage of query decomposition, the query can be restructured to give a
more efficient implementation. Transformation rules are used to convert one
relational algebra expression into an equivalent form that is more efficient.
The query can now be regarded as a relational algebra program, consisting of a series
of operations on relations.

Query Optimization

The primary goal of query optimization is to choose an efficient execution strategy for
processing a query. The query optimizer attempts to minimize the use of certain
resources (mainly the number of I/Os and CPU time) by selecting the best execution
plan (access plan).
Query optimization starts during the validation phase, when the system checks that the
user has the appropriate privileges. An action plan is then generated to perform the query.

The query optimizer uses the following inputs:
The relational algebra query tree generated by the query simplifier module of the query
decomposer.
Estimation formulas used to determine the cardinality of the intermediate result
tables.
A cost model.
Statistical data from the database catalogue.
The output of the query optimizer is the execution plan in the form of an optimized
relational algebra query.
A query typically has many possible execution strategies, and the process of choosing
a suitable one for processing a query is known as Query Optimization.

The basic issues in Query Optimization

How to use available indexes?
How to use memory to accumulate information and perform intermediate steps such
as sorting?
How to determine the order in which joins should be performed?

Objective of query optimization

The term query optimization does not mean that the chosen execution plan is always the
optimal (best) strategy. It is just a reasonably efficient strategy for executing the
query.
Each decomposed query block of SQL is translated into an equivalent extended
relational algebra expression and then optimized.

Techniques for Query Optimization

1. The first technique is based on Heuristic Rules for ordering the operations in a
query execution strategy.
2. The second technique involves the systematic estimation of the cost of the
different execution strategies and choosing the execution plan with the lowest
cost.
3. The third technique is semantic query optimization, which is used in
combination with the heuristic query transformation rules. It uses constraints
specified on the database schema, such as unique attributes and other more
complex constraints, to modify one query into another query that
is more efficient to execute.

Heuristic Rules

Heuristic rules are used as an optimization technique to modify the internal
representation of a query. Usually, heuristic rules are applied to a query in the form of a
query tree or query graph data structure, to improve its performance.
One of the main heuristic rules is to apply SELECT operations before applying
JOIN or other binary operations. This is because the size of the file resulting
from a binary operation such as JOIN is usually a multiplicative function of the sizes of
the input files.
The main idea is to reduce the size of intermediate results. This includes performing
SELECT operations to reduce the number of tuples and
PROJECT operations to reduce the number of attributes.

SELECT and PROJECT operations reduce the size of a file and hence should be
applied before JOIN or other binary operations. A heuristic query optimizer
transforms the initial (canonical) query tree into a final query tree using equivalence
transformation rules. This final query tree is efficient to execute.
Example of query optimization: Identify all managers who work in a London branch
SQL:
SELECT * FROM Staff s, Branch b WHERE s.branchNo = b.branchNo AND
s.position = 'Manager' AND b.city = 'London';
This query corresponds to the following three equivalent relational algebra expressions
(σ is selection, × is Cartesian product, ⋈ is join):
1. σ (position='Manager') ∧ (city='London') ∧ (Staff.branchNo=Branch.branchNo) (Staff × Branch)
2. σ (position='Manager') ∧ (city='London') (Staff ⋈ Staff.branchNo=Branch.branchNo Branch)
3. [σ position='Manager' (Staff)] ⋈ Staff.branchNo=Branch.branchNo [σ city='London' (Branch)]
Assume:
1000 tuples in Staff.
50 Managers
50 tuples in Branch.
5 London branches
No indexes or sort keys
All temporary results are written back to disk (memory is small)
Tuples are accessed one at a time (not in blocks)
Query 1 (Bad)

Requires (1000+50) disk accesses to read from Staff and Branch relations
Creates temporary relation of Cartesian Product (1000*50) tuples
Requires (1000*50) disk access to read in temporary relation and test predicate
Total Work = (1000+50) + 2*(1000*50) = 101,050 I/O operations

AUWC, School of Technology and Informatics Page 10


Query 2 (Better)

Again requires (1000+50) disk accesses to read from Staff and Branch
Joins Staff and Branch on branchNo, giving 1000 tuples (each staff member works at exactly one branch)
Requires (1000) disk accesses to read in the joined relation and check the predicate
Total Work = (1000+50) + 2*(1000) = 3050 I/O operations
About 33 times fewer I/O operations than Query 1
Query 3 (Best)

Read Staff relation to determine 'Managers' (1000 reads)
Create 50 tuple relation (50 writes)
Read Branch relation to determine 'London' branches (50 reads)
Create 5 tuple relation (5 writes)
Join reduced relations and check predicate (50 + 5 reads)
Total Work = 1000 + 2*(50) + 5 + (50 + 5) = 1160 I/O operations
About 87 times fewer I/O operations than Query 1
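
The three totals can be re-derived with a short calculation. The variable names are chosen for this sketch only; the numbers are the ones listed under the assumptions (1000 Staff tuples, 50 managers, 50 Branch tuples, 5 London branches, one disk access per tuple).

```python
staff, managers = 1000, 50
branches, london = 50, 5

# Query 1: read both relations, write the Cartesian product, read it back to test the predicate.
cost1 = (staff + branches) + 2 * (staff * branches)

# Query 2: read both relations, write the join result (one tuple per staff member), read it back.
cost2 = (staff + branches) + 2 * staff

# Query 3: read and reduce each relation first (reads plus writes), then read the reduced relations.
cost3 = (staff + managers) + (branches + london) + (managers + london)

for name, cost in (("Query 1", cost1), ("Query 2", cost2), ("Query 3", cost3)):
    print(f"{name}: {cost} I/O operations, ratio to Query 1 = {cost1 / cost:.1f}")
```

The ratios show why pushing selections below the join matters: the join in Query 3 reads 55 tuples instead of 50,000.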
Summary of Heuristics for Algebraic Optimization:

1. The main heuristic is to apply first the operations that reduce the size of intermediate
results.
2. Perform select operations as early as possible to reduce the number of tuples and
perform project operations as early as possible to reduce the number of attributes.
(This is done by moving select and project operations as far down the tree as
possible.)
3. The select and join operations that are most restrictive should be executed before
other similar operations. (This is done by reordering the leaf nodes of the tree among
themselves and adjusting the rest of the tree appropriately.)

Chapter 2
Database Security and Authorization

Introduction to Database Security Issues


• In today's society, some information is so important that it must be
protected. For example, disclosure or modification of military information could
endanger national security.
• A good database security management system has to handle the possible database threats.
• A threat may be any situation or event, whether intentional (planned) or accidental, that
may adversely affect a system and consequently the organization.
• Threats to databases may result in the degradation of some or all of the following security goals:
  • Loss of Integrity
    • Only authorized users should be allowed to modify data.
    • For example, students may be allowed to see their grades, but not allowed to modify
    them.
  • Loss of Availability: the database is not available to users who have a
  legitimate right to use the data.
    • Authorized users should not be denied access.
    • For example, an instructor who wishes to change a grade should be
    allowed to do so.
  • Loss of Confidentiality
    • Information should not be disclosed to unauthorized users.
    • For example, a student should not be allowed to look at other students' grades.
Authentication
The users of a database have different access levels and permissions for different data
objects.
Authentication is the process of checking whether the user is really the one who holds the
privilege for a given access level. In practice, the system checks the identity of the user
who is trying to use the resource, typically by means of a username and password.

EPU, Information Technology Department Page 1


Authorization/Privilege
Authorization refers to the process that determines the mode in which a particular
(previously authenticated) client is allowed to access a specific resource controlled by
a server.
Any database access request will have the following three major components.
1. Requested Operation: what kind of operation is requested by a specific query?
2. Requested Object: on which resource or data of the database is the operation sought to
be applied?
3. Requesting User: who is the user requesting the operation on the specified object?
Forms of user authorization
• There are different forms of user authorization on the resources of the database.
• These include:
1. Read Authorization: the user with this privilege is allowed only to read the
content of the data object.
2. Insert Authorization: the user with this privilege is allowed only to insert new
records or items to the data object.
3. Update Authorization: users with this privilege are allowed to modify content
of attributes but are not authorized to delete the records.
4. Delete Authorization: users with this privilege are only allowed to delete a
record and not anything else.
Note: Depending on their level of authority, different users can have one or a combination of the
above forms of authorization on different data objects.

Database Administrator
• The database administrator (DBA) is the central authority for managing a database
system.
• The DBA's responsibilities include:
  • Creating accounts
  • Granting privileges to users who need to use the system
  • Revoking privileges
  • Classifying users and data in accordance with the policy of the organization
Access Protection, User Accounts, and Databases Audits
• Whenever a person or group of persons needs to access a database system, the
individual or group must first apply for a user account.
• The DBA will then create a new account id and password for the user if he/she
believes there is a legitimate need to access the database. The user must log in to the
DBMS by entering the account id and password whenever database access is needed.
• The database system must also keep track of all operations on the database that are
applied by a certain user throughout each login session.
• If any tampering with the database is suspected, a database audit is performed.
• A database audit consists of reviewing the log to examine all accesses and
operations applied to the database during a certain time period.
• A database log that is used mainly for security purposes is sometimes called an audit
trail.
• To protect databases against the possible threats, two kinds of countermeasures can
be implemented:
1. Access control and
2. Encryption

Access Control (AC)


1. Discretionary Access Control (DAC)
• The typical method of enforcing discretionary access control in a database system is
based on the granting and revoking of privileges.
• The granting and revoking of discretionary privileges is described by the
access matrix model, where:
  • The rows of a matrix M represent subjects (users, accounts, programs).
  • The columns represent objects (relations, records, columns, views, operations).
  • Each position M(i,j) in the matrix represents the types of privileges (read, write, update)
  that subject i holds on object j.
• To control the granting and revoking of relation privileges, each relation R in a
database is assigned an owner account, which is typically the account that was
used when the relation was created in the first place.
• The owner of a relation is given all privileges on that relation.
• The owner account holder can pass privileges on any of the owned relations to
other users by granting privileges to their accounts.
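
A minimal sketch of the access matrix model, assuming a dictionary-of-dictionaries representation. The class name AccessMatrix and its methods are hypothetical, and the sketch leaves out ownership and the GRANT OPTION.

```python
from collections import defaultdict

class AccessMatrix:
    """M[subject][object] holds the set of privileges the subject has on the object."""

    def __init__(self):
        self._m = defaultdict(lambda: defaultdict(set))

    def grant(self, subject, obj, privilege):
        self._m[subject][obj].add(privilege)

    def revoke(self, subject, obj, privilege):
        self._m[subject][obj].discard(privilege)

    def allowed(self, subject, obj, privilege):
        # .get avoids creating empty rows or cells as a side effect of a lookup.
        return privilege in self._m.get(subject, {}).get(obj, set())

matrix = AccessMatrix()
matrix.grant("A2", "EMPLOYEE", "insert")
matrix.grant("A2", "EMPLOYEE", "delete")
matrix.grant("A3", "EMPLOYEE", "read")

print(matrix.allowed("A2", "EMPLOYEE", "insert"))
print(matrix.allowed("A3", "EMPLOYEE", "insert"))
```

Every access request is then a lookup of one cell, keyed by the requesting user and the requested object, for the requested operation.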
Privileges Using Views

• The mechanism of views is an important discretionary authorization mechanism in
its own right.
• For example, if the owner A of a relation R wants another account B to be able to
retrieve only some fields of R, then A can create a view V of R that includes only
those attributes and then grant SELECT on V to B.
Revoking Privileges
• In some cases it is desirable to grant a privilege to a user temporarily.
• For example, the owner of a relation may want to grant the SELECT privilege to a
user for a specific task and then revoke that privilege once the task is completed.
Hence, a mechanism for revoking privileges is needed.
• In SQL, a REVOKE command is included for the purpose of canceling privileges.

Propagation of Privileges using the GRANT OPTION

• Whenever the owner A of a relation R grants a privilege on R to another account B,
the privilege can be given to B with or without the GRANT OPTION. If the GRANT
OPTION is given, B can also grant that privilege on R to other accounts.
• Suppose that B is given the GRANT OPTION by A and that B then grants the
privilege on R to a third account C, also with the GRANT OPTION.
• In this way, privileges on R can propagate to other accounts without the
knowledge of the owner of R.
• If the owner account A now revokes the privilege granted to B, all the
privileges that B propagated based on that privilege should automatically be
revoked by the system.
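
The propagation and cascading revocation described above can be sketched as a small graph of grants for one privilege on one relation. The class PrivilegeGraph is hypothetical and is not how any specific DBMS stores its grants. After a revoke, it keeps only the grants whose grantor can still be reached from the owner through GRANT OPTION edges.

```python
class PrivilegeGraph:
    """Tracks who granted one privilege on one relation to whom."""

    def __init__(self, owner):
        self.owner = owner
        self.grants = set()          # (grantor, grantee, with_grant_option)

    def _grantors(self):
        """Accounts allowed to grant: the owner plus anyone reachable via grant-option edges."""
        reachable, frontier = {self.owner}, [self.owner]
        while frontier:
            current = frontier.pop()
            for grantor, grantee, option in self.grants:
                if grantor == current and option and grantee not in reachable:
                    reachable.add(grantee)
                    frontier.append(grantee)
        return reachable

    def holds(self, account):
        return account == self.owner or any(g[1] == account for g in self.grants)

    def grant(self, grantor, grantee, with_grant_option=False):
        if grantor not in self._grantors():
            raise PermissionError(f"{grantor} may not grant this privilege")
        self.grants.add((grantor, grantee, with_grant_option))

    def revoke(self, grantor, grantee):
        self.grants = {g for g in self.grants if (g[0], g[1]) != (grantor, grantee)}
        # Cascade: drop every grant whose grantor has lost the right to grant.
        valid = self._grantors()
        self.grants = {g for g in self.grants if g[0] in valid}

r = PrivilegeGraph(owner="A")
r.grant("A", "B", with_grant_option=True)
r.grant("B", "C", with_grant_option=True)
held_before = r.holds("C")
r.revoke("A", "B")                   # C's privilege depended on B, so it is removed too
held_after = r.holds("C")
print(held_before, held_after)
```

Using reachability from the owner, rather than checking each grantor locally, also removes grants that would otherwise keep each other alive in a cycle.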
Example 1
• Suppose that the DBA creates four accounts, A1, A2, A3, and A4, and wants only A1 to be
able to create relations. Then the DBA must issue the following GRANT command
in SQL:
  GRANT CREATE TABLE TO A1;
Example 2
• Suppose that A1 creates the two base relations EMPLOYEE and DEPARTMENT. A1
is then the owner of these two relations and hence has all the relation privileges on each
of them. Suppose that A1 wants to grant A2 the privilege to insert and delete rows in
both of these relations, but A1 does not want A2 to be able to propagate these
privileges to additional accounts:
  GRANT INSERT, DELETE ON EMPLOYEE, DEPARTMENT TO A2;
Example 3
• Suppose that A1 wants to allow A3 to retrieve information from either of the tables
(DEPARTMENT or EMPLOYEE) and also to be able to propagate the SELECT privilege to
other accounts. A1 can issue the command:
  GRANT SELECT ON EMPLOYEE, DEPARTMENT TO A3 WITH GRANT OPTION;
• A3 can grant the SELECT privilege on the EMPLOYEE relation to A4 by issuing:

  GRANT SELECT ON EMPLOYEE TO A4;
• Notice that A4 cannot propagate the SELECT privilege because the GRANT OPTION
was not given to A4.
Example 4
• Suppose that A1 decides to revoke the SELECT privilege on the EMPLOYEE relation
from A3; A1 can issue:
  REVOKE SELECT ON EMPLOYEE FROM A3;
• The DBMS must now automatically revoke the SELECT privilege on EMPLOYEE
from A4, too, because A3 granted that privilege to A4 and A3 does not have the
privilege any more.
Example 5
• Suppose that A1 wants to give back to A3 a limited capability to SELECT from the
EMPLOYEE relation and wants to allow A3 to be able to propagate the privilege.
The limitation is to retrieve only the NAME, BDATE, and ADDRESS attributes and
only for the tuples with DNO = 5.
• A1 then creates the view:
  CREATE VIEW A3EMPLOYEE AS SELECT NAME, BDATE,
  ADDRESS FROM EMPLOYEE WHERE DNO = 5;
• After the view is created, A1 can grant SELECT on the view A3EMPLOYEE to A3
as follows:
  GRANT SELECT ON A3EMPLOYEE TO A3 WITH GRANT OPTION;
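
The effect of the view in Example 5 can be tried out with Python's built-in sqlite3 module. This is only a sketch: SQLite has no GRANT or REVOKE, so it shows what the view exposes, not the privilege check itself. The view name is written as A3EMPLOYEE (one word), and the SALARY column and the two sample rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (NAME TEXT, BDATE TEXT, ADDRESS TEXT, SALARY INTEGER, DNO INTEGER);
    -- Sample rows are invented for this illustration.
    INSERT INTO EMPLOYEE VALUES ('Abebe', '1990-01-01', 'Addis Ababa', 9000, 5);
    INSERT INTO EMPLOYEE VALUES ('Sara',  '1988-06-15', 'Adama',       7000, 4);
    CREATE VIEW A3EMPLOYEE AS
        SELECT NAME, BDATE, ADDRESS FROM EMPLOYEE WHERE DNO = 5;
""")

cursor = conn.execute("SELECT * FROM A3EMPLOYEE")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
print(columns)   # only the three attributes named in the view
print(rows)      # only the tuples with DNO = 5
```

An account that can query only the view never sees SALARY or the rows of other departments, which is the limitation A1 wanted.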
Example 6
• Finally, suppose that A1 wants to allow A4 to update only the SALARY attribute of
EMPLOYEE; A1 can issue:
  GRANT UPDATE ON EMPLOYEE (SALARY) TO A4;

2. Mandatory Access Control (MAC)


• DAC techniques are an all-or-nothing method: a user either has or does not have a
certain privilege. In many applications, an additional security policy is needed that
classifies data and users based on security classes.
• Typical security classes are top secret (TS), secret (S), confidential (C), and
unclassified (U), where TS is the highest level and U the lowest: TS ≥ S ≥ C ≥ U
• The commonly used model for multilevel security, known as the Bell-LaPadula
model, classifies each subject (user, account, program) and object (relation, tuple,
column, view, operation) into one of the security classifications TS, S, C, or U.
  • We refer to the clearance (classification) of a subject S as class(S) and to the
  classification of an object O as class(O).
• Two restrictions are enforced on data access based on the subject/object
classifications:
  • A subject S is not allowed read access to an object O unless class(S) ≥ class(O).
  • A subject S is not allowed to write an object O unless class(S) ≤ class(O).
• To incorporate multilevel security notions into the relational database model, it is
common to consider attribute values and rows as data objects. Hence, each
attribute A is associated with a classification attribute C in the schema.
• In addition, in some models, a tuple classification attribute TC is added to the
relation attributes to provide a classification for each tuple as a whole.
• Hence, a multilevel relation schema R with n attributes would be represented as
  R(A1, C1, A2, C2, …, An, Cn, TC)
where each Ci represents the classification attribute associated with attribute Ai.
• The value of the TC attribute in each tuple t, which is the highest of all attribute
classification values within t, provides a general classification for the tuple itself.
• Each Ci, in contrast, provides a finer security classification for each attribute value
within the tuple.
• A multilevel relation will appear to contain different data to subjects (users) with
different clearance levels.

  • In some cases, it is possible to store a single tuple in the relation at a
  higher classification level and produce the corresponding tuples at a
  lower-level classification through a process known as filtering.
  • In other cases, it is necessary to store two or more tuples at different
  classification levels with the same value for the apparent key.
• This leads to the concept of polyinstantiation, where several tuples can have the
same apparent key value but have different attribute values for users at different
classification levels.
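
The two access restrictions of the Bell-LaPadula model and the idea of filtering can be sketched as follows. The function names, the numeric encoding of the levels, and the sample tuple are assumptions made for this illustration; hidden values are simply replaced by None.

```python
# Numeric ranks for the security classes: U is lowest, TS is highest.
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}

def can_read(subject_class, object_class):
    """Read is allowed only if class(S) >= class(O)."""
    return LEVELS[subject_class] >= LEVELS[object_class]

def can_write(subject_class, object_class):
    """Write is allowed only if class(S) <= class(O)."""
    return LEVELS[subject_class] <= LEVELS[object_class]

def filter_tuple(row, clearance):
    """Return the tuple as seen at a given clearance.

    row maps attribute name -> (value, classification); unreadable values become None.
    """
    return {
        name: (value if can_read(clearance, cls) else None)
        for name, (value, cls) in row.items()
    }

# A made-up multilevel tuple: each attribute value carries its own classification.
row = {"Name": ("Smith", "U"), "Salary": (40000, "C"), "JobPerformance": ("Fair", "S")}

print(filter_tuple(row, "S"))   # sees every value
print(filter_tuple(row, "C"))   # JobPerformance is hidden
print(filter_tuple(row, "U"))   # only Name is visible
```

The same stored tuple therefore appears as three different tuples to users with clearance S, C, and U.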
Comparing DAC and MAC
• DAC policies are characterized by a high degree of flexibility, which makes them
suitable for a large variety of application domains.
• The main drawback of DAC models is their vulnerability to malicious attacks, such
as Trojan horses embedded in application programs.
• By contrast, mandatory policies ensure a high degree of protection: in a way, they
prevent any illegal flow of information.
• Mandatory policies have the drawback of being too rigid, and they are only
applicable in limited environments.
• In many practical situations, discretionary policies are preferred because they offer a
better trade-off between security and applicability.
3. Role-Based Access Control
• Its basic notion is that permissions are associated with roles, and users are assigned
to appropriate roles. Roles can be created using the CREATE ROLE command and removed
using the DESTROY ROLE command (most SQL systems spell the latter DROP ROLE).
• The GRANT and REVOKE commands discussed under DAC can then be
used to assign privileges to roles and revoke privileges from them.
• RBAC appears to be a feasible alternative to discretionary and mandatory access
controls.
• It ensures that only authorized users are given access to certain data or resources.
• Many DBMSs support the concept of roles, where privileges can be assigned to
roles.
• A role hierarchy in RBAC is a natural way of organizing roles to reflect the
organization's lines of authority and responsibility.
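
A minimal sketch of role-based access control with a role hierarchy, assuming that a senior role inherits the privileges of its junior roles. The class RBAC, its methods, and the role names clerk and manager are hypothetical.

```python
class RBAC:
    def __init__(self):
        self.role_privileges = {}    # role -> set of (privilege, object)
        self.role_juniors = {}       # role -> junior roles whose privileges it inherits
        self.user_roles = {}         # user -> set of roles

    def create_role(self, role, inherits=()):
        self.role_privileges[role] = set()
        self.role_juniors[role] = set(inherits)

    def grant(self, role, privilege, obj):
        self.role_privileges[role].add((privilege, obj))

    def assign(self, user, role):
        self.user_roles.setdefault(user, set()).add(role)

    def _effective(self, role, seen=None):
        """Privileges of a role plus those inherited from its junior roles."""
        seen = set() if seen is None else seen
        if role in seen:             # guard against cycles in the hierarchy
            return set()
        seen.add(role)
        result = set(self.role_privileges[role])
        for junior in self.role_juniors[role]:
            result |= self._effective(junior, seen)
        return result

    def allowed(self, user, privilege, obj):
        return any(
            (privilege, obj) in self._effective(role)
            for role in self.user_roles.get(user, ())
        )

rbac = RBAC()
rbac.create_role("clerk")
rbac.create_role("manager", inherits=["clerk"])
rbac.grant("clerk", "SELECT", "EMPLOYEE")
rbac.grant("manager", "UPDATE", "EMPLOYEE")
rbac.assign("u1", "manager")
rbac.assign("u2", "clerk")

print(rbac.allowed("u1", "SELECT", "EMPLOYEE"))   # inherited through the hierarchy
print(rbac.allowed("u2", "UPDATE", "EMPLOYEE"))
```

Privileges are never attached to users directly here, so changing what a job function may do means changing one role rather than many accounts.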
Introduction to Statistical Database Security

• Statistical databases are used mainly to produce statistics on various populations. The
database may contain confidential data on individuals, which should be protected
from user access. Users are permitted to retrieve statistical information on the
populations, such as averages, sums, counts, maximums, minimums, and standard
deviations.
• A population is a set of rows of a relation (table) that satisfy some selection condition.
• Statistical queries involve applying statistical functions to a population of rows. For
example, we may want to retrieve the number of individuals in a population or the
average income in the population.
• However, statistical users are not allowed to retrieve individual data, such as
the income of a specific person.
• Statistical database security techniques must disallow the retrieval of individual data.
• This can be achieved by eliminating queries that retrieve attribute values and by
allowing only queries that involve statistical aggregate functions such as SUM, MIN,
and MAX. Such queries are sometimes called statistical queries.
• It is the DBMS's responsibility to ensure the confidentiality of information about individuals,
while still providing useful statistical summaries of data about those individuals to
users. Privacy protection for the individuals recorded in a statistical database is paramount.
• In some cases it is possible to infer the values of individual rows from a sequence of
statistical queries.
• This is particularly true when the conditions result in a population consisting
of a small number of rows.
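
The inference problem with small populations can be demonstrated with Python's built-in sqlite3 module. The PERSON table, its rows, and the helper function statistic are all made up for this sketch, and the minimum-population check is just one simple countermeasure, not a complete defence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PERSON (name TEXT, city TEXT, degree TEXT, income INTEGER);
    INSERT INTO PERSON VALUES ('P1', 'Adama',  'BSc', 4000);
    INSERT INTO PERSON VALUES ('P2', 'Adama',  'BSc', 5000);
    INSERT INTO PERSON VALUES ('P3', 'Adama',  'PhD', 9000);
    INSERT INTO PERSON VALUES ('P4', 'Hawasa', 'BSc', 4500);
""")

def statistic(aggregate, condition, min_population=0):
    """Run an aggregate on income over the rows matching condition.

    Returns None when the population is smaller than min_population.
    condition is trusted SQL text here, which is acceptable only in a toy example.
    """
    if aggregate not in ("COUNT", "SUM", "AVG", "MIN", "MAX"):
        raise ValueError("only statistical aggregates are allowed")
    size = conn.execute(f"SELECT COUNT(*) FROM PERSON WHERE {condition}").fetchone()[0]
    if size < min_population:
        return None
    return conn.execute(
        f"SELECT {aggregate}(income) FROM PERSON WHERE {condition}"
    ).fetchone()[0]

target = "city = 'Adama' AND degree = 'PhD'"
population = statistic("COUNT", target)                # the condition matches a single row
leaked = statistic("AVG", target)                      # so the "average" is one person's income
guarded = statistic("AVG", target, min_population=3)   # refused: population too small
print(population, leaked, guarded)
```

Both queries use only aggregate functions, yet together they reveal the income of one individual unless small populations are refused.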

Encryption

 Authorization may not be sufficient to protect data in database systems, especially
when data must be moved from one location to another over a network. Encryption is
used to protect information stored at a particular site or transmitted between sites
from being accessed by unauthorized users.
 Encryption is the encoding of the data by a special algorithm that renders the data
unreadable by any program without the decryption key. It is not possible for
encrypted data to be read unless the reader knows how to decipher/decrypt the
encrypted data.
 If a database system holds particularly sensitive data, it may be deemed necessary
to encrypt it as a safeguard against external threats or attempts to access it. The
DBMS can access the data after decrypting it, although performance degrades because
of the time taken to decrypt it. Encryption also protects data transmitted over
communication lines.
 To transmit data securely over insecure networks requires the use of a Cryptosystem,
which includes:
1. An encryption key to encrypt the data (plaintext)
2. An encryption algorithm that, with the encryption key, transforms the plaintext
into cipher text
3. A decryption key to decrypt the cipher text
4. A decryption algorithm that, with the decryption key, transforms the cipher text
back into plaintext
 The Data Encryption Standard (DES) is an approach that both substitutes characters
and rearranges their order on the basis of an encryption key.
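To make the four parts of a cryptosystem concrete, here is a toy cipher that combines a substitution step with a rearrangement step. It is emphatically not DES and offers no real security; the functions, the way the key is used, and the restriction to uppercase letters are all assumptions made for this sketch. The same key serves for encryption and decryption.

```python
import string

ALPHABET = string.ascii_uppercase


def encrypt(plaintext, key):
    """Toy cipher: shift each letter by `key` (substitution), then reverse
    every block of `key` characters (rearrangement). Uppercase A-Z only, key >= 1."""
    shifted = "".join(ALPHABET[(ALPHABET.index(ch) + key) % 26] for ch in plaintext)
    blocks = [shifted[i:i + key] for i in range(0, len(shifted), key)]
    return "".join(block[::-1] for block in blocks)


def decrypt(ciphertext, key):
    """Undo the two steps in the opposite order: un-reverse the blocks, then shift back."""
    blocks = [ciphertext[i:i + key] for i in range(0, len(ciphertext), key)]
    unreversed = "".join(block[::-1] for block in blocks)
    return "".join(ALPHABET[(ALPHABET.index(ch) - key) % 26] for ch in unreversed)


key = 3
cipher_text = encrypt("DATABASE", key)
print(cipher_text)                # WDGDEDHV
print(decrypt(cipher_text, key))  # DATABASE
```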

Types of Cryptosystems
 Cryptosystems can be categorized into two types:
1. Symmetric encryption – uses the same key for both encryption and decryption and
relies on safe communication lines for exchanging the key.
2. Asymmetric encryption – uses different keys for encryption and decryption.
 Generally, symmetric algorithms are much faster to execute on a computer than
asymmetric ones. Asymmetric algorithms, in turn, remove the need to exchange a secret
key over a safe communication line, which simplifies key distribution.
Public Key Encryption algorithm: Asymmetric encryption
 This algorithm operates with modular arithmetic (mod n), where n is the product of two
large prime numbers.
 Two keys, d and e, are used for decryption and encryption respectively.
 n is chosen as a large integer that is the product of two large distinct prime
numbers, p and q.
 The encryption key e is a randomly chosen number between 1 and n that is
relatively prime to (p-1) x (q-1).
 The plaintext m is encrypted as C = m^e mod n.
 The decryption key d is carefully chosen so that C^d mod n = m.
 The decryption key d can be computed from the condition that d x e - 1 is divisible by
(p-1) x (q-1). Thus, the legitimate receiver who knows d simply computes C^d mod n = m
and recovers m.
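The scheme described above is the RSA algorithm, and its steps can be sketched in a few lines of Python. The function names are hypothetical, and the primes 61 and 53 with e = 17 are chosen only to keep the numbers readable; real keys use primes hundreds of digits long together with padding, which this sketch omits. Computing d with `pow(e, -1, phi)` assumes Python 3.8 or later.

```python
from math import gcd


def make_keys(p, q, e):
    """Build (public, private) keys from two distinct primes p, q and a chosen e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to (p-1)(q-1)")
    d = pow(e, -1, phi)  # modular inverse, so d*e - 1 is divisible by phi
    return (n, e), (n, d)


def encrypt(m, public_key):
    n, e = public_key
    return pow(m, e, n)  # C = m^e mod n


def decrypt(c, private_key):
    n, d = private_key
    return pow(c, d, n)  # m = C^d mod n


public, private = make_keys(61, 53, 17)
message = 65
cipher = encrypt(message, public)
print(public, private, cipher, decrypt(cipher, private))
```

The three-argument form of `pow` performs modular exponentiation without ever building the full power, which is what makes the computation practical for large keys.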
Simple Example: Asymmetric encryption
1. Select primes p = 11, q = 3.
2. n = pq = 11*3 = 33
3. Find phi, which is given by phi = (p-1)(q-1) = 10*2 = 20
4. Choose e = 3 (1 < e < phi)
5. Check that gcd(e, phi) = gcd(e, (p-1)(q-1)) = gcd(3, 20) = 1
6. Compute d (1 < d < phi) such that d*e - 1 is divisible by phi
Simple testing (d = 2, 3, ...) gives d = 7
7. Check: e*d - 1 = 3*7 - 1 = 20, which is divisible by phi (20).
Given
Public key = (n, e) = (33, 3)
Private key = (n, d) = (33, 7)
 Now say we want to encrypt the message m = 7
– c = m^e mod n = 7^3 mod 33 = 343 mod 33 = 13
– Hence the ciphertext c = 13
 To check decryption we compute
– m = c^d mod n = 13^7 mod 33 = 62,748,517 mod 33 = 7
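The arithmetic of this example can be replayed in Python as a quick check. The search for d mirrors the "simple testing" step by trying d = 2, 3, ... in turn; the variable names are just those of the example.

```python
p, q, e = 11, 3, 3
n = p * q
phi = (p - 1) * (q - 1)

# "Simple testing": try d = 2, 3, ... until d*e - 1 is divisible by phi.
d = next(d for d in range(2, phi) if (d * e - 1) % phi == 0)

m = 7
c = pow(m, e, n)          # 7^3 mod 33
recovered = pow(c, d, n)  # 13^7 mod 33

print(n, phi, d, c, recovered)  # 33 20 7 13 7
```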
Lab part
SQL*Plus: Release 10.2.0.1.0 - Production on Fri Apr 13 01:35:51 2018
Copyright (c) 1982, 2005, Oracle. All rights reserved.
SQL> connect system/auwc
Connected.
SQL> create user Biniam
2 identified by Biniam
3 default tablespace system
4 temporary tablespace temp
5 quota unlimited on system;
User created.
SQL> create user Tolasa
2 identified by Tolasa
3 default tablespace system
4 temporary tablespace temp
5 quota unlimited on system;
User created.
SQL> create role assistant;
Role created.
SQL> grant create table, create session to assistant;
Grant succeeded.
SQL> grant assistant to Biniam;
Grant succeeded.
SQL> grant assistant to Tolasa;
Grant succeeded.
SQL> connect Biniam/Biniam
Connected.
SQL> connect Tolasa/Tolasa
Connected.
SQL> connect Biniam/Biniam
Connected.
SQL> create table tbl (id number, name varchar2(10));
Table created.
SQL> insert into tbl values (1, 'Bereket');
1 row created.
SQL> insert into tbl values (2, 'Chaltu');
1 row created.
SQL> connect Tolasa/Tolasa
Connected.
SQL> select * from Biniam.tbl;
select * from Biniam.tbl
                     *
ERROR at line 1:
ORA-00942: table or view does not exist
SQL> connect Biniam/Biniam
Connected.
SQL> grant select on tbl to Tolasa;
Grant succeeded.
SQL> connect Tolasa/Tolasa
Connected.
SQL> select * from Biniam.tbl;
ID NAME
---------- -----------------------
1 Bereket
2 Chaltu
SQL> connect Biniam/Biniam
Connected.
SQL> revoke select on tbl from Tolasa;
Revoke succeeded.
SQL> connect Tolasa/Tolasa
Connected.
SQL> select * from Biniam.tbl;
select * from Biniam.tbl
                     *
ERROR at line 1:
ORA-00942: table or view does not exist
SQL> drop role assistant;
drop role assistant
          *
ERROR at line 1:
ORA-01031: insufficient privileges
SQL> connect system/auwc
Connected.
SQL> drop role assistant;
Role dropped.
SQL> connect Biniam/Biniam
ERROR:
ORA-01045: user Biniam lacks CREATE SESSION privilege; logon denied
Warning: you are no longer connected to ORACLE.
SQL> connect Tolasa/Tolasa
ERROR:
ORA-01045: user Tolasa lacks CREATE SESSION privilege; logon denied
Warning: you are no longer connected to ORACLE.
SQL> connect system/auwc
Connected.
SQL> drop user Biniam cascade;
User dropped.
SQL> drop user Tolasa cascade;
User dropped.

Greatest common divisors

 Let a and b be integers, not both zero. The greatest common divisor (GCD) of a
and b is the largest positive integer that is a factor of both a and b. We write
gcd(a, b) for this largest common factor. The definition can be extended by setting
gcd(0, 0) = 0. Sage also uses gcd(a, b) to denote the GCD of a and b. The GCD of any
two distinct primes is 1, and the GCD of 18 and 27 is 9. For example, in Sage,
gcd(3, 59) returns 1 and gcd(18, 27) returns 9.
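The text does not say how a GCD is computed. One standard method is Euclid's algorithm, sketched here in Python rather than Sage and cross-checked against Python's built-in `math.gcd`. The pair (3, 20) is the gcd(e, phi) check from the encryption example.

```python
from math import gcd as builtin_gcd


def gcd(a, b):
    """Euclid's algorithm: repeatedly replace (a, b) with (b, a mod b)."""
    a, b = abs(a), abs(b)
    while b != 0:
        a, b = b, a % b
    return a  # gcd(0, 0) comes out as 0, matching the extended definition


for a, b in [(3, 59), (18, 27), (3, 20), (0, 0)]:
    assert gcd(a, b) == builtin_gcd(a, b)
    print(f"gcd({a}, {b}) = {gcd(a, b)}")
```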
