DB2 Basics
DB2 Objects: Database, Table space, Table, Index Space, Index, Storage Group,
View, Synonym, Alias
STOGROUPS
• For DB2 Datasets, we have 2 options for storage allocation –
1. Storage Group
2. User-Defined Vsam
• Dataset allocation is performed by DB2 through IBM’s Data Facility Product (DFP)
• The maximum number of volumes per STOGROUP is 133 (ideally 3 or 4), and all volumes must be of
the same device type (3380, 3390, etc.).
• CREATE STOGROUP TESTSG1
VOLUMES('VOL1', 'VOL2',…)
VCAT vcat;
• The default STOGROUP is SYSDEFLT, which is created when DB2 is installed. Its use should be
avoided.
• The underlying VSAM datasets are created and maintained by DB2.
• They are not used as plain VSAM datasets: DB2 accesses them through the VSAM Media Manager
and applies additional formatting, because of which they cannot be treated like standard VSAM
datasets.
• DB2 can use an LDS (linear dataset) more efficiently because it has a 4K CI size and, unlike
an ESDS, carries no control information.
• DEFINE CLUSTER -
(NAME(vcat.DSNDBC.ddddddd.ssssssss.I0001.Annn) -
LINEAR -
REUSE -
VOLUMES(vol1,vol2,…) -
CYLINDERS(pri sec) -
SHAREOPTIONS(3 3) -
) -
DATA -
(NAME(vcat.DSNDBD.ddddddd.ssssssss.I0001.Annn))
BUFFERPOOL
Data is first read from the VSAM dataset that holds the table into a bufferpool, and
from there it is sent to the requester.
There are 60 bufferpools in total: fifty 4K bufferpools (BP0 thru BP49) and ten 32K
bufferpools (BP32K thru BP32K9).
Database -
• The total collection of stored data is divided into a number of user databases and a few
system databases.
• It has a group of logically related Tablespaces and Indexspaces, which in turn contain
tables and indexes respectively.
• Database is the unit of START and STOP for the system administrator.
• Default Database is DSNDB04 which is created during installation.
• The parameters used for the creation are:
STOGROUP - Default is SYSDEFLT
BUFFERPOOL - Default is BP0
• DBD is Database Descriptor – It is a control structure used by DB2 to manage the objects
which are under the control of a given database.
• Whenever any DB2 object in a database is created, altered or dropped, the DBD is modified.
• The DBD contains a mapping of the tablespaces, tables and indexes defined in a database.
• An X lock is acquired on the DBD during the execution of DDL, so it is better to
execute DDL when there is little or no activity.
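A minimal sketch of creating a database with the two parameters listed above. The database name TESTDB01 and the choice of BP1 are assumptions made only for illustration; TESTSG1 is the storage group from the STOGROUP example.

```sql
-- Hypothetical database; overrides both defaults (SYSDEFLT and BP0)
CREATE DATABASE TESTDB01
  STOGROUP TESTSG1
  BUFFERPOOL BP1;
```

Naming the storage group and bufferpool explicitly keeps the objects of this database away from SYSDEFLT and BP0.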
Table Space –
• Table space contains one or more tables. Index space contains exactly one index. A table
and all of its indexes will be contained within a single database.
• It is a dynamically extended collection of pages. Page is a block of physical storage and it
is the unit of I/O. The pages are all of 4K size for index spaces and 32K or 4K for table
spaces.
• Table space is the unit of recovery and reorganization.
Simple –
Can contain one or more tables. Within a single page, records from
more than one table can be interleaved.
Disadvantages: affects concurrent access, data availability and space
management.
Partitioned –
Can contain exactly one table, which is so large that it cannot be maintained as a
single unit.
It is partitioned based on the value ranges of one or more columns in the
table. A clustering index must be defined on this column or combination of columns. The
key can be at most 40 bytes. Columns in a partitioning index cannot be updated.
Each partition is independent of the others. Individual partitions can be
associated with different storage groups.
Partitioning is used:
To isolate specific data areas in dedicated datasets
To improve data availability
To improve recoverability
To encourage parallelism (query parallelism breaks the data access for a
query into multiple I/O streams that execute in parallel, which is useful in
reducing the overall elapsed time)
For a partitioned table space, individual partitions can be reorganized or
recovered.
Segmented –
Can contain one or more tables. The table space is divided into
segments. Each segment consists of n pages, where n is a multiple of 4 and n <= 64,
and can contain records of only one table.
LOCKMAX – Maximum number of row or page level locks that any one user can hold in a tablespace.
If this maximum is reached, the lock is escalated to a table or tablespace lock.
Values can be
0 -> lock escalation should never occur
SYSTEM -> defaults to the system-wide value specified in DSNZPARMS
Integer from 1 to 2,147,483,647
PRIQTY, SECQTY – Primary and secondary space allocation for the underlying dataset
PCTFREE – Denotes what % of each page should remain free for future inserts
CREATE TABLESPACE ACTSACCT IN STEVDB01
NUMPARTS 32
(PART 1 USING STOGROUP STEVESG
PRIQTY 252
SECQTY 252
ERASE NO
FREEPAGE 0
PCTFREE 0
TRACKMOD YES
COMPRESS YES
,PART 2 USING STOGROUP STEVESG
PRIQTY 252
SECQTY 252
ERASE NO
FREEPAGE 0
PCTFREE 0
TRACKMOD YES
COMPRESS YES
, ……………….
)
BUFFERPOOL BP3
LOCKSIZE PAGE
LOCKMAX 0
LOCKPART YES
CLOSE YES
CCSID EBCDIC;
DB2 DATA TYPES:
Numeric -
Smallint - 2 bytes (binary integer)
Integer - 4 bytes (binary integer)
Decimal(p,q) - Packed decimal; p digits in total, of which q are decimal digits;
(p+1)/2 bytes for odd p or (p+2)/2 bytes for even p; p <= 31
Float(p) - If p < 22, single precision, 4 bytes; else (p up to 53) double precision, 8 bytes
Note: All have a DEFAULT value of 0
String -
Character - String of n bytes (fixed) where n < 255
Note: DEFAULT is blanks
Varchar - String of up to n bytes (variable); the stored length is the actual length + 2,
the 2 extra bytes holding the actual length
Max size < page size of the tablespace.
Note: DEFAULT is a zero-length (empty) string
Date/Time -
Date - unsigned packed decimal in format yyyymmdd - 4 bytes
Time - unsigned packed decimal in format hhmmss - 3 bytes
Timestamp - unsigned packed decimal in format yyyymmddhhmmssnnnnnn - 10 bytes
Note: DEFAULT is CURRENT DATE/TIME/TIMESTAMP
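The data types and their defaults can be seen together in a table definition. The following is only a sketch: the table EMP_DEMO, its columns and the tablespace name EMPTS are hypothetical; STEVDB01 is the database name used in the CREATE TABLESPACE example.

```sql
-- Hypothetical table using each group of data types.
-- NOT NULL WITH DEFAULT picks up the type default:
-- 0 for numerics, blanks or empty string for strings,
-- current date/timestamp for date/time columns.
CREATE TABLE EMP_DEMO
  (EMPNO     CHAR(6)       NOT NULL,
   NAME      VARCHAR(30)   NOT NULL WITH DEFAULT,
   DEPT      SMALLINT      NOT NULL WITH DEFAULT,
   SALARY    DECIMAL(9,2)  NOT NULL WITH DEFAULT,
   HIREDATE  DATE          NOT NULL WITH DEFAULT,
   UPDATED   TIMESTAMP     NOT NULL WITH DEFAULT)
  IN STEVDB01.EMPTS;
```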
Table –
• Table that is physically stored within a table space contains one or more
stored records.
• There will be one stored record per row in the base table (the table that is
visible to the user).
• The stored record is a byte string containing
o A prefix denoting the internal system identifier
o For each field, a length prefix denoting the actual length, followed
by the actual field value in an encoded form.
o The stored records are internally addressed by RID (Record
identifier). This consists of the page number within the table space and the byte
offset of a slot at the foot of the page. That slot in turn contains the byte offset of the
record from the top of the page. This way, when the records are rearranged within a page, the RID
will not change. Only the local offset at the foot of the page will change.
Special Registers:
They are zero-argument built-in scalar functions, which return a scalar value.
3. CURRENT SERVER - Returns the ID of the current server (useful in distributed database
management)
6. CURRENT TIMEZONE - Returns a time duration representing the displacement of the local
time zone from Greenwich Mean Time.
Note:
Table can also be created as
ALTER TABLE:
This command is also used to add or drop constraints: primary key, foreign key and check.
Types of Constraints:
(1)Check Constraint:
-->Used to enforce a specific restriction on the values of a column.
It is checked on every update/insert.
-->Can use comparison operators, BETWEEN, IN, LIKE, NULL and can have multiple conditions combined
using 'and' / 'or'
-->CONSTRAINT PHONE_CHK CHECK (PHONENO >= '0000000' AND PHONENO <= '9999999')
The 2nd operand can be another column, but it should be of the same data type as the first.
-->Advantages:
(1)The basic business rules which need to be applied can be enforced at the database
level. This can save additional programming in the applications which modify that
data.
(2)It ensures consistency and data integrity -
as it avoids bypassing of rules by ad hoc data modification
as the rules are applied every time data is modified
-->Watch out for the following (these are not checked)
(1)Conditions that contradict each other,
like check(Phoneno > '1000000' and phoneno < '0999999')
(2)Conditions that contradict the defaults
(3)Redundant conditions are allowed; this could impact performance
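As a sketch of how the PHONE_CHK constraint above might be attached to a table, first at creation time and then through ALTER TABLE. The table CUST_DEMO and its columns are hypothetical.

```sql
-- Check constraint defined together with the table
CREATE TABLE CUST_DEMO
  (CUSTNO   CHAR(6)  NOT NULL,
   PHONENO  CHAR(7)  NOT NULL,
   CONSTRAINT PHONE_CHK
     CHECK (PHONENO >= '0000000' AND PHONENO <= '9999999'));

-- Dropping the constraint and adding it back with ALTER TABLE
ALTER TABLE CUST_DEMO DROP CONSTRAINT PHONE_CHK;
ALTER TABLE CUST_DEMO
  ADD CONSTRAINT PHONE_CHK
    CHECK (PHONENO BETWEEN '0000000' AND '9999999');
```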
(2)Referential Integrity:
-->A means of ensuring data integrity between related tables using a parent-child relationship.
The table with the primary key is the parent. The table with the foreign key is the child.
Note: Enforcing the same constraint through the application consumes more
resources.
(3) Primary Key is a unique identifier for each row in a table. It can be made of one or more
columns.
These columns can never be NULL.
Note: A table can be defined without a primary key.
If one is defined, the table is available for use only once uniqueness is enforced by defining
a UNIQUE INDEX on the same columns.
Other unique keys in the table are called alternate keys. To enforce them also, we need
unique indexes. They also cannot be NULL.
Defined as
UNIQUE(<COLUMN>) or <column> char(3) NOT NULL UNIQUE
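The parent-child relationship and the unique indexes that enforce the keys can be sketched as follows. All table, column and index names here are hypothetical, and ON DELETE SET NULL is just one of the delete rules that could be chosen.

```sql
-- Parent table: the primary key needs a unique index before the table is usable
CREATE TABLE DEPT_DEMO
  (DEPTNO   CHAR(3)     NOT NULL,
   DEPTNAME VARCHAR(36) NOT NULL,
   PRIMARY KEY (DEPTNO));
CREATE UNIQUE INDEX XDEPT_DEMO1 ON DEPT_DEMO (DEPTNO);

-- Child table: WORKDEPT must match an existing DEPTNO or be NULL
CREATE TABLE EMPL_DEMO
  (EMPNO    CHAR(6) NOT NULL,
   WORKDEPT CHAR(3),
   PRIMARY KEY (EMPNO),
   FOREIGN KEY (WORKDEPT) REFERENCES DEPT_DEMO (DEPTNO)
     ON DELETE SET NULL);
CREATE UNIQUE INDEX XEMPL_DEMO1 ON EMPL_DEMO (EMPNO);
```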
Index Space -
• There is always a one-to-one correspondence between index and index
space. Index space is automatically created when we create an index.
• It can be partitioned if the index it contains is the clustering index of a
partitioned table space. All others are simple.
• It is the unit of recovery and reorganization. For partitioned Index space,
individual partitions can be reorganized or recovered.
Index –
• Index defines the logical ordering imposed on the stored data.
• They are defined on one column or a combination of columns.
• Useful for fast sequential access of the indexed data.
• For each distinct value of the index, the pointers, i.e. the RIDs
of all the stored records that have that value, are stored.
• Index scan is used when an exhaustive search is to be done based on the
index value. It follows the sequence in which the indexed values are stored.
• Table space scan is based on the physical sequence of the records.
• Clustering Index – is one for which the records are physically maintained
in the sequence defined by the index. The index controls the physical placement of the
indexed records.
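A short sketch of index definitions, assuming a hypothetical table EMP_DEMO with columns EMPNO, DEPT and HIREDATE. The index names are also made up for illustration.

```sql
-- Unique index that is also the clustering index:
-- rows are kept physically in EMPNO sequence
CREATE UNIQUE INDEX XEMP_DEMO1
  ON EMP_DEMO (EMPNO ASC)
  CLUSTER;

-- Non-unique index on a combination of columns
CREATE INDEX XEMP_DEMO2
  ON EMP_DEMO (DEPT ASC, HIREDATE DESC);
```

Each CREATE INDEX also creates the corresponding index space automatically.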
VIEWS
-> Views are virtual tables based on one or more base tables.
-> When data in the base table changes, these changes are also reflected in the view.
->They must be defined for a specific use and for one of the following advantages they
provide –
• To provide row and column level security
By limiting the select items to the columns to which the user should have access
and/or by limiting the rows by using proper conditions in WHERE clause.
• To ensure optimal access
By using join criteria and by using indexed columns in predicates
• To ensure same calculation
By using data derivation formulas in the select list
• To Mask complexity of a query from DB2 beginners
• To Support Domains
Note:Domain refers to a valid range of values that a column can contain.
The Table CHECK Constraint is used to create the domains.
The WITH CHECK OPTION is used to ensure that the data modification using
Update/Insert conforms to the WHERE conditions specified in the view
definition.
This option is of 2 types-
WITH CASCADED CHECK OPTION – The check option is applied to the current view
and all the views it accesses regardless of whether it is specified or not.
WITH LOCAL CHECK OPTION – The check option is applied to the views where it is
Specified.
• To rename the columns so that the user can understand them better.
An INSERT cannot be done through a view that
contains derived data,
contains constants, or
omits columns of the base table that have no default value.
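The effect of WITH CHECK OPTION can be sketched with a view over a hypothetical table EMP_DEMO (columns EMPNO, NAME, DEPT; any other columns are assumed to have defaults so that inserts through the view are allowed).

```sql
-- Row and column level security: only three columns, only department 20
CREATE VIEW DEPT20_EMP (EMPNO, NAME, DEPT) AS
  SELECT EMPNO, NAME, DEPT
  FROM EMP_DEMO
  WHERE DEPT = 20
  WITH CHECK OPTION;

-- Accepted: the new row satisfies the view's WHERE condition
INSERT INTO DEPT20_EMP VALUES ('000010', 'SMITH', 20);

-- Rejected by WITH CHECK OPTION: DEPT = 30 falls outside the view
INSERT INTO DEPT20_EMP VALUES ('000011', 'JONES', 30);
```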
SYNONYMS
• An alternative private name for a table or a view
• It can be used only by its creator.
• It cannot refer to a remote table.
• When a table/view is dropped, all synonyms defined on it are also dropped
ALIASES
• An alternative name for a table or a view
• It can be used by everyone.
• It can refer to a remote table.
• When a table/view is dropped, the aliases defined on it are not dropped
• It provides a useful level of indirection between application programs and the data
tables.
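A sketch of the two statements side by side. The owner PRODOWNR, the location REMOTELOC and the names EMP and EMP_REMOTE are all hypothetical.

```sql
-- Synonym: usable only by its creator, local tables only
CREATE SYNONYM EMP FOR PRODOWNR.EMPLOYEE;

-- Alias: usable by everyone, may point at a table at another location
CREATE ALIAS EMP_REMOTE FOR REMOTELOC.PRODOWNR.EMPLOYEE;

-- Programs code the alias; the real three-part name can change behind it
SELECT EMPNO FROM EMP_REMOTE;
```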
SQL is the standard query language used by many relational database products like
DB2, Oracle, Sybase, etc.
Features of SQL (or sequel) –
• It is a high level language as compared to procedural languages, because it provides a
greater degree of abstraction. In procedural languages, we process record by record:
we specify what data is to be accessed and how to access it.
SQL on the other hand requires that the programmer specify only what is needed, and
not how to retrieve it. The optimal instructions for data navigation
(called the access path) are determined by the database itself.
• Sql is not only a query language – but it is also used to define data structures, insert,
modify, delete data and control access to the data. The language is common to the
different users – like DBAs, application programmers, etc.
• All database operations using Sql are at Set-Level unlike record level processing using
flat files. This includes querying a table and the result is in the form of a subset of the
original table. Similarly, updates and deletes also can be done at set level.
2) Data Definition Language(DDL) – Creates and maintains physical data structures using
CREATE, ALTER and DROP verbs.
3) Data Manipulation Language(DML) – Accesses and Modifies Data using INSERT,
SELECT, UPDATE and DELETE.
4) Data Control Language(DCL) – Control Data Security using GRANT and REVOKE.
Rules for Sql –
1) Every query must access at least one object and select at least one item.
2) The items selected can be one or more columns from a table, a literal, a special
register, an expression result, or an embedded select that returns a single row.
3) The object accessed can be a table, a view, an alias or a full select.
Examples:
1) Getting the maximum average salary and the dept that has that value.
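One possible way to write the query described in example 1, assuming an EMPLOYEE table with WORKDEPT and SALARY columns. This is a sketch, not necessarily the form the original example used.

```sql
-- Department(s) whose average salary is the highest of all departments
SELECT WORKDEPT, AVG(SALARY) AS AVG_SAL
FROM EMPLOYEE
GROUP BY WORKDEPT
HAVING AVG(SALARY) >= ALL (SELECT AVG(SALARY)
                           FROM EMPLOYEE
                           GROUP BY WORKDEPT);
```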
SQL Predicates – these are conditions used to identify desired rows in a query. The
conditions can be True, False or Unknown .
We can define a temp table, then query from that temp table or join it
with a real table.
We can also use a temp table to insert data into real tables.
Wherever we use a temp table, we can use a full select in its place.
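A sketch of a temp table defined with WITH and then joined to a real table, assuming an EMPLOYEE table with EMPNO, WORKDEPT and SALARY columns. The name DEPT_AVG is hypothetical.

```sql
-- Temp table holding one row per department
WITH DEPT_AVG (WORKDEPT, AVG_SAL) AS
  (SELECT WORKDEPT, AVG(SALARY)
   FROM EMPLOYEE
   GROUP BY WORKDEPT)
-- Join the temp table with the real table
SELECT E.EMPNO, E.SALARY, D.AVG_SAL
FROM EMPLOYEE E, DEPT_AVG D
WHERE E.WORKDEPT = D.WORKDEPT
  AND E.SALARY > D.AVG_SAL;
```

The same result could be obtained by placing the full select directly in the FROM clause in place of DEPT_AVG.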
Use of CAST:
It is used to convert one data type to another – like
CAST(col as <new data type>)
E.g:
CAST(salary as INTEGER) where salary is decimal
CAST(job as CHAR(3))
Padding with blanks or truncation will occur depending on the case.
VALUES :
This is used to define a set of rows and columns which will be assigned to a view or a
temporary table. This data can be treated as though it were an ordinary table except that
updates cannot be performed.
With Temp1(col1,col2,col3) as
(Values (c11,c12,c13),(c21,c22,c23)…)
CASE :
This enables conditional processing within an sql statement.
They work in 2 ways –
(1)one where every condition in WHEN clause is independently checked
(2)where every WHEN checks for equality against a common expression.
………….
Max(Case Month when 12 then Sales_Amt else null end) as Dec
From Sales where year=2001
It can be used to set different values in the SET clause of an UPDATE statement (and can also
be nested).
Update Project
Set Salary = Case Designation
When 'Associate' then 1000
When 'Senior Associate' then
Case Role
When 'tl' then 2000
When 'pl' then 3000
Else null
end
Else null
End
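The examples above use the second form of CASE (equality against a common expression). A sketch of the first form, where every WHEN condition is checked independently, assuming an EMPLOYEE table with EMPNO and SALARY columns; the band limits are arbitrary.

```sql
-- Searched CASE: conditions are evaluated in order, first true one wins
SELECT EMPNO, SALARY,
       CASE
         WHEN SALARY >= 50000 THEN 'HIGH'
         WHEN SALARY >= 20000 THEN 'MEDIUM'
         ELSE 'LOW'
       END AS SALARY_BAND
FROM EMPLOYEE;
```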
Functions:
They are of 2 types –
(1) One that acts on a particular column in all the rows that got selected. (Column
fn)
(2) One that acts on one row at a time. (Scalar fn)
5) Chr – It returns the character whose ASCII code is the input value, in the range 0 to 255.
6) Coalesce – It is a synonym for Value – It returns the first non-null value in a list of
compatible input values. Coalesce(c1,c2,c3).
7) Concat – It is used to concatenate 2 strings – It can be used as str1 || str2 or as str1
concat str2 or as Concat(str1,str2).
8) Date – Converts a character string into the equivalent date.
If the input represents a valid date/timestamp, it is converted as it is.
If the input is 7 bytes long, it is assumed to be yyyynnn (Julian) and converted.
If the input is numeric, it is treated as a number of days counted from 0001-01-01.
select week('2002-07-21'),
year('2002-07-21'),
lcase(name), ucase(name),
-- length always returns the same value for all rows, except for varchar columns
length(name),
-- nodenumber returns the table partition where the row is; returns 0 if the table is not
-- partitioned; cannot be used with Group by fields, as the value cannot be associated
-- with a specific row
nodenumber(empno),
-- repeat(string, number of times)
repeat('India',5),
-- substr(string,start,len): error if len > (length(string) - start + 1)
substr(name,3,5)
from emp0710
Order By – It acts on the final output of a query and sorts the data in the requested order.
It cannot be used on intermediate results (like a subquery).
We can order by column name (this need not be one of the selected
columns), by column number, or by an expression
like substr(name,2).
select empno,salary,bonus
from emp0710
order by 2 --> using column number
select empno,salary,bonus
from emp0710
order by dob desc --> using a specific column which is not in the select list
select empno,salary,bonus
from emp0710
order by substr(empno,2) --> using a function output
Group By – It is used to combine multiple rows into one. Having is used to select which
of the groups are to be retrieved.
Rules: There can be only one group by per select, but we can have one per select in a
multi-select query.
The select items must either be those in the group by or they must be
column functions.
The result of the query should have a distinct set of rows where the unique
identifier is the set of fields grouped on.
All null values in the group by fields are considered equal.
We can group by fields as well as expressions such as salary+comm, but not by a
column function such as sum(salary+comm).
Select salary+comm,Avg(Bonus),count(*)
From Employee
Where Dept <> 100
Group by salary+comm
Having Avg(Bonus) > 10000
Select Dept,Avg(Bonus),count(*),count(Empno),sum(salary+bonus)
From Employee
Where Dept <> 100
Group by Dept
Having Avg(salary) > 8000
Select Dept,Avg(Bonus),count(*),count(Empno),sum(salary+bonus)
From Employee
Where year(hiredate) <> 2001 ----> Where must be coded before Group by and Having
Group by Dept
Having Avg(salary) > 8000
Select year(hiredate),Avg(Bonus),count(empno),sum(salary+bonus)
From Employee
Group by year(hiredate)
Joins - They are used to get data from more than one table linking them based on the
Relationship between the tables.
Join Types:
Outer Join:
Useful when we want rows that have the matching values and also the ones
that don’t have matching rows in the other table.
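A sketch contrasting an inner join with a left outer join, assuming EMPLOYEE (EMPNO, WORKDEPT) and DEPARTMENT (DEPTNO, DEPTNAME) tables related through WORKDEPT = DEPTNO.

```sql
-- Inner join: only employees whose WORKDEPT matches a department
SELECT A.EMPNO, A.WORKDEPT, B.DEPTNAME
FROM EMPLOYEE A
     INNER JOIN DEPARTMENT B
     ON A.WORKDEPT = B.DEPTNO;

-- Left outer join: every employee is returned;
-- DEPTNAME is NULL when no department matches
SELECT A.EMPNO, A.WORKDEPT, B.DEPTNAME
FROM EMPLOYEE A
     LEFT OUTER JOIN DEPARTMENT B
     ON A.WORKDEPT = B.DEPTNO;
```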
NOTE:
THE JOIN CONDITION CAN BE ANY OF THE COMPARISON OPERATORS.
SELECT A.DEPTNO,A.DEPTNAME,A.ADMRDEPT,B.DEPTNAME
FROM DEPARTMENT A,
DEPARTMENT B
WHERE B.ADMRDEPT = A.DEPTNO
CARTESIAN PRODUCT:
It results when no join condition is specified: every combination of rows from the
tables is returned. A condition that excludes hardly any rows, such as a <> comparison,
gives nearly the same result.
or as
SELECT EMPNO,(FNAME || ' ' || LNAME) AS NAME,WORKDEPT,DEPTNAME,JOB
FROM EMPLOYEE A,
DEPARTMENT B
where A.WORKDEPT <> B.DEPTNO
FROM EMPLOYEE A
LEFT OUTER JOIN
DEPARTMENT B
ON A.EMPNO = B.MGRNO
SUBQUERIES:
Syntax:
SOME: The subquery is true if any row matches (same for ANY)
ALL : The subquery is true ONLY if ALL rows match
IN : matches one of the values listed by the subquery
EXISTS: there exists at least one matching row in the subquery.
Example:
SELECT EMPNO,WORKDEPT,DEPTNAME
FROM EMPLOYEE A
WHERE WORKDEPT = ANY(SELECT DEPTNO FROM DEPARTMENT);
SELECT A.EMPNO,A.WORKDEPT,A.DEPTNAME
FROM EMPLOYEE A
WHERE A.WORKDEPT = (SELECT B.DEPTNO FROM DEPARTMENT B WHERE
A.WORKDEPT=B.DEPTNO);
Examples:
SELECT EMPNO,WORKDEPT,DEPTNAME
FROM EMPLOYEE A
WHERE WORKDEPT = SOME(SELECT DEPTNO FROM DEPARTMENT);
SELECT EMPNO,WORKDEPT,DEPTNAME
FROM EMPLOYEE A
WHERE WORKDEPT IN (SELECT DEPTNO FROM DEPARTMENT);
WHERE A.SALARY >= ALL (SELECT SALARY FROM EMPLOYEE B)
SELECT *
FROM EMPLOYEE A
WHERE (A.BONUS,A.SALARY) > (SELECT AVG(B.BONUS),AVG(B.SALARY)
FROM EMPLOYEE B)
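EXISTS is listed in the syntax above but has no example. A sketch, assuming the EMPLOYEE and DEPARTMENT tables used in the other queries, with DEPARTMENT.MGRNO holding the employee number of the manager:

```sql
-- Employees who manage at least one department
SELECT A.EMPNO, A.WORKDEPT
FROM EMPLOYEE A
WHERE EXISTS (SELECT 1
              FROM DEPARTMENT B
              WHERE B.MGRNO = A.EMPNO);

-- Employees who manage no department
SELECT A.EMPNO, A.WORKDEPT
FROM EMPLOYEE A
WHERE NOT EXISTS (SELECT 1
                  FROM DEPARTMENT B
                  WHERE B.MGRNO = A.EMPNO);
```

Since EXISTS only tests whether a row is returned, the select list of the subquery does not matter; a constant is used here.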
For an application program to access DB2 data, the SQL statements have to be embedded in
the program statements of the high level language, such as COBOL, PL/I, etc.
Error Handling Whenever
Comparison of Application that accesses data from Flat files VS Application accessing DB2 data
Though not mandatory, declare all the tables which are used in the application.
This should be declared in the working-storage section before any other db2-related
variable. This helps in reducing the precompiler’s work.
This is done using DCLGEN command. It reads the DB2 catalog to determine the
structure of the table and builds a Cobol Copybook. This copybook has the
Embedded ‘Declare table’ statement and also working-storage host variable definitions
for each column in the table.
However, it is not mandatory that only this copybook should be used in the program.
We can also hardcode the same information in the program, but not a good practice.
The fields defined in the SQLCA (SQL Communication Area) are used to describe the success or
failure of execution of each embedded SQL statement.
SQLSTATE - Similar to SQLCODE . Has a 2 char Class code denoting the error type
and a 3 Char subclass Code denoting the exact error in that error type.
A group of Sqlcodes are associated to a single sqlstate.
The return code can be checked after every sql and action taken depending on the
requirement.
Host Variables -
Host Variable is an area of storage allocated by the host language and referenced in an SQL
statement.
They are the means of moving data from the program to Database and vice-versa.
In addition they also are used as values to be compared with in predicates and are
dynamically populated at run time.
Host Structures
They are group level data items containing a set of host variables. If all of them are to be
used as the target of retrieval, the INTO clause needs to refer only to :Group_Data.
The sql defined in a cursor may be executed at the time of OPEN Cursor or for every fetch.
A Cursor cannot be used for updates if it is defined with any one of the following –
UNION
DISTINCT
GROUP BY
ORDER BY
JOIN
SUBQUERY
CORRELATED SUBQUERY
TABLES IN RO/UT MODES
READ-ONLY VIEWS
Note:
LOAD outperforms individual inserts by 50-75%
ISOLATION LEVEL specified at the statement level is preferable because it can vary from one
stmt to another and it overrides that in plan or package.
Cursors are implicitly closed when a COMMIT is issued or when the program ends.
Using WITH HOLD helps in retaining the result table for the select defined in the cursor
across a COMMIT.
This way we can continue to process the cursor from where we left off.
Otherwise the cursor will have to be reopened and repositioned.
While declaring cursors it is a good practice to specify 'FOR FETCH (READ) ONLY' or 'FOR
UPDATE OF Col1, Col2, …'.
At the SQL Level also we can specify Isolation Levels using WITH clause.
This will override what was specified during Bind Package/Plan.
This can be done For –
Declare Cursor
Select Into
Insert
Update with Where
Delete with Where
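The cursor options discussed above can be sketched as follows. Only the SQL text is shown; in a COBOL program each statement would be wrapped in EXEC SQL … END-EXEC. The cursor names and host variables (:WS-DEPT, :WS-NEW-SALARY) are hypothetical, and an EMPLOYEE table with EMPNO, SALARY and WORKDEPT columns is assumed.

```sql
-- Read-only cursor that survives COMMIT and reads without taking locks;
-- WITH UR overrides the isolation level given at Bind time
DECLARE CSR_READ CURSOR WITH HOLD FOR
  SELECT EMPNO, SALARY
  FROM EMPLOYEE
  WHERE WORKDEPT = :WS-DEPT
  FOR FETCH ONLY
  WITH UR;

-- Updatable cursor naming the columns that may be changed
DECLARE CSR_UPD CURSOR FOR
  SELECT EMPNO, SALARY
  FROM EMPLOYEE
  WHERE WORKDEPT = :WS-DEPT
  FOR UPDATE OF SALARY;

-- Positioned update on the row last fetched through CSR_UPD
UPDATE EMPLOYEE
  SET SALARY = :WS-NEW-SALARY
  WHERE CURRENT OF CSR_UPD;
```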
COMMIT_RESTART Logic:
COMMIT is the DB2 statement which saves the updates made to DB2 tables since the start of
the program or the last COMMIT.
It does so by physically writing all the changes to the DB2 log.
When a program abends, the changes are rolled back to the last Commit / Sync point.
For example, suppose we have an application that updates 2 tables which are
related to each other, and suppose the update of table1 is COMMITed and the program abends
before COMMITing the update to the other. The data in the 2 tables will no longer be
in sync. To achieve data integrity, we must issue a COMMIT only when the processing
reaches a logical end.
Checkpoint-Restart Logic:
Application programs which process large volumes of data and modify
DB2 tables need to incorporate restart logic for when a system error occurs.
This helps in continuing the processing from the record next to the last successfully
processed row.
Declare 2 cursors – both with ORDER BY for the columns that form the unique index for the
table. One cursor will be used to do normal processing. The other cursor while restarting
the application is used to reposition at the record following the last saved record . This is
done using additional predicates like Where Key > Last_Key
For every row that is fetched, processing is done, tables are updated. Then based on
elapsed time since last commit or commit frequency, etc, COMMIT is issued
Before every COMMIT is issued, the ckpt_restart table is updated with the Current key for
Which processing is complete and the current timestamp.
If the program is restarted, it reads the key in the ckpt table and uses that in 2nd cursor to
Reposition. Then continues processing.
After the processing is complete for all the rows, the ckpt table keys are set to default
Values.
If file inputs are used for processing, then we need to reposition the input record by saving
the read_count in the ckpt table, and upon restart perform reads in a loop till the count is >
the read_count.
Or else, if the file is sorted on the same key that is stored in the ckpt table, we can
reposition to the next key.
If there are file outputs, they need to have DISP=MOD in the JCL to continue appending
the records.
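The DB2 side of the checkpoint-restart steps above might be sketched as follows. Only the SQL text is shown; in a COBOL program each statement would sit inside EXEC SQL … END-EXEC. The table layout, cursor name and host variables are assumptions for illustration, and EMPLOYEE with a unique EMPNO key stands in for the table being processed.

```sql
-- Hypothetical checkpoint table: one row per restartable program
CREATE TABLE CKPT_RESTART
  (PGM_NAME  CHAR(8)   NOT NULL,
   LAST_KEY  CHAR(6)   NOT NULL WITH DEFAULT,
   CKPT_TS   TIMESTAMP NOT NULL WITH DEFAULT);

-- On restart: read the key of the last row that was fully processed
SELECT LAST_KEY
  INTO :WS-LAST-KEY
  FROM CKPT_RESTART
  WHERE PGM_NAME = :WS-PGM-NAME;

-- Restart cursor: repositions just after the saved key
DECLARE CSR_RESTART CURSOR WITH HOLD FOR
  SELECT EMPNO, SALARY
  FROM EMPLOYEE
  WHERE EMPNO > :WS-LAST-KEY
  ORDER BY EMPNO;

-- Before every COMMIT: save the current key and timestamp
UPDATE CKPT_RESTART
  SET LAST_KEY = :WS-CURR-KEY,
      CKPT_TS  = CURRENT TIMESTAMP
  WHERE PGM_NAME = :WS-PGM-NAME;
COMMIT;
```

Updating the checkpoint row in the same unit of work as the data changes means that a rollback also restores the previous saved key.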
Dynamic SQLs
Static SQL is one wherein the SQL is hardcoded and only the values of host variables used in
predicates can change during execution.
Static SQL is prepared before execution.
In Dynamic SQL, the columns, tables, and predicates are decided at run time.
Dynamic SQL is prepared at the time of execution.
Execute Immediate –
They are used to implicitly prepare and execute complete sql statements.
This cannot be used to SELECT data.
They are not useful if the same statement has to be executed multiple times.
Because they are destroyed immediately after execution. And must be Prepared and Executed
again.
01 SQL-TEXT.
03 SQL-LEN PIC S9(4) COMP.
03 SQL-TXT PIC X(100).
EXEC SQL
PREPARE STMT1 FROM :SQL-TEXT
END-EXEC
EXEC SQL
EXECUTE STMT1
END-EXEC
EXEC SQL
PREPARE STMT1 FROM :SQL-TEXT
END-EXEC
Fixed-List Select –
To Explicitly prepare and execute sql select statements when the columns to be retrieved are
fixed.
Move 'Select Projno, Projname, Empno from Proj where Projno = ? and Proj_Start_Date = ?' to
SQL-TEXT.
Exec SQL
Declare CSR Cursor for FSQL
End-Exec
Exec SQL
Prepare FSQL from :SQL-TEXT
End-Exec
Exec SQL
Open CSR using :var1, :var2
End-Exec
Exec SQL
Fetch CSR into :Ws-Projno, :Ws-Projname, :Ws-Empno
End-Exec
Note: This is like a static SQL only. But suppose at run time, we need to execute the following
Move 'Select Projno, Projname, Empno from Proj where Projno = ? and Proj_End_Date = ?' to
SQL-TEXT.
Since the 2 predicates are of the same datatype, we can use the same var2. Otherwise we need to
use a different OPEN for the 2nd type.
Varying-List Select -
To Explicitly prepare and execute sql select statements when the columns to be retrieved are
also not fixed.
SQLDA – This is the SQL Descriptor Area used by DB2 to communicate information about the
Dynamic sql (fixed-List and varying-list Select) to an application program like – type of SQL,
the number and data type of columns being returned by the SQL.
This is included as
Exec SQL
Include SQLDA
End-Exec
We use SQLDA to code the PREPARE, FETCH statements and also have steps to store host
variable addresses in SQLDA.
Suppose an application has to read a sql from a terminal and execute it.
Have 2 SQLDAs - One is the full include and the other with only minimal info – MINSQLDA.
Exec SQL
Prepare FSQL Into MINSQLDA From :SQL-TEXT
End-Exec
If SQLD in MINSQLDA = 0
/* no variables used ,i.e, a non-select */
Exec SQL
Execute Immediate :SQL-TEXT
End-Exec
Else
Exec SQL
Declare CSR Cursor for FSQL
End-Exec
Exec SQL
Prepare FSQL Into SQLDA From :SQL-TEXT
End-Exec
Exec SQL
OPEN CSR
End-Exec
Exec SQL
Fetch CSR Using Descriptor SQLDA
End-Exec
Program Preparation Steps
Program Preparation –
This involves preparation of an executable load module and a DB2 application plan for each
application .
They can be used only together.
DCLGEN Command when used for a particular table reads the DB2 Catalog to determine the
table definition and builds a Cobol copybook . This copybook contains the Declare Table
statement and host variable definitions for the columns in that table.
Precompilation:
Done by Precompiler
Expands DB2 INCLUDE members
Comments out all the Embedded Sql Text and for every executable Sql, a call is
added to the DB2 runtime interface module – DSNHLI along with the necessary
parameters.
Extracts all the sql statements and places them in Database Request Module –
DBRM
Places a unique timestamp token in the modified source and DBRM
Reports the success or failure of the precompile process.
Db2 need not be operational at the time of precompilation.
Compilation :
The modified source code is compiled using the standard Cobol Compiler.
Db2 need not be operational at the time of compilation.
Bind :
Invokes the DB2 Optimiser and the execution happens in the Relational Data
Services Component of DB2
It reads each sql statement from the DBRM and comes up with an optimized
access path for that sql.
Bind is of 2 types –
Bind Package accepts as input a single DBRM and produces the execution logic
for only the SQLs in that DBRM. It is not executable. One or more packages are bound
into an application plan.
Bind Plan accepts as input one or more DBRMs and one or more packages
produced by Bind Package. It produces the executable SQL logic for all the
SQLs in the DBRMs attached to it.
Plan is the unit of execution, but can be used only with the corresponding
load module.
Bind Authorisation of the user is validated.
Determines the Optimal Access Path for each SQL Statement based on the DB2
Catalog Statistics (such as the availability of indexes, organization of data, table
Size).
Linkage:
A DB2 Program has 2 components used in execution – one is load module and the other is
application plan.
SYSDBRM has information on every DBRM which is bound into a package or a plan.
SYSSTMT has all the SQL statements of a DBRM which are bound into a plan.
SYSPACKSTMT has all the SQL statements of a DBRM which are bound into a package.
During BIND PLAN, the following DB2 Catalog tables are read –
SYSCOLDIST
SYSCOLDISTSTATS
SYSCOLSTATS
SYSCOLUMNS
SYSINDEXES
SYSINDEXSTATS
SYSPLAN SYSPACKAGE
SYSPLANAUTH SYSPACKAUTH
SYSTABLES
SYSTABLESPACE
SYSTABSTATS
SYSUSERAUTH
And information about the plan is stored in the following DB2 tables –
SYSDBRM
SYSPACKAUTH
SYSPACKLIST
SYSPLAN SYSPACKAGE
SYSPLANAUTH SYSPACKAUTH
SYSPLANDEP SYSPACKDEP
SYSPLSYSTEM SYSPKSYSTEM
SYSSTMT SYSPACKSTMT
SYSTABAUTH
During Bind Package, the package is associated with a specific collection id.
(1) The same DBRM can be bound into different packages which are associated to
different collection ids and using different qualifiers during BIND. And both the collections
are used in the BIND PLAN. At run time, the same application can be made to access
different sets of DB2 objects by using ‘SET CURRENT PACKAGESET = COLLECTION ID’
(2) The packages of logically related DB2 programs are grouped into the same collection.
BIND Parameters –
(1) Qualifier – The ID specified here qualifies all the tables which are referred to in that
DBRM. If Qualifier is not specified, the default qualifier is the owner, which is the
primary/secondary auth id.
(2) ACTION - ADD /REPLACE
(3) Isolation Level – specifies the mode of page locking used when a pgm is executed.
They are of 4 types –
CS – Cursor Stability
This releases page locks used for read as soon as another page is accessed. This will
help in improving Concurrency.
UR – Uncommitted Read
Also called Dirty Read. This facilitates accessing DB2 data without taking any locks,
that is, reading data which is being changed by some other application.
This avoids concurrency problems, but doesn't guarantee accurate data.
This isolation applies only to read statements. For others CS is applied.
RR – Repeatable Read
This is the default . All page locks are held till a COMMIT is issued.
This is useful only when the application requires that the same row be accessed more
than once and the data returned by each fetch has to be consistent, i.e. when it
is essential to ensure data integrity.
RS – Read Stability
Similar to RR. It doesn’t allow the page lock to be released till COMMIT point. But
allows new data to be inserted.
(4) ACQUIRE and RELEASE – These 2 parameters specify the mode of tablespace locking
used. And they are specified only at Plan level and not Package level.
ACQUIRE –
Use - ts locks are acquired when the tablespace is first accessed -- Default
Allocate – ts locks are acquired when the plan is allocated.
RELEASE –
Commit – ts locks are released during Commit or Rollback. -- Default
Deallocate – ts locks are released when the plan is deallocated.
(5) VALIDATE - Method of checking the existence and validity of DB2 tables and DB2
access authorization.
Bind - The validation done at the time of Bind
Run - The validation done each time the plan is executed
(6) FLAG(I) – This returns all the information – warning, error and completion messages,
indicating the success or failure of Bind.
(7) EXPLAIN(YES) - Records the access path chosen for each SQL statement in the PLAN_TABLE.
(8) RETAIN – For existing plans, we should specify the Retain parm to retain the Bind and
Execution authority already granted for this plan. Otherwise all the authority will be
revoked.
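The access path information that EXPLAIN produces can also be obtained for a single statement and then queried. A sketch, assuming a PLAN_TABLE already exists under the current authorization ID; the QUERYNO value, the sample query and the department value 'D11' are arbitrary.

```sql
-- Record the access path of one statement under query number 100
EXPLAIN PLAN SET QUERYNO = 100 FOR
  SELECT EMPNO, SALARY
  FROM EMPLOYEE
  WHERE WORKDEPT = 'D11';

-- Inspect the chosen access path
SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD, TNAME,
       ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY
FROM PLAN_TABLE
WHERE QUERYNO = 100
ORDER BY QBLOCKNO, PLANNO;
```

ACCESSTYPE distinguishes, for example, an index access from a table space scan, and ACCESSNAME names the index used.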
DB2 Utilities
DB2 Logs – DB2 keeps a log of all changes made to TS. All updates are recorded in
DB2 active log. When the active log becomes full, it creates an archive log. There
could be multiple archive logs created during application processing. All the info is
stored in the DB2 Directory’s SYSIBM.SYSLGRNG table.
(8) LOAD Utility – used to do bulk inserts into table . It can add or replace existing data.
(8) RUNSTATS – This utility collects statistical information for Tables, Tablespaces,
Partitions, Indexes and Columns in tables. It can be used to update all the information
in the relevant catalog tables or simply to generate all the information
in the form of a report.
It is this statistical data which is used by the DB2 optimiser to arrive at an optimal
access path.
It is good to run RUNSTATS after every LOAD.
After running RUNSTATS, the plans and packages with static SQL have to be rebound
to have the access paths based on the recent statistics.
(9) REORG – Used to reorganize DB2 Tablespaces and Indexes, thereby improving the
efficiency of access to those objects. REORG reclusters data and resets free space to the
amount specified in the CREATE DDL.
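Whether RUNSTATS has refreshed the catalog can be checked with a query against the catalog itself. A sketch, assuming a table EMPLOYEE owned by a hypothetical creator PRODOWNR:

```sql
-- Row count, page count and time of the last statistics update for one table
SELECT NAME, CREATOR, CARDF, NPAGES, STATSTIME
FROM SYSIBM.SYSTABLES
WHERE CREATOR = 'PRODOWNR'
  AND NAME = 'EMPLOYEE';
```

A STATSTIME older than the last LOAD or REORG suggests that RUNSTATS and a rebind are due.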
Catalog Tables: