2.7 - db2 Purexml
Summer/Fall 2010
DB2® pureXML
Information Management
Agenda
■ Overview of XML
■ pureXML in DB2
■ XML Data Movement in DB2
■ XQuery and SQL/XML
■ XML Indexes in DB2
■ Application Development
What is XML?
■ eXtensible Markup Language
– XML is a language designed to describe data
■ A hierarchical data model
<book>
  <authors>
    <author id="47">John Doe</author>
    <author id="58">Peter Pan</author>
  </authors>
  <title>Database systems</title>
</book>
Characteristics of XML
Multiple Standards
■ Insurance: ACORD XML for P&C, Life +++
■ Telecommunications: eTOM, NGOSS, etc.; Parlay Specification +++
■ Financial Markets: FIX Protocol, FIXML, MDDL, RIXML, FpML +++
■ Automotive: ebXML, other B2B Stds.
■ Energy & Utilities: IEC Working Group 14, CIM, Multispeak
■ Cross Industry: PDES/STEPml, SMPI Standards, RFID, DOD XML +++
■ Chemical & Petroleum: Chemical eStandards, CyberSecurity, PDX Standard +++
<book>
  <authors>
    <author id="47">John Doe</author>
    <author id="58">Peter Pan</author>
  </authors>
  <title>Database systems</title>
  <price>29</price>
  <keywords>
    <keyword>SQL</keyword>
    <keyword>relational</keyword>
  </keywords>
</book>
■ Root element: <book>
■ Attribute: for example id="47"
■ Element: for example <price>
■ Text node (data): for example 29
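The four node kinds labeled above can be inspected with any XML parser. The following is a minimal sketch in Python using the standard library's ElementTree, not DB2 itself; the names BOOK_XML and describe_book are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The book document from the slide.
BOOK_XML = """<book>
  <authors>
    <author id="47">John Doe</author>
    <author id="58">Peter Pan</author>
  </authors>
  <title>Database systems</title>
  <price>29</price>
  <keywords>
    <keyword>SQL</keyword>
    <keyword>relational</keyword>
  </keywords>
</book>"""


def describe_book(xml_text):
    """Return the root element name, attributes, and text nodes of the document."""
    root = ET.fromstring(xml_text)
    return {
        "root": root.tag,                          # root element
        "authors": [(a.get("id"), a.text)          # attribute value and text node
                    for a in root.iter("author")],
        "title": root.findtext("title"),           # text node of an element
        "price": root.findtext("price"),
        "keywords": [k.text for k in root.iter("keyword")],
    }


summary = describe_book(BOOK_XML)
print(summary)
```

Note that every value comes back as text: XML itself does not say that the price 29 is a number.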
[Figure: tree of a customerInfo document with two customer nodes; node labels include 1.3.2, 1.3.1.3 and 1.2.1.1.5.3]
12 © 2010 IBM Corporation
[Figure: the base table (DAT object) with columns deptID, ..., custDoc and rows A001, A002, ...; the regions index; and the XDA object that holds the XML data]
Like LOBs, XML data is stored separately from the base table (unless inlined).
■ Explicit XMLPARSE
– Transforms an XML value from its serialized (text) form into the internal representation.
– Tells the system how to treat whitespace (strip or preserve).
• The default is STRIP WHITESPACE.
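The strip/preserve choice can be illustrated outside DB2. The sketch below uses Python's ElementTree as a rough analogy, on the assumption that stripping means dropping whitespace-only text between tags while leaving real text untouched; strip_boundary_whitespace and SERIALIZED are hypothetical names, and this is not how DB2 implements the option.

```python
import xml.etree.ElementTree as ET


def strip_boundary_whitespace(elem):
    """Drop whitespace-only text between tags, in place; keep real text as-is."""
    if elem.text is not None and elem.text.strip() == "":
        elem.text = None
    for child in elem:
        strip_boundary_whitespace(child)
        if child.tail is not None and child.tail.strip() == "":
            child.tail = None
    return elem


# A serialized (text) document with indentation between the tags.
SERIALIZED = "<dept>\n  <employee>\n    <name>John  Doe</name>\n  </employee>\n</dept>"

preserved = ET.fromstring(SERIALIZED)                             # whitespace kept
stripped = strip_boundary_whitespace(ET.fromstring(SERIALIZED))   # whitespace dropped

preserved_text = ET.tostring(preserved, encoding="unicode")
stripped_text = ET.tostring(stripped, encoding="unicode")
print(stripped_text)
```

The two spaces inside "John  Doe" survive in both variants because they are part of a text node that contains data.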
import from /data/dept.del of del XML from /data/xmlfiles insert into dept

/data/dept.del
1000,"<XDS FIL='C1.xml' />"
1001,"<XDS FIL='C2.xml' />"
1002,"<XDS FIL='C3.xml' />"
1003,"<XDS FIL='C4.xml' />"
1004,"<XDS FIL='C5.xml' />"

/data/xmlfiles (directory that includes the XML files that are referenced in the DEL file)
/data/xmlfiles/C1.xml
/data/xmlfiles/C2.xml
/data/xmlfiles/C3.xml
/data/xmlfiles/C4.xml
/data/xmlfiles/C5.xml

dept
1000 <dept><employee><name>John Doe</name>
<address><street>555 Bailey Ave</street><city>…</city><zip>95141</zip>
</address>…</employee></dept>
1001 <dept><employee><name>Kathy Smith</name> …
1002 <dept><employee><name>Jim Noodle ….
What to export
dept
1000 <dept><employee><name>John Doe</name>
<address><street>555 Bailey Ave</street><city>…</city><zip>95141</zip>
</address>…</employee></dept>
1001 <dept><employee><name>Kathy Smith</name> …
1002 <dept><employee><name>Jim Noodle ….

/data/dept.del
1000,"<XDS FIL='C1.xml' />"
1001,"<XDS FIL='C2.xml' />"
1002,"<XDS FIL='C3.xml' />"
1003,"<XDS FIL='C4.xml' />"
1004,"<XDS FIL='C5.xml' />"

/data/xmlfiles
/data/xmlfiles/C1.xml
/data/xmlfiles/C2.xml
/data/xmlfiles/C3.xml
/data/xmlfiles/C4.xml
/data/xmlfiles/C5.xml
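Each row of the DEL file carries a key and an XDS entry whose FIL attribute names a file in the XML directory. The sketch below shows how that pairing can be resolved. It is an illustration in Python, not the logic of the DB2 import or export utility; it handles only the FIL attribute shown on the slide, and resolve_xds and DEL_TEXT are hypothetical names.

```python
import csv
import io
import posixpath
import xml.etree.ElementTree as ET

# First rows of /data/dept.del as shown on the slide.
DEL_TEXT = '''1000,"<XDS FIL='C1.xml' />"
1001,"<XDS FIL='C2.xml' />"
1002,"<XDS FIL='C3.xml' />"
'''


def resolve_xds(del_text, xml_dir):
    """Pair each key with the full path of the XML file its XDS entry names."""
    resolved = []
    for key, xds in csv.reader(io.StringIO(del_text)):
        # The XDS entry is itself a small XML element; read its FIL attribute.
        file_name = ET.fromstring(xds).get("FIL")
        resolved.append((int(key), posixpath.join(xml_dir, file_name)))
    return resolved


pairs = resolve_xds(DEL_TEXT, "/data/xmlfiles")
for key, path in pairs:
    print(key, path)
```

Keeping the documents in separate files means the DEL file stays small and line-oriented, whatever the size of the XML.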
XPath
<customerInfo>
  <customer id="1">
    <name>Victor</name>
    <sex>M</sex>
    <phone type="work">739-1274</phone>
  </customer>
  <customer id="2">
    <name>April</name>
    <sex>F</sex>
    <phone type="home">983-2179</phone>
  </customer>
</customerInfo>

Parsing the document produces the path table:
Path Table
/
/customerInfo
/customerInfo/customer/@id
/customerInfo/customer/name
/customerInfo/customer/sex
/customerInfo/customer/phone
/customerInfo/customer/phone/@type

[Figure: document tree with a customerInfo root and two customer children]
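A path table of this kind can be derived by walking the parsed document and recording each distinct element and attribute path once. The following Python sketch assumes nothing about DB2's internal format, and path_table is a made-up name. Unlike the table above, it also lists the intermediate path /customerInfo/customer.

```python
import xml.etree.ElementTree as ET

CUSTOMER_XML = """<customerInfo>
  <customer id="1">
    <name>Victor</name>
    <sex>M</sex>
    <phone type="work">739-1274</phone>
  </customer>
  <customer id="2">
    <name>April</name>
    <sex>F</sex>
    <phone type="home">983-2179</phone>
  </customer>
</customerInfo>"""


def path_table(xml_text):
    """Return the distinct element and attribute paths in document order."""
    paths = ["/"]

    def visit(elem, parent_path):
        path = parent_path + "/" + elem.tag
        # The element's own path first, then one path per attribute.
        for candidate in [path] + [path + "/@" + name for name in elem.attrib]:
            if candidate not in paths:
                paths.append(candidate)
        for child in elem:
            visit(child, path)

    visit(ET.fromstring(xml_text), "")
    return paths


table = path_table(CUSTOMER_XML)
for entry in table:
    print(entry)
```

The second customer adds no new entries: both customers share the same paths, so the table grows with the variety of the structure, not with the number of nodes.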
Introduction to XQuery
■ Unlike relational data (which is predictable and has a regular
structure), XML data is:
– Often unpredictable
– Highly variable
– Sparse
– Self-describing
xquery db2-fn:xmlcolumn("XMLCUSTOMER.INFO")/customerinfo/name;
[Figure: name elements returned from the documents in the column]
xquery db2-fn:sqlquery("SELECT INFO FROM XMLCUSTOMER WHERE CID=1001")/customerinfo/name;
[Figure: resulting name element]
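The difference between the two input functions is the set of documents the path expression runs over: db2-fn:xmlcolumn supplies every document in the column, while db2-fn:sqlquery supplies only the documents selected by the SQL statement. The sketch below mimics that in Python. The rows in XMLCUSTOMER are invented sample data (the CID-to-name pairing is an assumption), names is a hypothetical helper, and it returns text values rather than name elements as XQuery would.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the XMLCUSTOMER table: (CID, INFO) rows.
XMLCUSTOMER = [
    (1001, "<customerinfo><name>Amir Malik</name></customerinfo>"),
    (1003, "<customerinfo><name>John Smith</name></customerinfo>"),
]


def names(documents):
    """Evaluate /customerinfo/name over each document; return the matched text."""
    result = []
    for doc in documents:
        root = ET.fromstring(doc)
        if root.tag == "customerinfo":
            result.extend(node.text for node in root.findall("name"))
    return result


# Analogous to db2-fn:xmlcolumn: every document in the column.
all_names = names(info for _, info in XMLCUSTOMER)

# Analogous to db2-fn:sqlquery with WHERE CID=1001: relational filter first.
one_name = names(info for cid, info in XMLCUSTOMER if cid == 1001)

print(all_names, one_name)
```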
SQL/XML Functions
■ XQuery can be invoked from SQL
– XMLQUERY()
– XMLTABLE()
– XMLEXISTS()
SELECT XMLQUERY('$i/customerinfo/name' PASSING INFO AS "i")
FROM CUSTOMER

Result:
1
<name>...</name>
<name>...</name>
...
<customerinfo>
  <name>John Smith</name>
  <addr country="Canada">
    <street>Fourth</street>
    <city>Calgary</city>
    <prov-state>Alberta</prov-state>
    <pcode-zip>M1T 2A9</pcode-zip>
  </addr>
  <phone type="work">963-289-4136</phone>
</customerinfo>

NAME        STREET  CITY
Amir Malik  Young   Toronto
John Smith  Fourth  Calgary
…           …       …
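The NAME / STREET / CITY table is presumably the kind of relational result that XMLTABLE() produces: one row per document and one column per path. The idea of mapping paths to columns can be sketched in Python as follows. This is an analogy, not DB2 syntax; shred and COLUMNS are hypothetical names.

```python
import xml.etree.ElementTree as ET

CUSTOMER_DOC = """<customerinfo>
  <name>John Smith</name>
  <addr country="Canada">
    <street>Fourth</street>
    <city>Calgary</city>
    <prov-state>Alberta</prov-state>
    <pcode-zip>M1T 2A9</pcode-zip>
  </addr>
  <phone type="work">963-289-4136</phone>
</customerinfo>"""

# Column name -> path relative to the customerinfo root element.
COLUMNS = {"NAME": "name", "STREET": "addr/street", "CITY": "addr/city"}


def shred(documents, columns):
    """Turn each XML document into one row with a value per column path."""
    rows = []
    for doc in documents:
        root = ET.fromstring(doc)
        rows.append({col: root.findtext(path) for col, path in columns.items()})
    return rows


rows = shred([CUSTOMER_DOC], COLUMNS)
print(rows)
```

A path that matches nothing yields None for that column, which is how sparse XML data shows up in a fixed set of relational columns.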
<customerinfo>
  <name>John Smith</name>
  <addr country="Canada">
    <street>Fourth</street>
    <city>Calgary</city>
    <prov-state>Alberta</prov-state>
    <pcode-zip>M1T 2A9</pcode-zip>
  </addr>
  <phone type="work">963-289-4136</phone>
</customerinfo>

CID   INFO
1003  (the XML document above)
XML Indexes
■ An index over XML data can improve the efficiency of queries on XML documents.
– Index entries provide access to nodes within the document; index keys are created from XML pattern expressions.
■ Like relational indexes, XML indexes have a cost:
– Slower INSERT, UPDATE and DELETE
– Space needed to store the indexes
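The trade-off in the bullets above can be made concrete with a toy index. The sketch below maps each value matched by a pattern to the ids of the documents that contain it, so a lookup by value does not have to parse every document. It is a conceptual illustration in Python only; DOCS holds invented sample data, build_index is a hypothetical helper, and real DB2 XML indexes are defined and stored differently.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical documents keyed by CID.
DOCS = {
    1001: "<customerinfo><name>Amir Malik</name></customerinfo>",
    1003: "<customerinfo><name>John Smith</name></customerinfo>",
}


def build_index(docs, pattern):
    """Map each value matched by `pattern` (relative to the root) to document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for node in ET.fromstring(text).findall(pattern):
            index[node.text].add(doc_id)
    return index


# Index on the pattern /customerinfo/name, written relative to the root element.
name_index = build_index(DOCS, "name")

# Query: which documents have the name "John Smith"? One dictionary lookup.
hits = sorted(name_index.get("John Smith", set()))
print(hits)
```

The cost side is visible too: name_index is extra storage, and every insert, update or delete of a document would have to keep it in step.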
[Figure: application development interfaces: C or C++, SQL Procedures, COBOL, C# and Visual Basic, Perl, PHP]
XML – Conclusion
■ Native XML hierarchical storage
– No shredding, no CLOBs, no BLOBs required
– Optimized for XPath and XQuery (LUW only) processing
■ High performance
– Superior indexing technology
– No parsing of XML data at query runtime
■ Fully integrated XML and relational processing
– Seamlessly query various types of data at once
– No internal translation of XQuery into SQL
E-mail: [email protected]
Subject: “DB2 Academic Workshop”