XM 301 Stud

Information contained in this document has not been submitted to any formal IBM test. There is no guarantee that the same or similar results will result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk. This document may not be reproduced in whole or in part without the prior written permission of IBM.


Front cover

Introduction to XML and Related Technologies


(Course Code XM301)

Student Notebook
ERC 4.1

IBM Certified Course Material

Trademarks

IBM is a registered trademark of International Business Machines Corporation.

The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both:

AFS, AS/400, Database 2, DFS, DRDA, IMS, Lotus, NetRexx, Open Blueprint, RACF, S/390, Tivoli Enterprise, TME 10, VTAM, AIX, CICS, DB2, Distributed Relational Database Architecture, Encina, Lotus Enterprise Integrator, MQSeries, Network Station, OS/2, RDN, SecureWay, Tivoli Management Environment, TXSeries, WebSphere, alphaWorks, ClearCase, DB2 Universal Database, Domino, Everyplace, Lotus Notes, MVS, Notes, OS/390, RS/6000, Tivoli, TME, VisualAge

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. SET and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC. Other company, product and service names may be trademarks or service marks of others.

July 2004 Edition


The information contained in this document has not been submitted to any formal IBM test and is distributed on an "as is" basis without any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

Copyright International Business Machines Corporation 2001, 2004. All rights reserved. This document may not be reproduced in whole or in part without the prior written permission of IBM.

Note to U.S. Government Users: Documentation related to restricted rights. Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.

V3.1
Contents

Trademarks
Course Description
Agenda

Unit 1. Introduction to XML and Related Technologies
  Introduction
  Course Description
  Audience
  Prerequisites
  Course Objectives (1 of 2)
  Course Objectives (2 of 2)
  Agenda - Day 1
  Agenda - Day 2
  Agenda - Day 3
  Unit Summary

Unit 2. Issues in Electronic Information Exchange
  Unit Objectives
  Electronic Information Exchange (1 of 2)
  Electronic Information Exchange (2 of 2)
  Intra-Application Information Exchange
  Agile Views - Multiple Client/Device Support
  Inter-Application Information Exchange
  Context-free Communication
  B2B Intercompany Information Exchange
  Need to Establish Common Ground for Communication
  Inter-system Information Exchange
  Exchanging Messages
  The Semantic Web
  A Common Solution?
  Checkpoint Questions (1 of 2)
  Checkpoint Questions (2 of 2)
  Unit Summary

Unit 3. What Is XML?
  Unit Objectives
  What Is XML?
  Example Tree Representation of XML
  A Simple XML Document - Basic Structure
  A Simple XML Document - Basic Nomenclature
  Basics of Well-formed XML (1 of 2)
  Basics of Well-formed XML (2 of 2)

Copyright IBM Corp. 2001, 2004


Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

  Element Rules - Rule 1. Single Root Element
  Element Rules - Rule 2. Element Tag Rules
  Element Rules - Rule 3. Element Nesting
  Element Nesting Example
  Element Rules - Rule 4. XML Naming Rules
  Rule 4... Tag Naming - Samples
  Rule 4... Element Content (1 of 2): General
  Rule 4... Element Content (2 of 2): Data
  Rule 4... PCDATA - Parsed Character Data
  Rule 4... CDATA - Character Data
  Rule 4... CDATA Examples
  Element Rules - Rule 5. Element Attributes
  Element Rules - Rule 6. XML Declaration (1 of 2)
  Element Rules - Rule 6. XML Declaration (2 of 2)
  Comments
  Internationalization and Encoding (1 of 2)
  Internationalization and Encoding (2 of 2)
  Processing Instruction
  Well-formed versus Valid
  HTML versus XML (1 of 2)
  HTML versus XML (2 of 2)
  HTML and XML Key Differences
  Checkpoint Questions (1 of 3)
  Checkpoint Questions (2 of 3)
  Checkpoint Questions (3 of 3)
  Unit Summary

Unit 4. WebSphere Studio Application Developer Overview
  Unit Objectives
  Roles-based Development
  Development Environment Goals
  IBM WebSphere Studio Family
  Family Contents
  WebSphere Studio Workbench
  WebSphere Studio Workbench Rationale
  WebSphere Studio Application Developer
  Terminology
  Perspectives
  Views
  Editors
  Online Help
  Cheat Sheets
  Application Developer Design Points
  Tooling
  Java IDE (1 of 3)
  Java IDE (2 of 3)
  Java IDE (3 of 3)
  J2EE Tooling (1 of 2)
  J2EE Tooling (2 of 2)
  Portlet Tooling
  Data Tooling (1 of 2)
  Data Tooling (2 of 2)
  Web Tooling (1 of 2)
  Web Tooling (2 of 2)
  XML Tooling (1 of 3)
  XML Tooling (2 of 3)
  XML Tooling (3 of 3)
  Performance/Trace Tooling
  Team Development
  Web Services Tooling (1 of 2)
  Web Services Tooling (2 of 2)
  Standards Support
  Review
  Unit Summary

Unit 5. Document Type Definition (DTD)
  Unit Objectives
  Review: Well-Formed XML
  Why Do We Need DTDs?
  What Is a DTD?
  What is Allowed in a DTD? (1 of 2)
  What is Allowed in a DTD? (2 of 2)
  XML and DTD Example
  What Is Allowed... Declaring Elements
  Element Content Models
  EMPTY Content Model
  ANY Content Model
  Elements Content Model
  Elements Content Examples (1 of 3)
  Elements Content Examples (2 of 3)
  Elements Content Examples (3 of 3)
  Mixed Content Model
  What Is Allowed... Declaring Attributes
  Organizational Note
  Attribute Types
  Attribute Default Declarations
  Attribute Default Declaration Examples
  Attribute Alternate Declaration
  Attribute Types: Tokenized Types: IDREFS Example
  Attribute Types: Tokenized Types: ENTITY Example
  Attribute Types: Tokenized Types: ENTITIES Example
  DTDs Part II
  Declaring ENTITYs: an Internal, Parsed ENTITYs Example
  Declaring ENTITYs: an External, Parsed ENTITYs Example
  Unparsed Entity Declarations: a Review
  Parameter ENTITYs
  Parameter ENTITYs - Another Example
  What Is Allowed... Declaring Comments
  Joining a DTD to an XML Instance
  External DTD Subset
  Internal DTD Subset
  Split DTD Subsets
  Whitespace and DTDs
  Ignorable Whitespace Example
  Validating versus Non-validating Processors
  Example DTDs
  What's Wrong with DTDs?
  Status of DTDs
  Tooling
  Checkpoint Questions (1 of 2)
  Checkpoint Questions (2 of 2)
  Unit Summary

Unit 6. XML Namespaces
  Unit Objectives
  Problem: Element and Attribute Names can be Ambiguous
  Elaboration
  Namespaces: The Big Idea
  XML Namespaces
  Qualified Names (QNames)
  Declaring Namespaces (1 of 2)
  Declaring Namespaces (2 of 2)
  Namespace Scope
  Default Namespaces
  Example - Default Namespaces
  Documents with Multiple Namespaces
  Elements with No Namespace
  Attributes and Namespaces
  Namespace Processing
  Example: Use of Namespaces
  Problems with Namespaces
  Best Practices
  Status of Namespaces
  More Information
  Checkpoint Questions
  Unit Summary

Unit 7. XML Schema
  Unit Objectives
  Approach
  What Is an XML Schema?
  Why Do We Need XML Schema?
  DTD versus XML Schema (1 of 2)
  DTD versus XML Schema (2 of 2)
  Requirements Applied to the XSD Language (1 of 3)
  Requirements Applied to the XSD Language (2 of 3)
  Requirements Applied to the XSD Language (3 of 3)
  Anatomy of an XML Schema
  A Simple XML Document - Return to Basics
  A Simple XML Document - Basic Nomenclature
  A Simple XML Document - Basic Schema Concepts
  Simple Types Built-in to XML Schema
  A Simple XML Document - How Studio Sees It
  A Simple XML Document - XSD Part 1 (1 of 7)
  A Simple XML Document - XSD Part 1 (2 of 7)
  A Simple XML Document - XSD Part 1 (3 of 7)
  A Simple XML Document - XSD Part 1 (4 of 7)
  A Simple XML Document - XSD Part 1 (5 of 7)
  A Simple XML Document - XSD Part 1 (6 of 7)
  A Simple XML Document - XSD Part 1 (7 of 7)
  A Simple XML Document - XSD Part 2 (1 of 4)
  A Simple XML Document - XSD Part 2 (2 of 4)
  A Simple XML Document - XSD Part 2 (3 of 4)
  A Simple XML Document - XSD Part 2 (4 of 4)
  A Simple XML Document - XSD Part 3 (1 of 2)
  A Simple XML Document - XSD Part 3 (2 of 2)
  A Simple XML Document - XSD Part 4 (1 of 2)
  A Simple XML Document - XSD Part 4 (2 of 2)
  A Simple XML Document - Connecting the Schema to the Instance
  What's Next?
  ...but first...
  XML Schema Part II
  Before We Begin: Some Notes about Studio
  Complex Type Definitions, Element and Attribute Declarations
  Parts of XSD Speech (1 of 2)
  Parts of XSD Speech (2 of 2)
  Resetting Expectations
  1.1 Simple Type (simpleType) Definition
  All the Built-in Simple Types
  Creating New Simple Types
  Facets
  Example: Constraining Facets
  Example: Enumeration Facet
  simpleContent and Empty Complex Types
  simpleContent Example
  1.2 Complex Type Definition (1 of 2)
  1.2 Complex Type Definition (2 of 2)
  Named versus Anonymous Types (1 of 2)
  Named versus Anonymous Types (2 of 2)
  Declaring Child Elements in complexType Elements
  Element Declaration: Common Usage
  minOccurs and maxOccurs
  Example: minOccurs and maxOccurs
  1.4 Attribute Declaration
  Declaring Attributes
  Example: Attribute Declaration
  Example: An Element with Attributes (1 of 2)
  Example: An Element with Attributes (2 of 2)
  2.1 Attribute Group Definitions
  Anonymous Types in Attribute Declarations
  2.1 Attribute Group Definitions
  Attribute Groups
  2.3 Model Group Definitions (1 of 2)
  2.3 Model Group Definitions (2 of 2)
  2.4 Notation Declarations (1 of 2)
  2.4 Notation Declarations (2 of 2)
  3.1 Annotations
  3.2 Model Groups (1 of 2)
  3.2 Model Groups (2 of 2)
  Example: Compositors (Model Groups)
  Model Groups and Compositors
  Example: Global Definitions and Declarations
  3.5 Attribute Uses
  Part III Associating a .xsd with a .xml
  ...but first...
  Namespaces, Schemas and Qualification
  Namespaces, Schemas and Qualification
  Putting a Schema in a Namespace
  XML Schemas and Namespaces
  Target Namespace and Schema Location
  Finding the Schema
  Best Practices (1 of 2)
  Best Practices (2 of 2)
  References
  Unit Summary

Unit 8. XPath - XML Path Language
  Unit Objectives
. .8-2 What Is XPath? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-3 Why Is It Called XPath? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-4 Example Tree Representation of XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-5 XPath Expression Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-6 XPath Current Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-7 XPath Step Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-8 XPath Address Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-9 Example: Absolute Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-10 Example: Absolute Addressing with Predicates . . . . . . . . . . . . . . . . . . . . . . . . . . .8-11 Relative Addressing using Studio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-12 Example: Relative Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-13 XPath - The Thirteen Axes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-15
viii Introduction to XML Copyright IBM Corp. 2001, 2004
Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

V3.1
Student Notebook

Abbreviated Step Notation ... 8-16
XPath - Partitioning the Document ... 8-17
Example: Addressing with Axes ... 8-18
XPath Axis Node Type and Node Tests ... 8-19
Sample Node Tests ... 8-20
XPath - Predicates (1 of 2) ... 8-21
XPath - Predicates (2 of 2) ... 8-22
Predicate Core Functions ... 8-23
Predicate String Functions (1 of 2) ... 8-24
Predicate String Functions (2 of 2) ... 8-25
Predicate Number and Boolean Functions ... 8-26
Reference Information ... 8-27
Checkpoint Questions (1 of 3) ... 8-28
Checkpoint Questions (2 of 3) ... 8-29
Checkpoint Questions (3 of 3) ... 8-30
Unit Summary ... 8-31

Unit 9. eXtensible Stylesheet Language: Transformations (XSLT) ... 9-1
Unit Objectives ... 9-2
Why Do We Need XSL Transformations? ... 9-3
Why Do We Need XSL? ... 9-4
XSL: Three Parts ... 9-5
XSLT Language Characteristics (1 of 2) ... 9-7
XSLT Language Characteristics (2 of 2) ... 9-9
XSLT Features ... 9-10
XSL Transformations (XSLT) Overview ... 9-11
The XSLT Process ... 9-13
Anatomy of a Stylesheet ... 9-15
Elements to Generate Output ... 9-16
<xsl:stylesheet> Element ... 9-17
XSL Optional Elements ... 9-18
<xsl:template> Element ... 9-19
<xsl:apply-templates> Element ... 9-20
Pattern Matching (XPath) Examples ... 9-21
Default of <xsl:apply-templates /> ... 9-23
<xsl:value-of> Element ... 9-24
Control Elements ... 9-25
XML Input As a Tree ... 9-26
Desired HTML Output ... 9-27
XML to HTML (1 of 5) ... 9-28
XML to HTML (2 of 5) ... 9-29
XML to HTML (3 of 5) ... 9-30
XML to HTML (4 of 5) ... 9-31
XML to HTML (5 of 5) ... 9-32
Calling <xsl:apply-templates/> ... 9-33
Named Templates ... 9-34
<xsl:for-each> Element ... 9-35
Time for a Lab ... 9-36
<xsl:if> Element ... 9-37
<xsl:choose> Element ... 9-38
<xsl:choose> Example ... 9-39
Elements to Generate Output (XML to XML) ... 9-40
<xsl:element> Element ... 9-42
<xsl:attribute> ... 9-43
XML to XML Example (1 of 2) ... 9-44
XML to XML Example (2 of 2) ... 9-45
Numbers, Sorting, and Functions ... 9-46
Working with Numbering in XSLT ... 9-47
<xsl:number> Element format Attribute Values ... 9-49
<xsl:number> Example ... 9-50
<xsl:sort> Element ... 9-51
<xsl:sort> Attributes ... 9-52
Sort Example ... 9-53
XPath/XSLT Functions ... 9-54
Other Elements ... 9-56
Attribute Value Templates ... 9-57
Attribute Value Templates Example ... 9-58
XSLT Processors ... 9-59
Xalan ... 9-60
XSL Resources from IBM ... 9-61
XSL References ... 9-62
Checkpoint Questions ... 9-63
Unit Summary ... 9-64

Appendix A. Introduction to Databases and XML ... A-1
Appendix B. Additional Information for XML Schema ... B-1
Appendix C. What's New in WebSphere Studio V5.1.1 ... C-1
Appendix D. Additional Information and Examples ... D-1
Appendix E. Bibliography and References ... E-1
Appendix F. Acronyms and Abbreviations ... F-1
Appendix G. Glossary ... G-1
Appendix H. Checkpoint Answers ... H-1

Trademarks
The reader should recognize that the following terms, which appear in the content of this training document, are official trademarks of IBM or other companies:

IBM is a registered trademark of International Business Machines Corporation.

The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both: AFS, AIX, alphaWorks, AS/400, CICS, ClearCase, Database 2, DB2, DB2 Universal Database, DFS, Distributed Relational Database Architecture, Domino, DRDA, Encina, Everyplace, IMS, Lotus, Lotus Enterprise Integrator, Lotus Notes, MQSeries, MVS, NetRexx, Network Station, Notes, Open Blueprint, OS/2, OS/390, RACF, RDN, RS/6000, S/390, SecureWay, Tivoli, Tivoli Enterprise, Tivoli Management Environment, TME, TME 10, TXSeries, VisualAge, VTAM, WebSphere.

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. SET and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC. Other company, product and service names may be trademarks or service marks of others.

Course Description
Introduction to XML and Related Technologies
Duration: 2.5 days

Purpose
This course provides an introduction to XML (eXtensible Markup Language) and related technologies. Students will gain the conceptual and practical knowledge required to work with XML. The course will build the basic skills to enable architects, designers, analysts, developers, testers, and administrators to use XML and its related technologies in the context of building e-business applications. The course is a 2.5-day classroom course with hands-on lab exercises that reinforce the lecture material.

Audience
This course is designed for information technology individuals, including enterprise application architects, designers, developers, and content modelers and creators.

Prerequisites
Knowledge of Internet technologies is required. Some experience with using HTML would be helpful, but is not necessary.

Objectives
After completing this course, you should be able to:
- Describe the important XML standards and recommend their use in business applications
- Define XML documents using namespaces, DTD, or Schema
- Develop and test XML processing applications
- Use XSLT to transform XML documents as necessary
- Identify open areas in XML, such as security, and emerging technologies such as DB support, XHTML, Web Services, XLink, and so forth. Plan for their incorporation into XML processing applications
- Identify where XML fits in application architectures

Agenda
Day 1
- Unit 1 - Introduction to XML and Related Technologies
- Unit 2 - Issues in Electronic Information Exchange
- Unit 3 - What Is XML?
- XML Basics Lab
- Unit 4 - WebSphere Studio Application Developer Overview
- Introduction to WebSphere Studio Application Developer Lab
- Unit 5 - Document Type Definition (DTD)
- DTD Lab
- Unit 6 - XML Namespaces
- XML Namespaces Lab

Day 2
- Unit 7 - XML Schema
- XML Schema Lab
- Unit 8 - XPath - XML Path Language
- XPath Lab
- Unit 9 - XSL - eXtensible Stylesheet Language Part 1
- XSLT Lab Part 1 - Simple Transforms

Day 3
- Unit 9 - XSL - eXtensible Stylesheet Language Part 2
- XSLT Lab Part 2 - Conditional Transforms
- Introduction to Databases and XML (Optional Unit)

V3.1.0.1

Unit 1. Introduction to XML and Related Technologies


What This Unit is About
This unit describes the audience, prerequisites, and overall objectives for XM301. The overall agenda for the course is also covered.

What You Should Be Able to Do


After completing this unit, you should be able to:
- Describe the target audience for XM301
- Explain the prerequisites for the course
- Describe the major objectives for the course
- Describe the agenda for this course offering

Introduction
XM301 Introduction to XML and Related Technologies
Instructor:

Please introduce yourself and provide your:
- Name and organization
- Job role
- Experience with markup languages
- Goals you hope to achieve

Copyright IBM Corporation 2004

Figure 1-1. Introduction

XM3014.1

Course Description
This course is designed to introduce students to the fundamentals of XML and its significant companion technologies: XML Schema, Namespaces, XPath, and XSL Transformations. Document Type Definitions (DTDs) are also introduced. The focus of the course is on the creation, specification, and processing of XML documents. The course is 2.5 days in length and provides extensive hands-on labs throughout. It is expected that additional after-class work (see notes below) will be required to adequately understand the material we will introduce.

Figure 1-2. Course Description

Audience
The course is targeted to Information Technology professionals involved in the exchange of information using XML as the data transport mechanism.

Figure 1-3. Audience

Prerequisites
Prerequisites: There are no specific prerequisites for this course. However, some familiarity with markup languages is recommended.

Figure 1-4. Prerequisites

Course Objectives (1 of 2)
After completing this course, you should be able to:
- Describe/differentiate the use of HTML and XML
- Enumerate the rules of a well-formed XML document
- Create and maintain XML documents
- Describe the purpose and use of Document Type Definitions (DTDs)
- Create DTDs describing the validation rules for specific XML instances*
- Describe the purpose and use of XML Schema
- Enumerate the benefits of XML Schema over DTDs
- Create XML Schemas describing the validation rules for specific XML instances*

*...using IBM WebSphere Studio Application Developer

Figure 1-5. Course Objectives (1 of 2)

Course Objectives (2 of 2)
After completing this course, you should be able to:
- Describe the purpose of XML Namespaces
- Declare and use XML Namespaces in an XML document*
- Describe the use of XPath in the context of XSLT and XML Schema
- Create XPath expressions that locate specific information in an XML instance*
- Describe the use of XSL in the processing of XML documents
- Create an XSL Transformation to transform an XML document into some other instance*

*...using IBM WebSphere Studio Application Developer

Figure 1-6. Course Objectives (2 of 2)

Agenda - Day 1
- Welcome and Introductions
- Issues in Information Exchange
- What is XML?
- Lab Exercise
- Overview of IBM WebSphere Studio Application Developer
- Lab Exercise
- Document Type Definitions
- Lab Exercise
- XML Namespaces
- Lab Exercise

Figure 1-7. Agenda - Day 1

Agenda - Day 2
- XML Schema
- Lab Exercise
- XPath
- Lab Exercise
- XSL Transformation - Part 1
- Lab Exercise

Figure 1-8. Agenda - Day 2

Agenda - Day 3
- XSL Transformation - Part 2
- Lab Exercise

Figure 1-9. Agenda - Day 3

Unit Summary
We've looked at the overall course objectives and a day-by-day agenda. Let's get started.

Figure 1-10. Unit Summary

Unit 2. Issues in Electronic Information Exchange


What This Unit is About
This unit examines the different ways in which information is exchanged in modern computer systems, identifying issues in each case. The discussion is restricted to what is exchanged (the content), not how it is exchanged (the mechanism). A set of messaging criteria is developed that, if met, will reduce the impact of the issues identified. This unit also shows some of the business drivers for XML and gives examples of how XML is being used by businesses today.

What You Should Be Able to Do


After completing this unit, you should be able to:
- Describe the types of information exchange that occur in modern computer systems
- Describe information exchange issues that exist in modern computer systems
- Describe what is needed to address many of the issues that exist in information exchange

How You Will Check Your Progress


Accountability:
- In-class discussion
- Checkpoint

Unit Objectives
After completing this unit, you should be able to:
- Describe the types of information exchange that occur in modern computer systems
- Describe information exchange issues that exist in modern computer systems
- Describe what is needed to address many of the issues that exist in information exchange

Figure 2-1. Unit Objectives

Electronic Information Exchange (1 of 2)


Electronic information exchange is a simple concept: electronically encoded information of one sort or another moves among software units during the execution of some domain (business) related function.

There are several contexts for information exchange:
- Intra-application: information movement among the parts of an application.
- Inter-application: information movement between applications in the same company system.
- Intercompany: information movement between companies.
- Inter-system: information movement between systems in the same company.

There are problem dimensions in each context that shape the way information is exchanged. Some of the problems are common, but each context also has unique issues to deal with.

Figure 2-2. Electronic Information Exchange (1 of 2)

Electronic Information Exchange (2 of 2)


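[Diagram: the four exchange contexts (Intra-Application, Inter-Application, Inter-System, Intercompany) illustrated for Company 1, which contains System 1 (Sales), System 2 (Accounting), Application (Ordering), and Application (CRM), alongside partner companies Company 2, Company 3, ... Company n.]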

Figure 2-3. Electronic Information Exchange (2 of 2)

Intra-Application Information Exchange


In a well-structured application, information flows between three different layers:
- The presentation layer (often called the View) presents information to the user and collects information from the user. This layer is often coupled to a particular presentation technology, for example, Presentation Manager, X-Windows, and so forth. Therefore, it often must change significantly when the presentation mode changes.
- The processing layer (often called the Controller) operates on the information in accordance with the functional requirements of the application.
- The business layer (often called the Model or Business Model) maintains the operational constraints that govern the business as a whole. It ensures that no individual application contradicts those rules by performing an operation that is inconsistent with those constraints.

[Diagram: Presentation Layer (View), Process Layer (Controller), Business Layer (Model)]

In this context, the biggest information exchange issues occur with the View.

Figure 2-4. Intra-Application Information Exchange

Agile Views - Multiple Client/Device Support


Prior to the arrival of the World Wide Web, applications were largely presented via workstations or dumb terminals, and required (relatively) infrequent modification of their presentation layer. The World Wide Web has changed this. Now, the addition of the mobile work force and the use of handheld devices present new opportunities for business and new challenges for application developers.

Applications must be presented via:
- Cell phones and handhelds (Wireless Markup Language, WML)
- Web browsers (HTML, style sheets, JavaScript)
- PDF
- And so forth

Many Web applications suffer from coupling issues: they habitually generate output that combines presentation information (font, color, and so forth) with business information (bank balance, product information, and so forth), making it difficult to reuse the data stream.

Ideally, the presentation layer would emit/consume a generic, structured information stream that can be filtered for the target device:
- An external rendering engine worries about how it looks, while the application worries about what should be viewed.
- This enables speedy, low-cost support for new client devices.

Need a View-independent, structured information stream
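
As a rough illustration of a view-independent stream, the sketch below keeps the business information in one small XML document and lets two separate renderers decide how it looks. The element names (`account`, `owner`, `balance`), the sample values, and the two renderer functions are hypothetical and chosen only for this example; they are not taken from the course material.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured information stream: business data only,
# with no fonts, colors, or layout mixed in.
ACCOUNT_XML = """<account>
  <owner>Pat Example</owner>
  <balance currency="USD">1250.75</balance>
</account>"""


def render_html(doc: ET.Element) -> str:
    """Render the stream for a Web browser."""
    owner = doc.findtext("owner")
    balance = doc.find("balance")
    return f"<p><b>{owner}</b>: {balance.text} {balance.get('currency')}</p>"


def render_text(doc: ET.Element) -> str:
    """Render the same stream for a plain-text device."""
    owner = doc.findtext("owner")
    balance = doc.find("balance")
    return f"{owner}: {balance.text} {balance.get('currency')}"


doc = ET.fromstring(ACCOUNT_XML)
print(render_html(doc))
print(render_text(doc))
```

Supporting another device under this arrangement would mean adding another renderer, while the code that produces the account data stays unchanged.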


Figure 2-5. Agile Views - Multiple Client/Device Support

Inter-Application Information Exchange


Ideally, the design of a system takes into account all the operations that it will perform and the applications that will perform them. It is rare that enough information exists to perform such an analysis and rarer still that the design remains stable as the applications that compose the system are constructed (typically at disparate points in time).

Technology does not stand still; it is common to see applications built late in the life of a system using technology that is completely different from that used by the initial ones, for example, COBOL versus Java.

Experience has shown that it is best to focus on the application at hand and allow the plans for a system to evolve as further applications are built based on new knowledge of the problem and new technologies.

The way that applications communicate should not make assumptions about implementation technology or how information will be used

Figure 2-6. Inter-Application Information Exchange

Context-free Communication
As much as possible, eliminate assumptions from the way in which information is exchanged. This means that the information that flows between applications should not be coupled to a particular technology or to an assumption about how it will be used.
- When possible, send an application domain entity (for example, a Purchase Order) rather than the individual pieces (for example, a total, an item description, and so forth).
- Don't use a message that is bound to an implementation technology, for example, a Serialized Java Object (a Java-specific bit stream).
- Ideally, the communication medium would be based on simple, ubiquitous technology, for example, straight text.
- The message should be structured and self-describing to eliminate the need for context awareness in the receiver.

Requires a structured information (text) format that supports the expression of semantics
Copyright IBM Corporation 2004

Figure 2-7. Context-free Communication
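As a rough illustration of context-free communication, the sketch below sends a whole Purchase Order as structured, self-describing text instead of a technology-specific byte stream. The element names (purchaseOrder, item, total) and both helper functions are hypothetical, invented for this example; Python's standard xml.etree.ElementTree module simply stands in for whatever technology the sender and receiver really use.

```python
import xml.etree.ElementTree as ET

def build_purchase_order(order_id, items):
    """Sender side: build a purchase order as plain, self-describing text."""
    order = ET.Element("purchaseOrder", {"id": order_id})
    total = 0.0
    for description, price in items:
        item = ET.SubElement(order, "item")
        ET.SubElement(item, "description").text = description
        ET.SubElement(item, "price").text = f"{price:.2f}"
        total += price
    ET.SubElement(order, "total").text = f"{total:.2f}"
    return ET.tostring(order, encoding="unicode")

def read_total(message):
    """Receiver side: needs only the text and the agreed element names."""
    return float(ET.fromstring(message).findtext("total"))

message = build_purchase_order("PO-1", [("Widget", 2.50), ("Gadget", 4.00)])
print(message)
print(read_total(message))
```

The receiver never sees the sender's objects or classes, only text whose structure names its own parts, so either side could be rewritten in another technology without changing the message.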


B2B Intercompany Information Exchange


In this case, the presentation is focused on the Business to Business (B2B) relationships that exist in e-business. In such cases, the systems involved often talk to multiple business partners, sometimes for the same service (for example, Credit Transaction Validation), where selection is based on price, availability, and so forth.

[Diagram: two scenarios showing company C1 and business partners C2, C3, ... Cn; the second scenario adds an intermediary M.]

Scenario 1: Communicate directly with business partners; potential for 'n' communication protocols.

Scenario 2: Communicate with business partners through an intermediate 'Marketplace' vendor; forced to evolve at the rate of the intermediary.

Figure 2-8. B2B Intercompany Information Exchange

Need to Establish Common Ground for Communication


Technology issues aside, it is clear that successful, unfettered B2B information exchange depends greatly on the creation of implementation-independent, vendor-neutral languages in which to conduct business. Markup languages have existed as a means to embed semantics in electronic documents (for example, SGML). SGML was created as a language for describing documents. B2B communication may benefit from a similar solution, that is, use a markup language to describe information. Such a language could be used to describe documents that whole industries agree on as a means to exchange the information they need to conduct business.
Requires an implementation-independent, vendor-neutral markup language for describing information; enabling the creation of domain-specific business languages.

Figure 2-9. Need to Establish Common Ground for Communication


Inter-system Information Exchange


The exchange of information between systems is subject to most of the problems discussed so far except, perhaps, view coupling. Typically, this sort of communication does not involve a presentation layer. When laying out the infrastructure in which systems will reside, it is wise to establish a means of insulating systems from one another with a layer that is devoid of implementation and process coupling ... let's call this the Interface Layer (it's also known as an Abstraction of the System). The role of the interface layer is to capture the semantics of a system as seen from an external point of view, and to represent it as a dialog, with messages providing the units of communication in the dialog. As long as the definition of the system doesn't change, the dialog (the interface to the system) should remain stable. The implementation may change significantly.

System1

System2

Need a way to exchange messages in an implementation-independent way


Figure 2-10. Inter-system Information Exchange

Exchanging Messages
Exchanging messages between systems has a lot in common with exchanging messages between B2B business partners. The exception is that although inter-system information exchange requires an established protocol (the interface), the system does not necessarily benefit from that protocol being an accepted standard for B2B communication.
There are other differences, for example, the likely use of Message Oriented Middleware (MOM) in system integration, but this presentation is focused on the information being exchanged, not on the exchange mechanism.

So, in common with B2B communication:

Requires an implementation-independent, vendor-neutral markup language for describing information.

Figure 2-11. Exchanging Messages

The Semantic Web


An extension of the WWW. The Web becomes an active (rather than passive) information space. Separation of content from presentation is necessary. That is, Model-View separation. HTML doesn't have this. Look at the browser compatibility problem as evidence for the need for this. In order for the Web to reason, it must be possible to identify the units that are going to be reasoned about. HTML doesn't help with this. Need self-describing data.
Requires self-describing information decoupled from View details

Figure 2-12. The Semantic Web

A Common Solution?
Collecting all the observations together, an information solution addressing each of these issues would be:
a. A view-independent, structured information stream.
b. A structured information (text) format that supports the expression of semantics.
c. An implementation-independent, vendor-neutral markup language for describing information, enabling the creation of domain-specific business languages.
d. Self-describing, decoupled from view details.
In short: "A text-based, vendor-neutral markup language that supports the expression of semantics." Such a language would address many information exchange issues if it gained significant industry acceptance.

Figure 2-13. A Common Solution?

Checkpoint Questions (1 of 2)
1. Which of the following will reduce the coupling related to Electronic Information Exchange? (Select all that apply)
   a. Create messages that are context-free.
   b. Use system interfaces to hide implementation details.
   c. Combine view information and data in each message.
   d. Use messages that are vendor-neutral and implementation-independent.
   e. All of the above.
2. Text-based messages are preferred because: (Select all that apply)
   a. They are implementation-neutral.
   b. All software technologies can read/write them.
   c. It's easier to debug messaging problems.
   d. They can be spell checked.
   e. All of the above.

Figure 2-14. Checkpoint Questions (1 of 2)

Checkpoint Questions (2 of 2)
3. In general, the properties a message should exhibit are: (Select all that apply)
   a. Self-describing
   b. Predictable structure
   c. Conformance to one, industry-wide standard
   d. All of the above

Figure 2-15. Checkpoint Questions (2 of 2)

Unit Summary
Having completed this unit, you should be able to:
  Describe the types of information exchange that occur in modern computer systems
  Describe the information exchange issues that exist in modern computer systems
  Describe what is needed to address many of the issues that exist in information exchange

Figure 2-16. Unit Summary

Unit 3. What Is XML?


What This Unit is About
In this unit, the basic elements of XML are explained.

What You Should Be Able to Do


After completing this unit, you should be able to:
  Describe the basic rules of XML
  Identify what makes XML well-formed
  List the components that make up an XML document
  Differentiate between XML and HTML
  Describe the internationalization support in XML
  Define some best practices for XML

How You Will Check Your Progress


Accountability:
  Checkpoint
  Machine exercises


Unit Objectives
After completing this unit, you should be able to:
  Describe the basic rules of XML
  Describe what it means for an XML document to be well-formed
  List the components that make up an XML document
  Differentiate between XML and HTML
  Describe the internationalization support in XML
  Define some best practices for XML

Figure 3-1. Unit Objectives

Notes:
Although XML is stable and mature, the supporting technologies are evolving rapidly. Keep up with the changes at: http://www.w3.org/TR.


What Is XML?
At its core XML is text formatted to follow a well-defined set of rules. XML documents consist primarily of tags and text. If you've ever seen the source to an HTML document, then the XML structure should look familiar.
This text may be stored/represented in:
  A normal file stored on disk
  A message being sent over HTTP
  A character string in a programming language
  A CLOB (character large object) in a database
  Any other way textual data can be used
XML documents do not need to exist as documents; they may be:
  Byte streams sent between applications
  Fields in a database record
  Collections of XML Infoset information items
For simplicity they will be referred to as though they are documents and files.

Figure 3-2. What Is XML?

Notes:
Usually people will talk about this 'XML' and that 'XML' or this 'XML file'; what they are really referring to is XML markup text encapsulating specific data. As long as the XML text or definitions follow the set of syntax rules, any data can be represented.


Example Tree Representation of XML


XML documents should be thought of as a hierarchical tree structure.

<?xml version="1.0"?>
<book>
  <author> Tom Wolfe </author>
  <title> The Right Stuff </title>
  <price> $6.00 </price>
</book>

ROOT
  <book>
    <author>  "Tom Wolfe"
    <title>   "The Right Stuff"
    <price>   "$6.00"

Figure 3-3. Example Tree Representation of XML

Notes:
This example shows a typical XML document and how it is represented as a tree of nodes. This conceptual depiction of XML is important to understand. book is the root element but ROOT is the highest point in the tree or hierarchy: think of ROOT as the location of a pointer used to keep track of where you are.
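To make the tree view concrete, here is a minimal sketch that parses the book document and prints one line per node, indented by depth. It assumes Python's standard xml.etree.ElementTree module as the parser; the outline helper is a hypothetical name used only here.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<?xml version="1.0"?>
<book>
  <author> Tom Wolfe </author>
  <title> The Right Stuff </title>
  <price> $6.00 </price>
</book>"""

def outline(element, depth=0):
    """Return one line per node, indented by its depth in the tree."""
    text = (element.text or "").strip()
    line = "  " * depth + f"<{element.tag}>" + (f' "{text}"' if text else "")
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(BOOK_XML)   # root is the <book> element
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Note that this parser hands back the root element (book) directly; the ROOT position above it in the figure is conceptual.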


A Simple XML Document - Basic Structure


<?xml version="1.0"?>
<book>
  <title> Alphabet from A to Z </title>
  <isbn number="1112-23-4356" />
  <author>
    <firstName>Boreng</firstName>
    <lastName>Riter</lastName>
  </author>
  <chapter title="Letter A"> The letter A is the first in the alphabet. It is also the first of five vowels. </chapter>
  <!-- The rest of the letter chapters are missing -->
  <chapter title="Letter Z"> The letter Z is the last letter in the alphabet. </chapter>
</book>

Callouts, in document order:
  <?xml version="1.0"?>: "Optional" first line; only required if encoding IS NOT UTF-8 or UTF-16*
  <book>: Root element start tag
  <title>: First child element with data
  <isbn number="1112-23-4356" />: Empty element (no data)
  <author>: Begin element tag
  <firstName>, <lastName>: Nested child elements
  </author>: End element tag
  <chapter title="Letter A">: Element containing an attribute and parsed character data (PCDATA) [TBD]
  <!-- ... -->: Comment
  <chapter title="Letter Z">: Last element in document
  </book>: Root element end tag

Figure 3-4. A Simple XML Document - Basic Structure

Notes:
Textual data between tags is also referred to as content. Tagged elements of any sort are also known as markup. Sometimes the term body is used to refer to anything between a start tag and an end tag.


A Simple XML Document Basic Nomenclature


The XML instance on the previous page consists of:
  One main element, book
  Subelements title, isbn, author, and chapter, plus a comment
  author contains other subelements firstName and lastName
  isbn and chapter contain attributes number and title, respectively
title, firstName, and lastName contain only strings:
  Elements that contain numbers, strings, dates, and so forth (TBD) but no subelements (or attributes) are said to have simple types
isbn and chapter carry attributes; author has subelements:
  Elements that contain subelements or carry attributes are said to have complex types
Attributes always have simple types (that is, they are numbers, strings, dates, and so forth). TBD -- In a later chapter we describe XML Schemas, which have access to a collection of built-in simple types.

Figure 3-5. A Simple XML Document - Basic Nomenclature

Notes:
These definitions will be important when we discuss the XML Schema definition language in a later chapter. We introduce these terms here in preparation for their use then.


Basics of Well-formed XML (1 of 2)


XML documents are considered to be well-formed when they adhere to a set of five rules that define basic XML syntax and structure, plus a sixth for worldwide conformity.
1. There must be a single root element:
   All other elements are nested inside the root element.
2. Elements must be properly terminated:
   For every opening tag "<...>" there must be a matching closing tag "</...>".
   The exception is an empty (no content or body) tag "<.../>".
3. Elements must be properly nested underneath a parent tag (except for the single, root element):
   A nested tag-pair may not overlap another tag.
   There is no limit to the nesting level of children elements.

Figure 3-6. Basics of Well-formed XML (1 of 2)

Notes:
As you can see, creating an XML instance will be a rather straightforward task.


Basics of Well-formed XML (2 of 2)


4. Tag names are case sensitive:
   All tag and attribute names must comply with the XML naming rules; attribute values and data must comply with the XML rules for content.
5. Attributes, extra information that can be provided for elements, must be properly quoted:
   That is, all attribute values must be in quotes.
6. The first line should contain the XML declaration, which identifies the version of the XML specification to apply (the declaration is optional in XML 1.0 and required in XML 1.1):
   XML 1.0 is currently the most common.

Figure 3-7. Basics of Well-formed XML (2 of 2)

Notes:
Version 1.1 is about to emerge. Many of the current XML instances lack this declaration. It is often useful to identify the processing instructions, of which the XML declaration is but one, as the prolog; the actual XML instance material, that between the root element open and closing tags, may then be referred to as the XML document.
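One informal way to try the well-formedness rules is to hand small samples to a parser and see which ones it rejects. The sketch below assumes Python's standard xml.etree.ElementTree module; the is_well_formed helper and the sample labels are hypothetical names chosen for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "single root":        "<colors><color>red</color></colors>",
    "two roots":          "<color>red</color><color>green</color>",  # breaks rule 1
    "missing end tag":    "<colors><color>red</colors>",             # breaks rule 2
    "overlapping tags":   "<a><b></a></b>",                          # breaks rule 3
    "case mismatch":      "<color>red</COLOR>",                      # breaks rule 4
    "unquoted attribute": "<book isbn=34323></book>",                # breaks rule 5
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```

Only the first sample should be accepted; each of the others violates exactly one of rules 1 through 5.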


Element Rules - Rule 1. Single Root Element


All XML documents must have a single root element.

Legal:
<?xml version="1.0"?>
<colors>
  <color>red</color>
  <color>green</color>
</colors>
colors is the root element for this XML.

Not legal:
<?xml version="1.0"?>
<color>red</color>
<color>green</color>
The two color elements represent multiple root elements.

Figure 3-8. Element Rules - Rule 1. Single Root Element

Notes:
XML is a markup language. Tags form the basis of all markup languages. The purpose of an element tag is to identify the data and child elements held within it. The root element should have a name that provides a good definition of all the data contained in the document. The first physical line in this sample is there because of Rule 6, which we shall cover later.


Element Rules - Rule 2. Element Tag Rules


Elements consist of start and end tags. The end tag is identified by the /.
  Example: <color>red</color>
Elements may contain attributes within the start tag.
  Example: <book isbn="34323"></book>
  Note: The attribute is isbn.
Empty elements contain no child elements or data. These elements can be represented with a special shorthand notation.
  Example: <record key="123"></record>
  Can be shortened to: <record key="123" /> (preferred)
  Or, if the element also carries no attributes, as: <record />

Figure 3-9. Element Rules - Rule 2. Element Tag Rules

Notes:
The empty element notation (< ... />) originated with XML. SGML, which is an ISO standard (ISO 8879) rather than a W3C recommendation, was amended to accommodate this syntax. Empty elements are practical and common when the only associated information is enclosed within the element's attributes. For empty-element tags, a space before the tag's terminator (" />") is a common convention but is not required by XML.
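The equivalence of the long form and the shorthand can be checked with a parser. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module; describe is a hypothetical helper that summarizes what the parser saw.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring('<record key="123"></record>')
short_form = ET.fromstring('<record key="123" />')

def describe(element):
    """Summarize a parsed element: tag, attributes, text, number of children."""
    return (element.tag, dict(element.attrib), element.text, len(element))

print(describe(long_form))
print(describe(short_form))
```

Both forms produce the same element: same tag, same attribute, no text, and no children.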


Element Rules - Rule 3. Element Nesting


Elements must be properly nested. The end tags of inner elements must occur before the end tags of outer elements. Any number of child elements or data may be nested within the start and end tags of an element.

Figure 3-10. Element Rules - Rule 3. Element Nesting

Notes:
There is no limit to the depth of children in XML, but an overly large number may indicate a poor design. If an XML document does not have an associated DTD or Schema, then all whitespace is retained since a processor does not know if it is considered textual data or just for aesthetics. DTDs and Schemas are covered in later sections.
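The note on whitespace can be observed directly: without a DTD or Schema, the parser hands the indentation back as character data. The sketch below assumes Python's standard xml.etree.ElementTree module, where text before the first child is exposed as .text and text after a child as that child's .tail.

```python
import xml.etree.ElementTree as ET

indented = ET.fromstring("<shirt>\n  <color>red</color>\n</shirt>")
compact = ET.fromstring("<shirt><color>red</color></shirt>")

whitespace_before_child = indented.text      # newline plus two spaces of indentation
whitespace_after_child = indented[0].tail    # newline before </shirt>

print(repr(whitespace_before_child), repr(whitespace_after_child))
print(repr(compact.text))                    # no character data at all
```

An application that treats the two documents as equal has to discard such whitespace itself.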


Element Nesting Example


Legal:
<?xml version="1.0"?>
<shirt>
  <style>Polo</style>
  <color>red</color>
  <size>large</size>
</shirt>
All elements are properly nested.

Not legal:
<?xml version="1.0"?>
<shirt>
  <style> <size>large <color>red Polo </style> </size></color>
</shirt>
The element tags are mixed up and not ordered.

Best Practice: Use indentation to represent the document's hierarchy. Important if your document will likely be read by humans. Computers and programs don't usually care.

Figure 3-11. Element Nesting Example

Notes:
Indentation and other whitespace is only for human readability, but adds "fat" to a document's size and processing requirements. This is only an issue with huge XML documents. It is important to realize that an XML instance is treated by its processor/parser as one continuous stream of characters, some of which are recognized by the parser as "special." As a consequence, when the parser reports an error, its location is where the parser gave up, which may be far beyond where the actual error occurred.
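The point about error locations can be seen in a small experiment. In the sketch below, which assumes Python's standard xml.etree.ElementTree module, the mistake is a missing </style> end tag on line 3, yet the parser only notices a problem when it reaches the </shirt> end tag several lines later. The document and variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

BAD_XML = """<?xml version="1.0"?>
<shirt>
  <style>Polo
  <color>red</color>
  <size>large</size>
</shirt>"""

error_line = None
try:
    ET.fromstring(BAD_XML)
except ET.ParseError as err:
    # position is a (line, column) pair describing where the parser gave up
    error_line, error_column = err.position
    print(f"Parser gave up at line {error_line}, column {error_column}: {err}")
```

When a reported location looks correct, it is worth reading backward from it for an unclosed or mismatched tag.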


Element Rules - Rule 4. XML Naming Rules


XML name construction:
  The first character must be A-Z, a-z, or _ (underscore).
  Any number of subsequent letters, numbers, hyphens, periods, colons, and underscore characters may follow.
XML names are case sensitive.
Names cannot contain spaces.
Names must not have a prefix of xml in any case combination (such names are reserved).
Best Practice: Brevity in tag names is not necessary. Use descriptive names for elements and attributes. <Queue> or <que> is far better than <q>.
Best Practice: Maintain standard naming conventions and quoting. Camelback, dot, and underscore notation are all common (for example, camelBackNotation, dot.notation, and underscore_notation).

Figure 3-12. Element Rules - Rule 4. XML Naming Rules

Notes:
Element names may not begin with the letters "xml" in any case combination; such names are reserved by the W3C. Keywords of the XML declaration syntax, such as DOCTYPE, ELEMENT, ATTLIST, and ENTITY, are not forbidden as element names by the XML specification, but they are best avoided because they invite confusion.

Colons (":"), while technically legal in tag names, should not be used as they are reserved for use with Namespaces.
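The naming rules on this chart can be approximated with a regular expression. The sketch below is deliberately simplified and ASCII-only: the XML specification also permits many non-ASCII letters, which this pattern ignores. NAME_PATTERN and is_legal_name are hypothetical names introduced for the illustration.

```python
import re

# ASCII-only approximation of the chart's rule: a letter or underscore first,
# then any number of letters, digits, hyphens, periods, colons, or underscores.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.:\-]*")

def is_legal_name(name):
    """Check a name against the simplified naming rules from the chart."""
    if not NAME_PATTERN.fullmatch(name):
        return False
    # Names beginning with "xml" in any case combination are reserved.
    return not name.lower().startswith("xml")

for candidate in ["title", "book.isdn", "_street", "nameXML",
                  "1name", "-street", "f name", "xmlName"]:
    print(f"{candidate!r}: {is_legal_name(candidate)}")
```

The first four candidates pass and the last four fail, each for one of the reasons listed on the chart.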


Rule 4. Tag Naming - Samples


Legal: title, book.isdn, lastName, _street, addrLine1, name:first
Not Legal: 1name, -street, &name
Comments: Examples of legal and illegal element names.

Legal: <color> red </color> <SIZE> small </SIZE>
Not Legal: <color> red </COLOR> <SIZE> small <SiZe>
Comments: Element names are case sensitive and start and end tags must match.

Legal: <fname> John </fname>
Not Legal: <f name> John </f name>
Comments: Element names must not contain spaces.

Legal: <nameXML> John </nameXML>
Not Legal: <xmlName> John </xmlName>
Comments: Elements must not contain any W3C reserved words.

Figure 3-13. Rule 4... Tag Naming - Samples


Rule 4. Element Content (1 of 2): General


An XML instance is composed of elements expressed in tag pairs (except for empty tags), plus optional attributes that always have quoted values, and optional data that appears between the element start tag and the element end tag.
Mixed content: element content that contains data (PCDATA is shown) and other elements.
Example (snippet):
<title><ref>XML</ref> Example</title>
<chapter>
  Chapter information
  <para>What is XML</para>
  <para>What is HTML</para>
  More chapter information
</chapter>

Figure 3-14. Rule 4... Element Content (1 of 2): General

Notes:
PCDATA is parsed character data. A "snippet" is a piece of a larger, legitimate XML file.
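Mixed content shows up in a parser's object model as text interleaved with child elements. The sketch below assumes Python's standard xml.etree.ElementTree module, which keeps text before the first child in .text and text that follows a child in that child's .tail; other APIs (DOM, for instance) model the same thing as separate text nodes.

```python
import xml.etree.ElementTree as ET

chapter = ET.fromstring(
    "<chapter>Chapter information"
    "<para>What is XML</para>"
    "<para>What is HTML</para>"
    "More chapter information</chapter>"
)

leading_text = chapter.text                      # data before the first <para>
paragraphs = [para.text for para in chapter.findall("para")]
trailing_text = chapter[-1].tail                 # data after the last <para>
all_text = "".join(chapter.itertext())           # every piece, in document order

print(leading_text, paragraphs, trailing_text, sep="\n")
```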


Rule 4. Element Content (2 of 2): Data


Element data content is handled in one of two ways:
1. Parsed Character Data (PCDATA): is examined by the XML parser to discover XML content embedded within it.
2. Character Data (CDATA): is delimited by the special syntax <![CDATA[ ... ]]> and is not processed by the parser.

Figure 3-15. Rule 4... Element Content (2 of 2): Data

Rule 4. PCDATA - Parsed Character Data


Predefined entities exist to address ambiguous syntax situations, situations where the literal would be interpreted as part of the XML document syntax rather than its content.

Entity     Description       Character
&lt;       "less than"       <
&gt;       "greater than"    >
&amp;      "ampersand"       &
&apos;     "apostrophe"      '
&quot;     "quote"           "

Examples:
<range>&gt; 6 &amp; &lt; 20</range>
<quotes characters="'&quot;'"/>

Figure 3-16. Rule 4... PCDATA - Parsed Character Data

Notes:
XML differentiates between markup characters and text characters by providing special XML escape characters to be used in XML PCDATA. Only regular parsed character data is allowed inside an attribute's value. The special characters "<" and "&" must always be represented as escape characters. The others may appear non-escaped in some places in XML, but it is best to just use the escape characters all the time. These escape characters are independent of the encoding chosen.
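In practice the escaping is usually done by a library rather than by hand. The sketch below assumes Python's standard library: xml.sax.saxutils.escape replaces &, < and > with their entities, and the parser reverses the substitution when the document is read back. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = "> 6 & < 20"
escaped = escape(raw_text)                  # "&" is replaced first, then "<" and ">"
document = f"<range>{escaped}</range>"
round_trip = ET.fromstring(document).text   # parser turns entities back into characters

# Entities work the same way inside attribute values.
parsed = ET.fromstring("<quotes characters=\"'&quot;'\"/>")
attribute_value = parsed.get("characters")

print(escaped)
print(round_trip)
print(attribute_value)
```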


Rule 4. CDATA - Character Data


Syntax: <![CDATA[ ...Anything can go here... ]]>
Note: Anything except the literal string "]]>". To include "]]>", close the CDATA section first and write "]]&gt;" as parsed character data.
CDATA is not parsed and is treated as-is.
Useful for embedding other languages within the XML:
  HTML documents.
  XML documents.
  JavaScript source.
  Or any other text with a lot of special characters.
Generally speaking, the escaping rules inside a CDATA section are those of the embedded language. For example, when the consumer of embedded JavaScript requires an ampersand to be escaped, use &#38;.

Figure 3-17. Rule 4... CDATA - Character Data

Notes:
The 5 XML escape characters will not be interpreted (that is, changed to the non-escaped character) in CDATA sections, so they should not be used. If you put &lt; in the CDATA, you will see &lt; in the output, not "<". So use the actual characters. Encoding refers to the character set for the entire document, so it does apply to CDATA as well. CDATA sections cannot be nested. CDATA will retain spaces. While XML escape characters are not to be used in CDATA, you must be aware of how the 'down-line' applications of the XML will use the CDATA. Common usage: JavaScript in the XML and specialized HTML. A browser may have problems with some special characters, which must then be represented as numeric character references.
  example: micro sign (µ) = &#181;
  example: non-breaking space = &#160;
  example: ampersand (&) = &#38;
Link to special HTML characters: http://www.owlnet.rice.edu/~jwmitch/iso8859-1.html
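The behavior described in these notes can be tried out with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module. It shows that markup-like characters survive untouched inside CDATA, that entity references there are not interpreted, and one common way to carry a literal "]]>" by splitting it across two CDATA sections. The to_cdata helper is a hypothetical name for this illustration.

```python
import xml.etree.ElementTree as ET

# "<" and "&" need no escaping inside a CDATA section.
doc = ET.fromstring("<script><![CDATA[if (a < b && a < 0) { return 1 }]]></script>")
script_text = doc.text

# Entity references inside CDATA are left exactly as typed.
literal = ET.fromstring("<note><![CDATA[&lt; stays as typed]]></note>").text

def to_cdata(text):
    """Wrap text in CDATA, splitting any "]]>" across two adjacent sections."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

tricky = "x = a[b[0]]>1"                     # contains the forbidden sequence
restored = ET.fromstring(f"<code>{to_cdata(tricky)}</code>").text

print(script_text)
print(literal)
print(restored)
```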


Rule 4. CDATA Examples


These script elements contain JavaScript:
<script><![CDATA[
function matchwo(a,b) {
  if (a < b && a < 0) {
    return 1
  } else {
    return 0
  }
}
]]></script>
<script><![CDATA[
function matchwo(a,b) {
  if (a < b &#38;&#38; a < 0) {
    return 1
  } else {
    return 0
  }
}
]]></script>

This nameXML element stores actual XML to be treated as text:
<nameXML>
<![CDATA[
<name common="freddy" breed="springer-spaniel">
Sir Frederick of Ledyard's End
</name>
]]>
</nameXML>

Figure 3-18. Rule 4... CDATA Examples

Notes:
Both 'script' element examples are valid. Which one you would use would depend on the behavior of the application/browser which will use the transformed XML and therefore the CDATA. This topic is important to XSLT processing.


Element Rules - Rule 5. Element Attributes


Attributes are used to attach information to elements.
Attributes consist of a name="value" pair, where the name is a legal XML name. This is often referred to as a "key-value" pair.
Attributes are placed in the start tag of the element to which they apply.
An element may have several attributes, each uniquely named.
Examples:
<title type="section" number="1">XML overview</title>
<title type="boat" state="FL">Yacht</title>

Notice the different usage of the attribute "type" in the two elements; semantically they are not the same.
Attributes must have a value.
Values must be quoted with either double or single quotes. Convention is to stick with one or the other.

Figure 3-19. Element Rules - Rule 5. Element Attributes

Notes:
Attribute naming follows the same rules as element naming. An element may contain zero or more attributes within its start tag. Attributes provide extra information to the meaning of the element. This may include "key" information or other identifying details. Name collisions are common in XML, as shown by the attributes in the examples. Using Namespaces resolves these sorts of issues. You cannot use the same style of quote inside the value of the attribute as the one that delimits it; for example, style="monty's" is valid, but style='monty's' is invalid.
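The attribute rules can be exercised with a parser as well. The sketch below assumes Python's standard xml.etree.ElementTree module, which exposes attributes as a dictionary of name/value strings. The shop element and the parses helper are invented for the example.

```python
import xml.etree.ElementTree as ET

title = ET.fromstring('<title type="section" number="1">XML overview</title>')
attributes = dict(title.attrib)

# Single and double quotes are interchangeable delimiters.
single = ET.fromstring("<shop style='plain'/>").get("style")
nested = ET.fromstring('<shop style="monty\'s"/>').get("style")

def parses(text):
    """Return True if the parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

duplicate_ok = parses('<title type="a" type="b"/>')   # same name used twice
bad_quote_ok = parses("<shop style='monty's'/>")      # delimiter inside the value

print(attributes, single, nested, duplicate_ok, bad_quote_ok)
```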


Element Rules - Rule 6. XML Declaration (1 of 2)


The XML Declaration is an optional first line in all XML documents:
<?xml version="1.0" ?> <?xml version="1.0" encoding="UTF-8" ?> <?xml version="1.0" standalone="yes"?>

If this declaration is used, the version attribute is mandatory. The encoding attribute indicates the character encoding used in the document; if UTF-8 or UTF-16 is used it may be omitted. ASCII is a subset of UTF-8 and need not be declared. Comments are not allowed before this statement. The XML Declaration follows the syntax of a Processing Instruction or PI, which is described on a subsequent chart, but it is considered to be unique and is treated separately in the 1.0 XML specification. GENERAL NOTE OF CAUTION: You can not always rely on a browser or tool to completely/correctly enforce the specifications. Nor are the specifications always written in language that, to a particular reader, is unambiguous. Still, the best advice is when in doubt, refer to the specification, which for XML is www.w3.org/XML.
Copyright IBM Corporation 2004

Figure 3-20. Element Rules - Rule 6. XML Declaration (1 of 2)

XM3014.1

Notes:
All XML documents should begin with this declaration, and when it is present it MUST be at the first position of the file (that is, no blank lines or comments or spaces before it). The version of XML used for all documents in this course is "1.0"; it must appear within the "<?xml" declaration if that declaration is used, and "xml" must be written in lowercase. It indicates the version of XML to which the Document Entity must conform. "standalone" is included here for completeness: if it is omitted, the appropriate default is assumed, and most users do not include it. We will have more to say on this in our discussions of the grammars we can apply to XML instances. "yes" means the document that follows can stand alone; that is, without requiring a grammar document to complete its information.
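The placement and version rules above can be observed with a parser. This is a small sketch using Python's standard xml.etree.ElementTree module; the helper name parse_error and the sample note element are assumptions made for illustration, and the exact error wording depends on the underlying parser.

```python
import xml.etree.ElementTree as ET


def parse_error(document: bytes):
    """Return None if the document parses, otherwise the parser's error text."""
    try:
        ET.fromstring(document)
        return None
    except ET.ParseError as exc:
        return str(exc)


# Declaration at the very start of the document.
at_start = b'<?xml version="1.0" encoding="UTF-8"?><note>ok</note>'
# No declaration at all (it is optional).
no_declaration = b'<note>ok</note>'
# A single space before the declaration.
leading_space = b' <?xml version="1.0"?><note>ok</note>'
# Declaration present but the mandatory version is omitted.
missing_version = b'<?xml encoding="UTF-8"?><note>ok</note>'

for label, doc in [("at start", at_start),
                   ("no declaration", no_declaration),
                   ("leading space", leading_space),
                   ("missing version", missing_version)]:
    print(label, "->", parse_error(doc) or "parsed")
```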


Element Rules - Rule 6. XML Declaration (2 of 2)


The standalone attribute is included here for completeness: it is used to indicate whether this XML document depends on information declared externally to this document (in an external DTD, for example); the value may be yes or no.
A value of "yes" indicates there are no external markup declarations; if there are no external markup declarations, the declaration has no meaning.
A value of "no" indicates there are or may be such external markup declarations; if there are such declarations but there is no standalone declaration, "no" is assumed, so the attribute is typically not used.
In any event, the inclusion in the XML instance of references to external entities, such as those in an embedded DTD, does not change its standalone status.
A bigger issue associated with the standalone attribute is that of defining or setting values in any entity that may be external to the XML instance. Arguably, the principal reason for using XML is that it explicitly defines the elements it includes. If attribute values are overridden then the XML instance before us is no longer declarative.

Figure 3-21. Element Rules - Rule 6. XML Declaration (2 of 2)

Notes:
The last point may be problematic if, say, the associated DTD file is not readily available for inspection. You will see in later sections that we can override the attribute values in our XML instance from within a DTD or XML Schema file. This may not appear to be a problem at the outset, but over time we may forget that we are overriding some values. As XML instances grow in length and complexity this may become a serious source of confusion. A best practice is to design the XML instance data to contain ALL the data so that, from an internal data perspective, it does stand alone.
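To make the concern concrete, here is a hedged sketch of a DTD supplying an attribute value that never appears in the element's start tag. It uses Python's standard xml.etree.ElementTree module and an internal DTD subset, because that parser does not fetch external DTD files by default; the report element and its status attribute are invented for illustration.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset declares a default value for "status".
# The <report> start tag itself only carries "title".
document = """<?xml version="1.0"?>
<!DOCTYPE report [
  <!ATTLIST report status CDATA "draft">
]>
<report title="Q3"/>"""

report = ET.fromstring(document)

# The application sees both the written attribute and the defaulted one.
print(report.get("title"))
print(report.get("status"))
```

With an external DTD the default would come from a separate file, which is exactly the situation the note above warns about: a reader of the instance alone would not see where the value comes from.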


Comments
<!-- --> Defines a comment.

A space after the beginning and before the trailing hyphens is recommended but not required.
<?xml version="1.0"?>
<!-- This is a comment. They can go anywhere inside an XML document
     except within an element tag. -->
<book>
  <chapter>A is the first letter</chapter>
  <!-- Here is another comment. -->
  <chapter>Z is the last letter</chapter>
</book>
Improper usage:
<chapter <!-- comment -->>Some text.</chapter>
...or before the XML Declaration statement.

Figure 3-22. Comments

Notes:
Comments can go anywhere in the XML document except before the XML Declaration and inside the actual element tags. Comments are a good thing. Use them just as you would in a program.
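The placement rules for comments can be tried out directly. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the shortened book samples are illustrative only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Comment between two child elements.
between_elements = ("<book><chapter>A</chapter><!-- a note -->"
                    "<chapter>Z</chapter></book>")
# Comment after the declaration but before the root element.
before_root = '<?xml version="1.0"?><!-- a note --><book/>'
# Comment placed inside a start tag.
inside_tag = "<book><chapter <!-- a note -->>A</chapter></book>"
# Comment placed before the XML Declaration.
before_declaration = '<!-- a note --><?xml version="1.0"?><book/>'

for label, doc in [("between elements", between_elements),
                   ("before root", before_root),
                   ("inside tag", inside_tag),
                   ("before declaration", before_declaration)]:
    print(label, is_well_formed(doc))
```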


Internationalization and Encoding (1 of 2)


Support for different character encodings is provided through the encoding attribute of the XML Declaration.
<?xml version="1.0" encoding="charset"?>
The encoding attribute indicates the character encoding, and therefore the set of characters that are permitted, in the document.
In the absence of an encoding declaration, Unicode UTF-8 or UTF-16 characters may be used.
Documents exchanged via network may be presented to the processor in an encoding other than the declared encoding, as long as the transport protocol (for example, HTTP) indicates the encoding used.

Figure 3-23. Internationalization and Encoding (1 of 2)

Notes:
A good way to test that the encoding is correct is to view the XML file in IE 5.0 or later. There are two error messages you may receive from IE or from a parser:
1. An invalid character was found in text content. You will get this error message if a character in the document does not match the encoding attribute.
2. Switch from current encoding to specified encoding not supported. You will get this error message if the encoding used when saving the file differs from the encoding specified in the declaration. The common problem is that the file has been saved in a single-byte encoding while the encoding attribute specifies a double-byte encoding, or vice versa.
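Both failure modes can be reproduced without a browser. The following sketch assumes Python's standard xml.etree.ElementTree module; the helper parse_text and the sample name element are illustrative, and the error texts produced by this parser differ from the IE messages quoted above.

```python
import xml.etree.ElementTree as ET


def parse_text(data: bytes):
    """Return the root element's text, or None if the bytes do not parse."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError as exc:
        print("parser error:", exc)
        return None


declared_latin1 = '<?xml version="1.0" encoding="ISO-8859-1"?><name>Zo\u00eb</name>'
declared_utf8 = declared_latin1.replace("ISO-8859-1", "UTF-8")

# Saved encoding matches the declaration.
matching = declared_latin1.encode("iso-8859-1")
# Declaration says UTF-8, but the file was saved as ISO-8859-1:
# the byte for the accented letter is not valid UTF-8.
bad_character = declared_utf8.encode("iso-8859-1")
# Declaration says UTF-8, but the file was saved as two-byte UTF-16.
width_mismatch = declared_utf8.encode("utf-16")

print(parse_text(matching))
print(parse_text(bad_character))
print(parse_text(width_mismatch))
```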


Internationalization and Encoding (2 of 2)


It is very important that the editor and operating system used to write and save an XML document support the encoding specified in the XML Declaration.
Sample encoding declarations:
ASCII (a subset of both UTF-8 and ISO-8859-1)
  <?xml version="1.0" encoding="ISO-8859-1"?>
16-bit Unicode
  <?xml version="1.0" encoding="UTF-16"?>
  <?xml version="1.0" encoding="ISO-10646-UCS-2"?>
  ...
Japanese
  <?xml version="1.0" encoding="ISO-2022-JP"?>
  <?xml version="1.0" encoding="Shift_JIS"?>
  ...
Note: Encoding names are case-insensitive.

Figure 3-24. Internationalization and Encoding (2 of 2)


Processing Instruction
Syntax: <?target arg* ?>
Processing Instruction is often abbreviated as PI in documentation.
A feature inherited from SGML.
Used to embed application-specific instructions in documents.
The target name immediately follows "<?" and is used to associate the PI with an application.
May include zero or more arguments.
May be preceded by comments.

For example, <?xml-stylesheet href="common.css" type="text/css"?> associates a stylesheet (here common.css) with the document for simple formatting.

Figure 3-25. Processing Instruction

Notes:
If a comment is inserted between the XML Declaration and a PI such as the one shown, Studio will not consider it an error. A demo file is available in the XM301 Lectures folder, Unit 3. This PI, although useful, does NOT define a grammar for the XML document in which it is used: we will talk about grammars in subsequent chapters. To reemphasize: the XML Declaration, while it may look like a PI, is treated as special!
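An application reads a PI as a target plus a data string. The sketch below uses Python's standard xml.dom.minidom module as one possible way to inspect it; the short book document is an assumption for illustration.

```python
from xml.dom import minidom

document = """<?xml version="1.0"?>
<?xml-stylesheet href="common.css" type="text/css"?>
<book><chapter>A is the first letter</chapter></book>"""

dom = minidom.parseString(document)

# Collect the processing instructions that sit at document level.
instructions = [
    node for node in dom.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

for pi in instructions:
    # target names the application; data carries its arguments.
    print(pi.target, "->", pi.data)
```

In this sketch the XML Declaration does not show up among the collected nodes, which is consistent with the point that it only looks like a PI.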


Well-formed versus Valid


A well-formed XML document:
  Consists of XML elements that are nested within one another.
  Has a unique root element.
  Follows the XML naming conventions.
  Follows the XML rules for quoting attributes.
  Has tags that are properly terminated.
All XML parsers check for well-formedness.
A valid XML document has an associated vocabulary and obeys the structural rules specified by that vocabulary.
The associated vocabulary is typically defined by either a DTD or an XML Schema.
XML parsers may be validating or non-validating depending upon whether or not they can apply an associated grammar.
Studio is an example of a tool whose XML capabilities include validation.

Figure 3-26. Well-formed versus Valid

Notes:
All XML parsers must check XML documents for being well formed. XML parsers are classified as being validating, or non-validating.
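The difference shows up clearly with a non-validating parser. In this sketch, Python's standard xml.etree.ElementTree module (which checks well-formedness only) is given one document that breaks its own DTD and one that is not well-formed; the helper name check and both sample documents are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def check(document: str) -> str:
    """Report only what a non-validating parser can tell us."""
    try:
        ET.fromstring(document)
        return "well-formed"
    except ET.ParseError:
        return "not well-formed"


# The DTD says <course> must contain exactly one <name>.
# The instance breaks that rule, so it is not valid,
# but every tag is properly nested and terminated.
invalid_but_well_formed = """<!DOCTYPE course [
  <!ELEMENT course (name)>
  <!ELEMENT name (#PCDATA)>
]>
<course><teacher>Paul Thompson</teacher></course>"""

# <name> is never closed, so this fails the more basic check.
not_well_formed = "<course><name>Java Programming</course>"

print(check(invalid_but_well_formed))
print(check(not_well_formed))
```

Catching the first problem requires a validating parser that applies the DTD or Schema.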


HTML versus XML (1 of 2)

HTML is about presentation and browsing

XML is about structured information interchange

<course>
  <name>Java Programming</name>
  <department>EECS</department>
  <teacher>
    <name>Paul Thompson</name>
  </teacher>
  <student>
    <name>Ron Jones</name>
  </student>
  <student>
    <name>Uma Abingdon</name>
  </student>
  <student>
    <name>Lindsay Garmon</name>
  </student>
</course>

Figure 3-27. HTML versus XML (1 of 2)

Notes:
All markup tags in HTML are directed at visual composition. No consideration is given to the actual semantics of the data. XML markup tags are based solely on the data content. The result is a clean separation of data and presentation.

HTML versus XML (2 of 2)


HTML
<html>
<title>Course Roster</title>
<body>
<center>
<h1>Course Roster</h1>
<h2>XML Programming</h2>
<h3>Department: EECS</h3>
<p>
<table border=2>
  <tr>
    <th>Teacher</th>
    <td>Paul Thompson</td>
  </tr><tr>
    <th>Student<br>List</th>
    <td>Ron Jones<br>
        Uma Abingdon<br>
        Lindsay Garmon
    </td>
  </tr>
</table>
</center>
</body>
</html>

XML
<?xml version="1.0"?>
<course>
  <name>Java Programming</name>
  <department>EECS</department>
  <teacher>
    <name>Paul Thompson</name>
  </teacher>
  <student>
    <name>Ron Jones</name>
  </student>
  <student>
    <name>Uma Abingdon</name>
  </student>
  <student>
    <name>Lindsay Garmon</name>
  </student>
</course>

Figure 3-28. HTML versus XML (2 of 2)

Notes:
These two source listings really show fundamental differences between HTML and XML. While both contain text marked up by tags, their meaning is entirely different. Which would you rather parse and insert into a database?
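As a rough answer to that question, the XML listing can be turned into a database-ready record in a few lines, because each tag names the data it holds. This is a sketch using Python's standard xml.etree.ElementTree module; the dictionary layout is an assumption, not a prescribed schema.

```python
import xml.etree.ElementTree as ET

course_xml = """<?xml version="1.0"?>
<course>
  <name>Java Programming</name>
  <department>EECS</department>
  <teacher><name>Paul Thompson</name></teacher>
  <student><name>Ron Jones</name></student>
  <student><name>Uma Abingdon</name></student>
  <student><name>Lindsay Garmon</name></student>
</course>"""

course = ET.fromstring(course_xml)

# Each field is located by the name of its element, not by its position
# in a visual layout.
record = {
    "course": course.findtext("name"),
    "department": course.findtext("department"),
    "teacher": course.findtext("teacher/name"),
    "students": [s.findtext("name") for s in course.findall("student")],
}

print(record)
```

Doing the same with the HTML listing would mean guessing which heading or table cell holds which field.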


HTML and XML Key Differences


HTML
Predefined tags define how to present data.
Allows missing end tags.
  <br> and <p>
Attributes do not require quotes.
  <img src=myDog.jpeg>
Attributes do not require a value.
  <input type=radio checked>
Tolerates non-nested tags.
  <H1><center>Hello!</H1></center>
Browsers will almost always do a "best guess" on ill-formed HTML.
Does not support empty elements, but allows single start tags.
  <br> and <hr>
Is not case sensitive.
  <TABLE> ... </table>

XML
Defines its own tags to identify data.
Requires matching end tags.
  <name>test</name>
Attributes must be quoted.
  <book isbn="3432"></book>
Attributes must have a value.
  <device type="radio" />
Strict nesting and tag matching rules.
  <H1><center>Hello!</center></H1>
XML parsers will generate a fatal exception for well-formedness violations.
Provides for empty elements.
  <device type="radio" /> is valid
Is case sensitive.

Figure 3-29. HTML and XML Key Differences

Notes:
HTML has a fixed tag set. In XML there is no predefined tag set. The allowed tags in an XML document are defined in its DTD or Schema. XHTML is an effort to correct the sins of HTML's past. It is a new XML technology that consists of an HTML-specific DTD that defines the valid HTML tags. Unfortunately, at the time of writing many browsers will not recognize XHTML documents properly!
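Several of the HTML habits listed on the chart can be fed to an XML parser to see how strictly they are treated. This is a minimal sketch with Python's standard xml.etree.ElementTree module; the helper is_well_formed and the labels in the samples dictionary are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


samples = {
    "unquoted attribute": "<img src=myDog.jpeg />",
    "attribute without value": '<input type="radio" checked />',
    "overlapping tags": "<H1><center>Hello!</H1></center>",
    "case mismatch": "<TABLE></table>",
    "missing end tag": "<p>Hello<br></p>",
    "properly nested": "<H1><center>Hello!</center></H1>",
    "empty element": '<device type="radio" />',
}

results = {label: is_well_formed(doc) for label, doc in samples.items()}

for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```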


Checkpoint Questions (1 of 3)
1. Basic XML can be described as:
   A. A hierarchical structure of tagged elements, attributes and text.
   B. All the HTML tags plus a set of new XML only tags.
   C. Object-oriented structure of rows and columns.
   D. Processing instructions (PIs) for text data.
   E. Textual data with tags for visual presentation.
2. Which of these XML fragments is not well-formed?
   A. <root><class>XML</class></root>
   B. <class><root>XML</root></class>
   C. <root><class id="XML"></root>
   D. <root>XML<class id="XML"/>XML</root>
   E. <root class="XML"><class id="root"/>XML</root>

Figure 3-30. Checkpoint Questions (1 of 3)

Checkpoint Questions (2 of 3)
3. XML Comments are allowed (Select all that apply):
   A. Before the XML Declaration
   B. Anywhere
   C. Between element tags
   D. Before the root element
   E. All of the Above
4. Which of these XML elements with attributes is not well-formed?
   A. <name first='Tony' LAST="Romeo" />
   B. <name name="Tony" NAME="ROMEO" />
   C. <_name_ first-name="Tony" last-name="Romeo"/>
   D. <name="Tony Romeo" />
   E. <name name="first='Tony' last='Romeo'" />
   F. All of the Above

Figure 3-31. Checkpoint Questions (2 of 3)

Checkpoint Questions (3 of 3)
5. Which of these comments regarding HTML and XML is not true?
   A. HTML markup is focused on presentation.
   B. XML markup is based on defining the data.
   C. XML is based on HTML.
   D. HTML tags are not case sensitive.
   E. XML tags are case sensitive.
   F. Both XML and HTML support attributes.

Figure 3-32. Checkpoint Questions (3 of 3)

Unit Summary
Having completed this unit, you should be able to:
  Describe the basic rules of XML
  Describe what it means for an XML document to be well-formed
  List the components that make up an XML document
  Describe the differences between XML and HTML
  Describe the internationalization support in XML
  Describe some best practices in XML

Figure 3-33. Unit Summary

Notes:
The status of various XML technologies (W3C Activities) can be found at: http://www.w3.org/TR.

Unit 4. WebSphere Studio Application Developer Overview


What This Unit is About
This unit describes IBM WebSphere Studio Application Developer. This is an overview of the broad features and organization of this application development tool.

What You Should Be Able to Do


After completing this unit, you should be able to:
  Describe role-based development
  Describe the WebSphere Studio family of tools
  State the role of WebSphere Studio Workbench in the WebSphere Studio tools
  Describe basic features of WebSphere Studio Application Developer

How You Will Check Your Progress


Accountability: Review

References
WebSphere Studio Application Developer Help Perspective


Unit Objectives
After completing this unit, you should be able to:
  Describe role-based development
  Describe the WebSphere Studio family of tools
  State the role of WebSphere Studio Workbench in the WebSphere Studio tools
  Describe basic features of WebSphere Studio Application Developer
  Describe the major sets of tooling provided by WebSphere Studio Application Developer

Figure 4-1. Unit Objectives

Roles-based Development
Developing Web Applications requires more than just writing Java code

Role                    Workarea                   Products
Enterprise Integrator   Connection Data            JavaBeans, EJBs
Bean Provider           Business Logic Data        JavaBeans, EJBs
Application Assembler   Application Flow           Servlets, JSPs, JavaBeans
Page Producer           Page Layout and Content    HTML, JSPs, MIME Types
Web Master              Operational Environment    Configuration Data, Site Usage Metrics

Tool: WebSphere Studio Tooling
One tool, many user perspectives

Figure 4-2. Roles-based Development

Notes:
There are four distinct development roles shown here: Enterprise Integrator, Bean Provider, Application Assembler, and Page Producer. Tooling needs to support each of these roles and permit easy management and integration of the developed assets.

Development Environment Goals


Create a new Development Environment that will:
Be based on a new open, highly pluggable platform
  Unified by a new tooling platform
  Provide multilevel vendor integration
Provide a role-based development model where the assets are the focus, not the tool
Provide a common repository solution for all assets and tools
Provide rapid support for new standards and technologies
  For example, Web Services

Figure 4-3. Development Environment Goals

Notes:
The development environment should support the tasks performed by the developers. It should be configurable and customizable for each individual developer. Tools need to accommodate the rapid change in available technologies.


IBM WebSphere Studio Family


Provide a sturdy Web/Java development platform in the industry
  Open tooling and run-time support
  Open programming model
Provide in-depth Enterprise connectivity
  EJB/J2EE Tooling
  Enterprise Connectivity/Enterprise Access Builders
Provide integrated end-to-end development
  Built-in Unit Test Environment
  Incremental compilation
  Flexible debugging support
Provide a Team Development solution
  Integrated version control

Figure 4-4. IBM WebSphere Studio Family

Notes:
The IBM WebSphere Studio family name applies to a development platform, as opposed to a set of development tools.

Family Contents
WebSphere Studio Products (V5):
WebSphere Studio Application Developer
  Includes all of Site Developer functionality
  Focused on development of Web Services, JSPs, Servlets, XML, J2EE, and database applications in a team environment
WebSphere Studio Enterprise Developer
  Includes all of Application Developer functionality
  Focused on Enterprise Integration using the J2EE Connector Architecture
  Supports integrated development of z/OS based CICS, IMS and batch applications

Figure 4-5. Family Contents

Notes:
The flagship products in the WebSphere Studio brand (Version 5) are WebSphere Studio Application Developer and WebSphere Studio Enterprise Developer.

WebSphere Studio Workbench


Workbench is:
  Not a tool, not a product, not for sale
  A portable, universal tool platform and integration technology
  The basis for an open source project
Workbench has:
  Frameworks and services that enable tool builders to focus on tool building
  Tools to help tool builders build tools
    Java Development Tools (JDT)
    Plug-in Development Environment (PDE)

Figure 4-6. WebSphere Studio Workbench

Notes:
The Workbench is not a tool, that is, it is not in itself a product that is for sale. It is an open and portable tool platform providing an integration technology. The Workbench can be thought of as a set of Java frameworks and a set of development tools geared for tool builders.


WebSphere Studio Workbench Rationale


End-users (Web application developers)
  No more on-site integration, tools just work together
  Common, easy-to-use interface
  Common code, project, file management system
  Same tool platform regardless of development role
  Same look and feel regardless of tool vendor
Tool Builders
  Seamless integration and interoperability with IBM AD tools and WebSphere Software Platform
  Seamless integration with other Workbench tools
  Enterprise ready, off the shelf
    Globalization, distributed debug, Team, SCM
  Easy construction and deployment platform for tools
  Open access to source code and tool provider community

Figure 4-7. WebSphere Studio Workbench Rationale

Notes:
The Workbench offers its greatest support to tool builders, making it easy to add plug-ins (tools) to the overall IDE. This allows quick "time-to-market" for tools supporting emerging technologies. The underlying framework, which adds to the tool builders' productivity, gives end-users a common look and feel.

WebSphere Studio Application Developer

Start the WebSphere Studio Application Developer
  Start -> Programs -> IBM WebSphere Studio -> Application Developer 5.1
Workbench opens when you launch Application Developer
Within the workbench, open the perspectives, views, and editors

Figure 4-8. WebSphere Studio Application Developer

Terminology
[Screenshot of the workbench window with labeled areas: Shortcut Bar, Editor, Navigator Pane, Source Pane, Outline Pane, Task Sheet, Views]

Figure 4-9. Terminology

Notes:
The workbench window displays one or more perspectives that contain views and editors. You can quickly switch between perspectives and views using the shortcut buttons which appear on the shortcut bar.


Perspectives
A group of related views and editors
To open a Perspective:
  Select via Window -> Open Perspective
Some Perspectives:
  Java: to develop and test Java programs
  Server: to configure, run, and manage test servers
  Debug: to control debug flow, see variables, and so forth

Figure 4-10. Perspectives

Views
A view displays specialized information.
  For example: Bookmarks view displays all bookmarks in the workbench.
A view might appear alone in a single pane, or several views might be stacked within a single tabbed pane.
Views can be undocked/docked from the main workbench window.
Information updates on a view are saved immediately.
View toolbars apply only to the particular view in which they appear.

Figure 4-11. Views

Notes:
Views support editors and provide alternative presentations or navigation of the information in your workbench. For example, the Navigator displays projects and other resources you are working with. A view might appear by itself, or stacked with other views in a tabbed notebook. On Windows platforms, views can be undocked from the main workbench window and appear as floating windows on the desktop. Undocked views can also be docked back into the main workbench window. More info on the Application Developer menu: Help --> Navigating Workbench


Editors
An editor is used to edit or browse a resource.
Modifications made in the editor follow an open-save-close life cycle.
An editor can contribute to the Workbench menu bar.
Examples:
  Java Source Editor
  Web Deployment Descriptor Editor
  Web Site Configuration Editor
  JSP Editor
  WSDL Editor

Figure 4-12. Editors

Notes:
The key thing to note about editors is the open-save-close life cycle. You must explicitly save the corresponding resource after making changes.

Online Help
To learn more about the Workbench, select Help -> Help Contents, then:
  Select Application Developer information
  Select Getting Started
  Select Workbench Fundamentals and Tutorial: Workbench Basics

Figure 4-13. Online Help

Notes:
Tips: F1 displays an information pop-up for the selected task. To hide the navigation frame, click the Hide Navigation button on the Help view's toolbar. Note: Your product may include more than one information set (a collection of documentation topics). When you run a search, only the current information set is searched. The current information set is shown in the drop-down list at the top of the Help view. To search another information set, select it from the list, and run the search again.

Cheat Sheets
Guide the developer through an application development process
Sequence of documented steps with relevant documentation
Displayed in a workbench pane
Task-related tools are automatically launched or have launch icons in the cheat sheet
Launched via Help -> Cheat Sheets

Figure 4-14. Cheat Sheets

Application Developer Design Points


Performance
Customizable Perspectives
  Promote role-based development (Web Developer, Java Developer, DBA, and so forth)
  Reduces the learning curve
  Perspectives use same project artifacts regardless of perspective being used
Pluggable development environment
  Java and ActiveX plug-in support
  IBM and ISVs use same plug-in architecture to extend the Workbench
Support for automated builds
  Apache.org "Ant" support
  Command-line EJB generation

Figure 4-15. Application Developer Design Points

Notes:
Reduced learning curve through the consolidation of tooling to one platform. For example, with customizable perspectives, one could customize Application Developer to look similar to other Java IDEs.


Tooling
Java IDE
J2EE Tooling
Portlet Tooling
Data Tooling
Web Tooling
XML Tooling
Performance / Trace Tooling
Team Development Tooling
Web Services Tooling

Figure 4-16. Tooling

Java IDE (1 of 3)
Ships with SDK 1.3
Pluggable JRE Support
  Defined at project and workbench level
Hot Method Replace
  Dynamically replace Java classes during debug
  Enabled when Application Server V5 runs in debug mode
Java Snippet Support (Scrapbook)
Task Sheet (All Problems Page)
Code Assist
Refactoring Support
  Rename/move support for method/class/package
  Fix all dependencies for renamed element
  With and without preview

Figure 4-17. Java IDE (1 of 3)

Notes:
A default JRE can be selected for the Workbench with Window -> Preferences. A project-specific JRE is selected in the Launch Configuration dialog. For more on hot method replace, refer to the foil at the end of the unit.

Java IDE (2 of 3)
Faster IDE
  Smart Compilation
  No lengthy compile/build/run steps
Pluggable Framework, in-place tool launching
Running class/code with errors
Precise reference searching
  Text and Java-based
JDI-based debugger for local/remote debugging
Run code with errors
Multiple test environments can be configured
J2EE WAR/EAR Deployment

Figure 4-18. Java IDE (2 of 3)

Notes:
JDI: Java Debugging Interface. The JDI is a high-level Java API providing information useful for debuggers and similar systems needing access to the running state of a Java virtual machine.


Java IDE (3 of 3)
UML Class Diagram Editing and Visualization Support for Java classes and EJB components Diagrams generated from existing classes/components New diagrams built and used to develop corresponding component Typical Class Diagram Editor operations: Create classes, packages, and interfaces Create extends and implements relationships Create methods and fields Refactor components Add EJB relationships Add EJBQL queries Add CMP fields to a primary key UML Class Diagrams can be exported

Figure 4-19. Java IDE (3 of 3)

Notes:
Starting with V5.1, Application Developer adds support for UML visualization. You can select existing components and have the system generate the UML diagrams, start with a blank diagram and develop components from the diagram, or use a combination of the two approaches. These features help developers understand existing components by producing UML that represents them, and also assist in generating components based on the UML diagrams. The entire class diagram or portions of it may be exported in bmp, jpg, or gif image formats.


J2EE Tooling (1 of 2)
J2EE 1.3
  EJB 2.0 Support
  Servlet 2.3, JSP 1.2 Support
J2EE Perspective provides views and editors for EJB/Servlet/JSP Developer
Object-relational Mapping for EJBs
  Top-down/Bottom-up/Meet-in-the-middle
All metadata exposed as XMI
  No hidden metadata
EAR and WEB Deployment Descriptor Editors
  Forms-based (no need to directly edit XML)
  Source view also available
Struts Support
  Web Diagram visual editor for application design

Figure 4-20. J2EE Tooling (1 of 2)

J2EE Tooling (2 of 2)
Connector Projects
  J2EE Connector Architecture (JCA) based
EJB Test Client
  Universal Test Client
  HTML-based
  J2EE programming model
  Built-in JNDI registry Browser
Unit Test Environment for J2EE
  WebSphere Application Server V4 or V5 and Apache Tomcat
  Create multiple projects with different Server configurations/instances
    Allows for versioning of unit test environment
  Share Unit Test Environment Configuration across developers

Figure 4-21. J2EE Tooling (2 of 2)

Notes:
WebSphere Studio provides a Web-based Universal Test Client where you can test your Enterprise JavaBeans (EJBs) and other objects. Using this test client, you can test the home and remote interface methods of your enterprise beans. By calling the methods and passing user-defined arguments you can test methods to ensure that they work correctly.


Portlet Tooling
Wizards to create Portlet Application
Management of Deployment Descriptors
  web.xml
  portlet.xml
Multiple portlets per application
Integrated development and test environment
  Full use of debugger
  Test on remote server or integrated unit test environment
Export deployable WAR file

Figure 4-22. Portlet Tooling

Notes:
There are actually two related plug-ins. The first, WebSphere Portal Toolkit, ships with all offerings of WebSphere Portal V4.x. The second, WebSphere Everyplace Toolkit, ships with WebSphere Everyplace Server. The test environment interacts with a developer configuration of WebSphere Portal Server running on WebSphere Application Server Advanced Single Server Edition (AEs). This is facilitated by the Remote WebSphere Server configuration.


Data Tooling (1 of 2)
Data Perspective
  Provides views geared for DBAs to:
    Create Databases
    Create Tables/Views/Indexes/Keys
    Generate DDL
    Connect to and view existing relational database objects
  Online and off-line support for working with databases
  Metadata generated as XMI
SQL Query Builder and SQL Wizards
  Visually construct SQL statements
    SELECT, INSERT, UPDATE, DELETE supported
  Metadata generated as XMI
SQL/XML mapping

Figure 4-23. Data Tooling (1 of 2)

Data Tooling (2 of 2)
DB2 Stored Procedures
  Create / Build and Register / Debug / Drop a stored procedure or User Defined Function (UDF)
  SQL or Java-based
SQLJ Files
  Create / Build / Debug SQLJ
  Workbench runs SQLJ translator and builds Java files

Figure 4-24. Data Tooling (2 of 2)

Web Tooling (1 of 2)
Web Site Designer
  Provides site-level views of Web project
  Graphical and detail tabular views of site structure
Page Designer
  Provides page-level view of Web project components
  HTML and JSP editing
  WYSIWYG page design, source editing and page preview
Choice of static or dynamic Web project
  Appropriate tool support loaded at project creation time
Palette View
  Provides drawers of useful items for HTML and JSP creation
  Items are dragged and dropped onto page editor

Figure 4-25. Web Tooling (1 of 2)

Notes:
The Web Site Designer is new with 5.1. The configuration of the entire Web site is maintained in the Web Site Configuration object. The choice of static or dynamic web sites and the Palette view are also newly introduced in release 5.1. Examples of the drawer labels in the Palette view are: HTML, Free Layout, JSP, Java Server pages, and Site Parts. The Site Parts include items such as Vertical and Horizontal Navigation Bars, which help to maintain consistency in the look and feel of pages across the site.


Web Tooling (2 of 2)
Multiple markup types (WML, cHTML) and pervasive device support
Built-in Servlet, Database, and JavaBean Wizards
Built-in JSP Debugging
Site Style Sheet and Page Template Support
Links View
  View HTML/JSP and all links referenced in page
Parsing and link management updates links when resources are renamed or moved
Jakarta JSP Taglibs
  Specify in project Properties or New Project to include
  Available: Standard Tag Library (JSTL), accessing JSP objects, database access, internationalization, utilities

Figure 4-26. Web Tooling (2 of 2)

XML Tooling (1 of 3)
XML Tooling provides integrated tools/perspectives to create XML-based components:
XML Source Editor
  DTD/Schema validation
  Code Assist for building XML documents
DTD Editor
  Visual tooling for working with DTDs
  Create DTDs from existing documents
  Generate an XML Schema from a DTD
  Generate JavaBeans for creating/manipulating XML documents
  Generate an HTML form from a DTD
XML Schema Editor
  Visual tooling for working with an XML Schema

Figure 4-27. XML Tooling (1 of 3)

XML Tooling (2 of 3)
XSL Editor
  Edit/create and validate XSL
XSL Debug and Transformation Tool
  Trace XSL transformation
  Examine relationships between the result node, the template rule, and the source node
XML to/from Relational Databases
  Generate XML, XSL, XSD from an SQL Query
RDB/XML Mapping Editor
  Map columns in a table to elements and attributes in an XML document
  Generate a Document Access Definition (DAD) script to compose/decompose XML documents to/from a database
  DAD is used with DB2 XML Extender

Figure 4-28. XML Tooling (2 of 3)

XML Tooling (3 of 3)
XPath Expressions Wizard
  Create XPath expressions
XML to XML Mapping Editor
  Map one or more source XML files to a single target

Figure 4-29. XML Tooling (3 of 3)

Notes:
XPath expressions can be used to search through XML documents, extracting information from the nodes (such as an element or attribute).
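
To make the idea concrete outside the wizard, here is a minimal sketch using Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The <employees> document, the second employee, and the id values are made up for illustration; they are not part of the course labs.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this illustration.
DOC = """<employees>
  <employee id="X04913"><name>John Smith</name></employee>
  <employee id="X04914"><name>Mary McGoon</name></employee>
</employees>"""

root = ET.fromstring(DOC)

# Select element nodes: every <name> that is a child of an <employee>.
names = [e.text for e in root.findall("./employee/name")]

# Select by attribute value: the <name> of the employee whose id is X04914.
match = root.find("./employee[@id='X04914']/name")

# Extract attribute values from every <employee> that carries an id attribute.
ids = [e.get("id") for e in root.findall(".//employee[@id]")]

print(names)
print(match.text)
print(ids)
```

The three expressions correspond to the kinds of searches mentioned above: locating elements by path, filtering on an attribute, and pulling attribute values out of the matched nodes.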


Performance/Trace Tooling
Built-in tooling helps developers isolate and fix performance problems with their Web applications
Profiling and Logging Perspective allows developers to:
  Attach to local/remote agents for capturing performance data
  JVM Monitoring
    Heap
    Stack
    Class/Method details
    Object References
  Resource Monitors
    Execution patterns
    CPU usage
    Disk usage

Figure 4-30. Performance/Trace Tooling

Team Development
Workbench integration occurs through a pluggable, adapter-based design:
  A published framework API allows any SCM provider to add an adapter to integrate their SCM into the Workbench
Application Developer ships with
  CVS Plugin
  ClearCase LT Plugin

Figure 4-31. Team Development

Web Services Tooling (1 of 2)


Tools to Construct Web Services:
Discover
  Browse UDDI registry to locate Web Service (Web Services Explorer)
  Generate JavaBean proxy for existing Web Services
Create / Transform
  Create new Web Services from JavaBeans, databases
Build
  Wrap existing artifacts such as SOAP and HTTP GET/POST accessible services
  Generate Java client proxy to Web Services
Maintain Web Services Description Language (WSDL) files (WSDL Editor)
  Create new WSDL files
  Create ports, port types, messages, bindings, operations, types within WSDL files
  Validate new and existing WSDL files

Figure 4-32. Web Services Tooling (1 of 2)

Web Services Tooling (2 of 2)


Tools to Construct Web Services:
Deploy
  Deploy Web Services to WebSphere or Tomcat Servers
Test
  Built-in test client allows for immediate testing of local and remote Web Services
Publish
  Publish Web Services to a UDDI Registry

Figure 4-33. Web Services Tooling (2 of 2)

Standards Support
EJB 2.0
J2EE 1.2 and 1.3
Servlet 2.3
JSP 1.2
JRE 1.3
Web Services Description Language (WSDL) 1.1
Web Services Interoperability (WS-I) Basic Profile 1.0
Apache SOAP 2.3
XML DTD 1.0 10/2000 Revision
XML Namespaces 1/99 Version
XML Schema 5/2001 Version
HTML 4.01 (other levels should work)
CSS2 (PageDesigner displays a subset)

Figure 4-34. Standards Support

Review
Name some of the roles in Web application development.
What is the name of the Application Developer perspective you would usually use for EJB development?
Compare and contrast View, Editor, and Perspective.
Name the SCM tools that ship with Application Developer.

Figure 4-35. Review

Unit Summary
Having completed this unit, you should be able to describe:
  The concept of Role-Based Development
  The WebSphere Studio Family
  The WebSphere Studio Workbench in the context of WebSphere Studio products
  Basic features of WebSphere Studio Application Developer
  Major tooling sets provided by WebSphere Studio Application Developer

Figure 4-36. Unit Summary

Unit 5. Document Type Definition (DTD)


What This Unit is About
This unit covers XML 1.0 DTDs, which provide a way to define the structure of an XML document. DTDs provide an additional level of syntactic checking.

What You Should Be Able to Do


After completing this unit, you should be able to:
  Describe the reasons for using DTDs
  Define well-formed versus valid documents
  Define the grammar rules for an XML document using DTDs
  Describe the difference between non-validating and validating processors
  Describe examples of DTDs being used in business
  Describe best practices used in DTDs
  Define the limitations of DTDs
  Describe the status of the DTD in the industry

How You Will Check Your Progress


Accountability:
  Checkpoint
  Machine exercises


Unit Objectives
After completing this unit, you should be able to:
  Describe the reasons for using DTDs
  Define well-formed versus valid documents
  Define the grammar rules for an XML document using DTDs
  Describe the difference between non-validating and validating processors
  Describe examples of DTDs being used in business
  Describe best practices used in DTDs
  Define the limitations of DTDs
  Describe the status of the DTD in the industry

Figure 5-1. Unit Objectives

Review: Well-Formed XML


Has the optional first line; required if encoding is not UTF-8 or UTF-16.
  <?xml version="1.0"?>
Matching start and end element tags with correct syntax.
  <tag>data</tag>
Defines attributes within start tag and quotes correctly.
  <tag attribute="x">data</tag>
Correct nesting of elements.
  <employee>
    <name>John Smith</name>
    <id>X04913</id>
  </employee>
...and Single Root and XML naming constraints
These are simple constraints on the structure of an XML document.

Figure 5-2. Review: Well-Formed XML

Notes:
This is a quick review of the important rules for XML well-formedness. It's important to recognize that the well-formedness rules are very simple.
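
Any XML parser enforces these rules, so they are easy to try out. The following is a minimal sketch using Python's standard xml.etree.ElementTree parser, which is non-validating; the helper name is_well_formed and the sample strings are our own, chosen to mirror the rules on the foil.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One sample per rule reviewed above; labels and samples are illustrative.
samples = {
    "matching tags": "<tag>data</tag>",
    "quoted attribute": '<tag attribute="x">data</tag>',
    "mismatched tags": "<tag>data</tags>",
    "unquoted attribute": "<tag attribute=x>data</tag>",
    "bad nesting": "<employee><name>John Smith</employee></name>",
    "two roots": "<a/><b/>",
}

for label, text in samples.items():
    print(f"{label:20} {is_well_formed(text)}")
```

Note that a parser of this kind only answers the well-formedness question. It says nothing about whether the document contains the right elements in the right order, which is the gap the rest of this unit addresses.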


Why Do We Need DTDs?


What if we want some additional constraints:
  <message urgent="yes">
    <greeting>hi</greeting>
    <farewell>bye</farewell>
  </message>
  Can only have two specific children (greeting, farewell).
  The greeting child must precede the farewell child.
  Message may have an optional urgent attribute.
What if we want to define and publish the structure an XML document is to conform to?
What if we want the computer to be able to verify that an XML document meets these kinds of constraints?
What if we want to have reusable pieces of text between two XML documents?

Figure 5-3. Why Do We Need DTDs?

Notes:
The difficulty with well-formedness is that the rules are very simple. Quite often we want to express more complicated constraints, such as:
  The element <message> can only have two children, <greeting> and <farewell>, and the two children must appear in that order.
  The element <message> may have an optional urgent attribute.
What if we want the computer to be able to verify that an XML document meets these kinds of constraints? What if we want to have reusable pieces of text between two XML documents?


What Is a DTD?
Blueprint of a document's structure.
Contains a series of declarations.
DTDs
  Can be a separate file from the XML document.
  Can be embedded within the XML file.
  Can be split between a separate file and the XML file.
DTDs define:
  The elements that can or must appear.
  How often the elements can appear.
  How the elements can be nested.
  Allowable, required and default attributes.
But note: the use of DTDs is optional.
An XML document that obeys the rules in a DTD is said to be valid.

Figure 5-4. What Is a DTD?

Notes:
A Document Type Definition is essentially the framework or skeleton of an XML document. It defines which elements are allowed, which attributes are allowed for each element, and whether such elements or attributes are required or optional. XML Schemas (often referred to as Schemas) extend the functionality of the DTD by adding data typing and other enhancements. An XML document that conforms to its specified DTD or XML Schema is said to be valid. The DTD can be a separate file or it can also be embedded in the XML file. In fact, the DTD contents can be split across an external file and the XML file.
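
The three placements (separate file, embedded, or split) all show up in the document's DOCTYPE declaration. As a rough sketch, Python's standard xml.parsers.expat module reports what a DOCTYPE declares without fetching any external file. The file names address.dtd and greeting.dtd, the lang attribute, and the helper doctype_info are assumptions made for this illustration.

```python
import xml.parsers.expat

def doctype_info(text):
    """Report the DOCTYPE of a document: root name, external DTD, internal subset."""
    info = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info.update(root=name, system_id=system_id,
                    internal=bool(has_internal_subset))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(text, True)
    return info

# DTD in a separate file (hypothetical file name; the file is never read here).
external = '<!DOCTYPE address SYSTEM "address.dtd"><address/>'
# DTD embedded in the XML file.
internal = ('<!DOCTYPE greeting [<!ELEMENT greeting (#PCDATA)>]>'
            '<greeting>hi</greeting>')
# DTD split between a separate file and the XML file.
split = ('<!DOCTYPE greeting SYSTEM "greeting.dtd" '
         '[<!ATTLIST greeting lang CDATA #IMPLIED>]><greeting>hi</greeting>')
# No DTD at all: still well-formed, because the use of DTDs is optional.
no_dtd = '<greeting>hi</greeting>'

for label, doc in [("external", external), ("internal", internal),
                   ("split", split), ("no DTD", no_dtd)]:
    print(label, doctype_info(doc))
```

Because expat is a non-validating parser, this only inspects the declaration; it does not check the document against the DTD.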


What Is Allowed in a DTD? (1 of 2)


Element type declaration <!ELEMENT . . .>
  A syntax for formally describing what an element type is and what type of data it can contain. Its basic format is: <!ELEMENT name (content-model)>, where name is the element type and (content-model) is the type of data the element can contain. Many pages follow to more fully explain "content."
Attribute list <!ATTLIST . . .>
  A list of attributes for an element. Attribute lists enable you to group together all related attributes for an element. All elements must have their attributes listed in an attribute list.
Attribute declaration
  A syntax for formally declaring what an attribute type is and what type of data it can contain. Its basic format is: attribute-name attribute-type default (written inside the <!ATTLIST . . .> declaration), where attribute-type is the type of data the attribute can contain and default is the default value of the attribute. This is the syntax each attribute in the attribute list must follow.

Figure 5-5. What is Allowed in a DTD? (1 of 2)

Notes:
Similar material can also be found in the WSAD IE 5.1 help file for DTD. This page and the next list the declarations you may use in a DTD file.


What Is Allowed in a DTD? (2 of 2)


Entity <!ENTITY . . .>
  A shortcut used to represent complex strings or symbols that would otherwise be impossible, difficult or repetitive to include by hand. There are built-in or predefined ENTITYs, too.
Notation <!NOTATION . . .>
  A means of associating a binary description, typically stored external to the DTD or XML file, with an entity or attribute. For example: to include an image such as a GIF or JPEG image.
Comments: No change: <!-- whatever is legal -->
  A note that is visible to the DTD's author, but is ignored by the XML parser.

Figure 5-6. What is Allowed in a DTD? (2 of 2)

XML and DTD Example


<?xml version='1.0'?>
<address>
  <name>
    <title>Mrs.</title>
    <first-name>Mary</first-name>
    <last-name>McGoon</last-name>
  </name>
  <street>1401 Main Street</street>
  <city>Sheboygan</city>
  <state>WI</state>
  <zip>38472</zip>
  <country>USA</country>
</address>

<!ELEMENT address (name, street+, city, state, zip?, country)>
<!ELEMENT name (title?, first-name, last-name)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT first-name (#PCDATA)>
<!ELEMENT last-name (#PCDATA)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT zip (#PCDATA)>
<!ELEMENT country (#PCDATA)>

Figure 5-7. XML and DTD Example

Notes:
Here's a simple example of an XML document, followed by the DTD rules that describe it. We're not going to go into the details of the rules right here; that's what the rest of this unit is about. We just wanted you to have an idea of how an XML file and its related DTD might look.
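
To see the two pieces working together, the sketch below embeds the same DTD as an internal subset and uses Python's standard xml.parsers.expat module to list the declared element types and compare them with the elements actually used. This is only a sanity check under our own assumptions (expat does not validate, and the variable and handler names are ours), not a substitute for a validating parser.

```python
import xml.parsers.expat

# The address example from the foil, with the DTD embedded as an internal subset.
DOC = """<?xml version="1.0"?>
<!DOCTYPE address [
  <!ELEMENT address (name, street+, city, state, zip?, country)>
  <!ELEMENT name (title?, first-name, last-name)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT first-name (#PCDATA)>
  <!ELEMENT last-name (#PCDATA)>
  <!ELEMENT street (#PCDATA)>
  <!ELEMENT city (#PCDATA)>
  <!ELEMENT state (#PCDATA)>
  <!ELEMENT zip (#PCDATA)>
  <!ELEMENT country (#PCDATA)>
]>
<address>
  <name>
    <title>Mrs.</title>
    <first-name>Mary</first-name>
    <last-name>McGoon</last-name>
  </name>
  <street>1401 Main Street</street>
  <city>Sheboygan</city>
  <state>WI</state>
  <zip>38472</zip>
  <country>USA</country>
</address>"""

declared = []  # element type names, in declaration order
used = []      # element names, in document order

def on_element_decl(name, model):
    declared.append(name)

def on_start_element(name, attrs):
    used.append(name)

parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.StartElementHandler = on_start_element
parser.Parse(DOC, True)

undeclared = sorted(set(used) - set(declared))
print("declared:", declared)
print("used but not declared:", undeclared)
```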


What Is Allowed. . .Declaring Elements


Syntax:
  <!ELEMENT elementName (contentModel)>
An element declaration in the DTD, and the corresponding element in the XML document:
  Declaration (DTD):
    <!ELEMENT greeting (#PCDATA)>
  Corresponding valid XML fragments:
    <greeting>Hello, World!</greeting>
    <greeting> <![CDATA[G'day!]]> </greeting>

Figure 5-8. What Is Allowed. . .Declaring Elements

Notes:
Here's our introduction to declaring elements. An element declaration begins with <!ELEMENT followed by the name of the element being declared and then the content model for the element. Here's a sample declaration for an element called greeting that accepts #PCDATA (text), along with two <greeting> elements that are valid according to this declaration. The second <greeting> element is using a CDATA section to quote its contents.

Remember, element names must start with a letter or underscore; letters, digits, periods, hyphens, and underscores may follow the first character (while technically legal, an underscore-period combination is not recommended). Names beginning with the letters xml (regardless of case) are reserved by the W3C; xsl, xsi, and xsd are conventional namespace prefixes and are best avoided as well, and future development may reserve other "x--" prefixes. The colon character is also reserved (see the unit on Namespaces).

#PCDATA (parsed character data) indicates that only text and entities can be included in the element. This data will be examined by the parser for entities and markup. Parsed character data cannot contain literal "&" or "<" characters, and ">" is best avoided too; these need to be represented by their respective entities (refer to the slide Built-in Entities).
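
The two ways of getting special characters into #PCDATA content, entity references and CDATA sections, can be compared with a short sketch. It uses Python's standard xml.sax.saxutils.escape helper and the ElementTree parser; the sample text is made up for illustration.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Text containing characters that cannot appear literally in parsed character data.
raw = "Fish & Chips <half price>"

# Option 1: replace the special characters with the built-in entities.
escaped = escape(raw)
element = ET.fromstring("<greeting>" + escaped + "</greeting>")

# Option 2: quote the text in a CDATA section, as in the second <greeting> above.
cdata = ET.fromstring("<greeting><![CDATA[" + raw + "]]></greeting>")

print(escaped)
print(element.text)
print(cdata.text)
```

Either way, the application reading the document sees the same character data; the difference is only in how the text is written in the XML file.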

Element Content Models


The content of an element is described by a content specification.
Types of DTD Content Models
  EMPTY
  ANY
  Elements
  Mixed
  Text content (#PCDATA)

Figure 5-9. Element Content Models

Notes:
The content is the stuff in between the element's start and end tag. There are four types of content models in XML 1.0 DTDs.
Types of DTD Content models
  EMPTY
  ANY
  Element only - this includes child elements
  Mixed - this includes child elements and text


EMPTY Content Model


Element is to have no data. It may have attributes.
Declaration (DTD):
  <!ELEMENT placeholder EMPTY>
Valid XML:
  <placeholder></placeholder>
  or
  <placeholder/> (preferred convention)

Figure 5-10. EMPTY Content Model

Notes:
The EMPTY content model is used for an element that will have no content whatsoever. Note that such an element may have as many attributes as it likes. To specify the EMPTY content model, provide the word EMPTY for the content model. The two examples on the foil show two elements that are valid with an empty content model. Empty elements are not much use unless they have attributes. We'll learn more about declaring attributes in a bit. An EMPTY element can be very useful for testing snippets of XML. There is an example of this later in this chapter.


ANY Content Model


Can contain ANY data or well-formed XML.
Elements you use must be declared in DTD.
Declaration (DTD):
  <!ELEMENT universe ANY>
  <!ELEMENT galaxy (#PCDATA)>
Valid XML fragments:
  <universe/> or <universe></universe>
  <universe>the whole universe</universe>
  <universe>
    <galaxy>galaxy1</galaxy>
  </universe>

Figure 5-11. ANY Content Model

Notes:
Contrary to what you might expect, the ANY content model does not allow you to put anything you like between the start and end tag. When you use the ANY content model, whatever you supply must be well-formed XML if it contains markup. Moreover, the elements that you use must be declared in the DTD as well. So for the third example on the foil, the <galaxy> element must be declared in the DTD for the document. To specify the ANY content model, provide the word ANY for the content model.


Elements Content Model


The elements content model is specified by content model particles. Content model particles are element names, as represented by a and b, and the occurrence indicators below.

<!ELEMENT name (particle structure) >

Particle        Syntax
sequence        <!ELEMENT name (a,b)>
choice          <!ELEMENT name (a|b)>
one             <!ELEMENT name (a)>
one or more     <!ELEMENT name (a)+>
zero or more    <!ELEMENT name (a)*>
zero or one     <!ELEMENT name (a)?>

Note: a or b may be a composite particle, that is, a = (c,d)


Figure 5-12. Elements Content Model

Notes:
If the content of an element consists solely of child elements, the element is said to have element content. The element content model is specified by content model particles that are combinations of either element names or other content model particles. The table describes the operators that can be used to form these combinations. In the table, a or b can be either content particles or element names. To create the content model of a followed by b, use the comma (,). To create the content model of a or b, use the vertical bar (|). To repeat a content particle at least once, use the (+). To repeat a content particle zero or more times, use the (*). To allow a content particle to be absent or present exactly once, use the (?).
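
Because these operators behave much like regular-expression operators, one way to build intuition is to translate a content model into a regular expression over the names of an element's children. The sketch below does this in Python; the function names model_to_regex and children_match are our own, it handles element content only (not EMPTY, ANY, or mixed content), and it is a teaching aid rather than how a validating parser is actually implemented.

```python
import re

def model_to_regex(model: str) -> re.Pattern:
    """Translate an element-content model such as '(a, b+, c?)' into a regex
    that matches a sequence of child names, each followed by one space."""
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.\-]*|[(),|?*+]", model):
        if token == ",":
            continue                      # sequence is plain concatenation
        elif token == "(":
            parts.append("(?:")           # group a composite particle
        elif token in (")", "|", "?", "*", "+"):
            parts.append(token)           # same meaning in regex syntax
        else:
            parts.append("(?:" + re.escape(token) + " )")  # one child element
    return re.compile("".join(parts))

def children_match(model: str, children: list) -> bool:
    """Check a list of child element names against a content model."""
    sequence = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(sequence) is not None

# a followed by b, in either order (choice of two sequences)
print(children_match("((fname,lname)|(lname,fname))", ["lname", "fname"]))
# one or more, then exactly one, then zero or one
print(children_match("(order-item+,delivery-address,order-date?)",
                     ["order-item", "order-item", "delivery-address"]))
# the required order-item is missing
print(children_match("(order-item+,delivery-address,order-date?)",
                     ["delivery-address"]))
```

The content models used here are the ones that appear in the examples on the following pages, so the results can be compared with the valid and invalid fragments shown there.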


Elements Content Examples (1 of 3)


Declaration:
  <!ELEMENT person ((fname,lname)|(lname,fname))>
  <!ELEMENT fname (#PCDATA)>
  <!ELEMENT lname (#PCDATA)>

Valid XML:
  <person>
    <lname>Smith</lname>
    <fname>John</fname>
  </person>

Also valid XML:
  <person>
    <fname></fname>
    <lname>Smith</lname>
  </person>

And also valid XML:
  <person>
    <fname/>
    <lname>Smith</lname>
  </person>

Figure 5-13. Elements Content Examples (1 of 3)

Notes:
The first example specifies that <person> has a content model that accepts an <fname> followed by an <lname> or an <lname> followed by an <fname>. The matches show all the possible permutations.


Elements Content Examples (2 of 3)


Declaration:
  <!ELEMENT order (order-item+,delivery-address,order-date?)>
  <!-- Child elements defined as containing #PCDATA -->

Valid XML fragments:
  <order>
    <order-item>item1</order-item>
    <delivery-address>123 State Street</delivery-address>
  </order>

  <order>
    <order-item>item3</order-item>
    <order-item>item4</order-item>
    <delivery-address>123 State Street</delivery-address>
  </order>

  <order>
    <order-item>item5</order-item>
    <order-item>item6</order-item>
    <delivery-address>123 State Street</delivery-address>
    <order-date>July 5, 2001</order-date>
  </order>

Figure 5-14. Elements Content Examples (2 of 3)

Notes:
The second example specifies that an <order> is a sequence of at least one <order-item> followed by a <delivery-address>, followed by an optional <order-date>. The valid XML shows:
1. One <order-item>, a <delivery-address> and no <order-date>.
2. Two <order-item> elements, a <delivery-address> and no <order-date>.
3. Two <order-item> elements, a <delivery-address> and an <order-date>.


Elements Content Examples (3 of 3)


Declaration:
  <!ELEMENT phonebook (page)+>
  <!ELEMENT page (heading, (entry|advert)+)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT entry (#PCDATA)>
  <!ELEMENT advert (#PCDATA)>

Valid XML fragment:
  <phonebook>
    <page>
      <heading>The whole town</heading>
      <entry>John Smith, 555-1212</entry>
      <advert>Fred's Fish n' Chips - 123-4567</advert>
    </page>
  </phonebook>

Invalid XML fragments:
  <phonebook><page><entry/><entry/></page></phonebook>
  <phonebook><page/></phonebook>

Figure 5-15. Elements Content Examples (3 of 3)

Notes:
This example says that a <phonebook> is one or more <page> elements, and that each <page> is a <heading> followed by at least one <entry> or <advert>; entries and adverts may be repeated and may appear in any order. The valid XML shows one <page> containing a <heading>, an <entry> and an <advert>. The first invalid fragment is invalid because its <page> has no <heading>. The second is invalid because an empty <page> has neither the required <heading> nor any <entry> or <advert>.

Mixed Content Model


Mixed content: elements that contain character data optionally interspersed with child elements.
Two cases of declarations:
<!ELEMENT product (#PCDATA)>              (character data)
<!ELEMENT review (#PCDATA | product)*>    (character data + child)
Valid XML fragments:
<review>review text goes here</review>
<review>This is a review of some <product>car</product> that goes on for pages of regular text.</review>

Figure 5-16. Mixed Content Model

Notes:
Elements that have the mixed content model can contain (parsed) character data, optionally interspersed with child elements. A mixed content declaration can specify which child elements may appear, but it cannot constrain their order or number: the listed elements may appear in any order and any number of times. The valid XML shows:
1. An element with character data content only.
2. An element with one child element interspersed with the character data.
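To see how mixed content reaches an application, the following Python sketch parses the second valid fragment with the standard library's ElementTree and lists the character data and child element in document order. It is only an illustration of one parser API; the variable name pieces is hypothetical.

```python
import xml.etree.ElementTree as ET

review = ET.fromstring(
    "<review>This is a review of some <product>car</product> "
    "that goes on for pages of regular text.</review>"
)

# Character data before the first child is in .text; character data that
# follows a child element is in that child's .tail.
pieces = [("text", review.text)]
for child in review:
    pieces.append(("element", child.tag, child.text))
    pieces.append(("text", child.tail))

for piece in pieces:
    print(piece)
```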

What Is Allowed. . .Declaring Attributes


Attribute-list declarations
Used to declare an element's attributes
Syntax:
Option 1:
<!ATTLIST elementName attributeName attributeType defaultDecl>
Option 2:
<!ATTLIST elementName
          attributeName attributeType defaultDecl
          ...
          attributeName attributeType defaultDecl>

Figure 5-17. What Is Allowed. . .Declaring Attributes

Notes:
The syntax for declaring attributes is <!ATTLIST followed by:
elementName - the name of the element for which the attribute is being declared.
attributeName - the name of the attribute being declared.
attributeType - the data type (see the Attribute Types table).
defaultDecl - the attribute's default behavior.
To declare multiple attributes, you can write multiple ATTLIST declarations or repeat the (attributeName attributeType defaultDecl) part as necessary.

Organizational Note
The next several charts identify possible choices for each syntactical piece on the previous chart.
These charts are followed by examples.
We then continue with the concepts identified in the "What is allowed in a DTD" chart: ENTITY, ENTITIES, NOTATION.
Our intent is to provide you with solid, tested examples you can use on your own projects.
The "XM301Lectures" folder on your desktop contains working examples you can try (or use) on your own.

Figure 5-18. Organizational Note

Notes:

Attribute Types
Attribute Type - Description

String Type
CDATA - Used to declare an attribute whose value may contain arbitrary character data. Whitespace crunching is not done. This is the only attribute type whose values are not restricted to names or name tokens.

Tokenized Type
NMTOKEN - Used to declare an attribute whose value must conform to the definition of a name token (Nmtoken) in XML 1.0.
NMTOKENS - Allows multiple NMTOKENs separated by white space.
ID - Used to declare an attribute whose value must be unique within the XML document.
IDREF, IDREFS - The value of the attribute must refer to an ID value declared elsewhere in the document. IDREFS allows multiple IDREF values separated by whitespace (compare NMTOKENS).
ENTITY - Used to declare an attribute whose value must correspond to the name of a declared ENTITY.
ENTITIES - Allows multiple ENTITY names separated by whitespace.

Enumerated Type
NOTATION - References a <!NOTATION declaration in the DTD.
ENUMERATION - Attributes have a specified list of acceptable NMTOKEN values.

Figure 5-19. Attribute Types

Notes:
CDATA attributes contain character data. Whitespace crunching is not performed. We covered this on previous charts.
The ID data type contains a string value that must be unique within the document. No element type may have more than one ID attribute specified, and a declared ID attribute must be either #IMPLIED or #REQUIRED.
ID-valued attributes can be combined with IDREF- and IDREFS-valued attributes to create cross-references within an XML document. An IDREF must contain a value that is specified in an ID-valued attribute elsewhere in the document. IDREFS is a space-separated list of ID values.
ENTITY and ENTITIES are the name of an entity or a space-separated list of entity names. (More on entities in a moment.)
NMTOKENs are strings composed of the characters that are legal in an XML element name. They are not the same as XML element names, because some characters that are legal as the first character of an NMTOKEN (a digit, for example) are not legal as the first character of an element name.
NMTOKENS are space-separated lists of NMTOKENs. These were introduced earlier.
NOTATION-valued attributes must contain the name of a NOTATION declared elsewhere in the document. (More on NOTATION later.)
Enumerations are lists of NMTOKENs separated by the vertical bar (|) and enclosed in parentheses. An attribute with an enumeration value must contain one of the NMTOKENs in the list.
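The difference between a name and a name token can be sketched with two regular expressions. This Python sketch is deliberately simplified: it covers ASCII characters only, whereas XML 1.0 also allows many non-ASCII letters. The function names are hypothetical.

```python
import re

# ASCII-only approximations of the XML 1.0 Name and Nmtoken productions.
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*")
NMTOKEN = re.compile(r"[A-Za-z0-9._:\-]+")

def is_name(value):
    """True if value could be an element name or an ID/IDREF value."""
    return NAME.fullmatch(value) is not None

def is_nmtoken(value):
    """True if value is a single name token."""
    return NMTOKEN.fullmatch(value) is not None

def is_nmtokens(value):
    """True if value is a whitespace-separated list of name tokens."""
    tokens = value.split()
    return bool(tokens) and all(is_nmtoken(token) for token in tokens)

for candidate in ("e00001", "2004-01", "medium large"):
    print(candidate, is_name(candidate), is_nmtoken(candidate), is_nmtokens(candidate))
```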

Attribute Default Declarations

Default Declaration - Description
#REQUIRED - The attribute must be present.
#IMPLIED - The attribute does not need to be present, and no default value is supplied.
attribute-value - If the attribute is not present, "attribute-value" is supplied as a default value.
#FIXED attribute-value - If the attribute is present, it must have the value "attribute-value".

Figure 5-20. Attribute Default Declarations

Notes:
Every attribute declaration must specify a default declaration. The possible values are:
#REQUIRED: the attribute must occur. Its type may be an enumerated list.
#IMPLIED: the attribute, and therefore its value, can remain unspecified.
#FIXED value: the attribute, when used, has a single (fixed) value. This value must appear immediately after the keyword and be in quotes.
Enumerated list: gives a list of choices in parentheses, separated by the "or" operator (|). A default value (from the enumerated list) may be given after the list and must be in quotes.
If a default value is declared and the attribute is not present, the element is treated as if the attribute were present with the declared default value.

Attribute Default Declaration Examples


Declaration:
<!ELEMENT shirt (#PCDATA)>
<!ATTLIST shirt type CDATA #REQUIRED>
<!ATTLIST shirt collar CDATA #IMPLIED>
<!ATTLIST shirt size (small|medium|large) "large">
<!ATTLIST shirt manufacturer CDATA #FIXED "Levi">

Valid XML:
<shirt type="short">cotton</shirt>
<shirt type="short" size="large">wool</shirt>
<shirt type="short" manufacturer="Levi">denim</shirt>
<shirt type="short sleeve" collar="button-down"></shirt>

Invalid XML:
<shirt></shirt>
<shirt type="short" size="medium large">cardigan</shirt>
<shirt type="short" manufacturer="Gap">designer</shirt>

Figure 5-21. Attribute Default Declaration Examples

Notes:
Here we've declared a few attributes with the various default types. Size has a default value, type is required, collar is implied, and manufacturer is fixed. Let's look at how the examples come out.
For the valid examples:
<shirt type="short"> also picks up the default value "large" for size and the fixed value "Levi" for manufacturer.
<shirt type="short" size="large"> picks up the fixed value "Levi" for manufacturer.
For the invalid examples:
<shirt> is missing the required type attribute.
<shirt type="short" size="medium large"> is invalid because "medium large" isn't in the enumerated value list for size.
<shirt type="short" manufacturer="Gap"> is invalid because "Gap" isn't the fixed value for manufacturer.
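Attribute defaulting can be observed with a parser that reads the internal DTD subset. The Python sketch below uses the standard library's ElementTree, which is built on the non-validating expat parser. Placing the declarations in an internal subset is an assumption made so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE shirt [
<!ELEMENT shirt (#PCDATA)>
<!ATTLIST shirt type CDATA #REQUIRED>
<!ATTLIST shirt collar CDATA #IMPLIED>
<!ATTLIST shirt size (small|medium|large) "large">
<!ATTLIST shirt manufacturer CDATA #FIXED "Levi">
]>
<shirt type="short">cotton</shirt>"""

shirt = ET.fromstring(DOC)

# Only type was written in the document; size and manufacturer are supplied
# from the DTD. collar is #IMPLIED, so no value is supplied for it.
attrs = dict(shirt.attrib)
print(attrs)
```

The parser supplies the defaults but does not validate, so it would not reject manufacturer="Gap".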
Attribute Alternate Declaration


Here is the same information presented using the second form of the attribute declaration statement.
Declaration:
<!ELEMENT shirt (#PCDATA)>
<!ATTLIST shirt
          size (small|medium|large) "large"
          collar CDATA #IMPLIED
          type CDATA #REQUIRED
          manufacturer CDATA #FIXED "Levi">

Figure 5-22. Attribute Alternate Declaration

Notes:

Attribute Types: Tokenized Types: IDREFS Example


Syntax:
<!ATTLIST elementName attributeName IDREF defaultDecl>

Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee serialNumber ID #REQUIRED>
<!ATTLIST employee manager1 IDREF #IMPLIED>
<!ATTLIST employee manager2 IDREFS #IMPLIED>

Valid XML fragment:
<employee serialNumber="e00001">Joe Smith</employee>
<employee serialNumber="e00002">Bill Smith</employee>
<employee serialNumber="e00003" manager1="e00001">John Smith</employee>
<employee serialNumber="e00004" manager1="e00001" manager2="e00002 e00001">John Smith</employee>

Invalid XML fragment:
manager2="e00001 e00005" if e00005 is not the ID of an element within the document.
Figure 5-23. Attribute Types: Tokenized Types: IDREFS Example

Notes:
This foil shows a declaration for an implied attribute of type IDREFS. According to the syntax rules for IDs, an ID value must be a valid XML name, so it cannot begin with a digit. That is why the serialNumber values begin with a letter. Aside from naming rules, manager2 could have any value as long as there is an element whose ID has that value. Consequently, an employee could be self-managed! The uniqueness constraint applies to IDs, not to IDREFs, so the employee could be self-managed twice: both manager1 and manager2 could have the same value.
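The cross-reference rule can be sketched as a small check in Python. This is an illustration, not a validator: the attribute types are hard-coded instead of being read from the DTD, the <staff> wrapper element is an assumption added to give the fragment a root, and the function name dangling_references is hypothetical.

```python
import xml.etree.ElementTree as ET

ID_ATTR = "serialNumber"                 # declared as ID in the DTD
IDREF_ATTRS = ("manager1", "manager2")   # declared as IDREF and IDREFS

DOC = """<staff>
<employee serialNumber="e00001">Joe Smith</employee>
<employee serialNumber="e00002">Bill Smith</employee>
<employee serialNumber="e00003" manager1="e00001">John Smith</employee>
<employee serialNumber="e00004" manager1="e00001" manager2="e00002 e00001">John Smith</employee>
</staff>"""

def dangling_references(root):
    """Return (referring ID, attribute, value) for every reference without a target."""
    ids = {e.get(ID_ATTR) for e in root.iter() if e.get(ID_ATTR) is not None}
    missing = []
    for element in root.iter():
        for name in IDREF_ATTRS:
            # An IDREFS value is a whitespace-separated list of IDREF values.
            for ref in element.get(name, "").split():
                if ref not in ids:
                    missing.append((element.get(ID_ATTR), name, ref))
    return missing

print(dangling_references(ET.fromstring(DOC)))
```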

Attribute Types: Tokenized Types: ENTITY Example


Syntax:
<!ATTLIST elementName attributeName ENTITY defaultDecl>

Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee companyName ENTITY #REQUIRED>
<!ENTITY company SYSTEM "http://www.IBM.com/company.txt" NDATA txt>
<!NOTATION txt SYSTEM "file:///C:/Windows/System32/notepad.exe">

Valid XML fragment:
<employee companyName="company">Joe Smith</employee>

ENTITY is also used in its own right as another element of a DTD; this is covered in subsequent charts. Here we focus on ENTITY as an attribute. NDATA and NOTATION are concepts we have yet to discuss.
The material above is included here to provide an example for future reference.
Figure 5-24. Attribute Types: Tokenized Types: ENTITY Example

Notes:
This foil shows a declaration for a required attribute of type ENTITY. As you can see, there are several concepts involved that we have yet to discuss, not the least of which is "what is an 'entity'?" You will find this and the next chart useful on the job when you need to create or understand a DTD that uses these concepts. The concepts themselves are described on subsequent charts.

Attribute Types: Tokenized Types: ENTITIES Example


Syntax:
<!ATTLIST elementName attribName ENTITIES defaultDecl>

Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee companyAtts ENTITIES #REQUIRED>
<!ENTITY company "IBM">
<!ENTITY division "19">
<!ENTITY branch "http://www.ibm.com/IGS.txt">

Valid XML fragment:
<employee companyAtts="company division branch">Joe Smith</employee>

Figure 5-25. Attribute Types: Tokenized Types: ENTITIES Example

Notes:
ENTITIES provide a mechanism for including data from multiple sources. As with the ENTITY example, several concepts used on this chart have yet to be defined; the explanations follow. You will find this and the previous chart useful on the job when you need to create or understand a DTD that uses these concepts. While DTDs may be lacking in several important aspects (listed later), they can still be very complex! Note that the XML 1.0 validity constraint for ENTITY and ENTITIES attributes requires each name to match an unparsed entity (one declared with NDATA, as in the ENTITY example), so a validating parser would reject the internal entities declared on this chart.

DTDs Part II

But first. . . BREAK ! ! !

Figure 5-26. DTDs Part II

Notes:

Declaring ENTITYs: an Internal, Parsed ENTITYs Example


Syntax:
<!ENTITY entityName "replacementText">

Usage:
&entityName;

Declaration:
<!ENTITY xmlExpert "Ron Smith"> <!ENTITY topic "XML Documents">

Valid XML:
<response>For additional help with &topic;, Please contact &xmlExpert;.</response>

Processed XML:
For additional help with XML Documents, Please contact Ron Smith.

Figure 5-27. Declaring ENTITYs: an Internal, Parsed ENTITYs Example

Notes:
Here is an example. But we just told you that entities are related to separate storage units, and the entity declaration that we just saw fit completely into the DTD. This kind of entity is called an internal entity and is not associated with a separate physical storage unit. Let's look at how to declare the same entity as an external entity, in a separate physical storage unit.
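Internal entity replacement can be tried with the Python standard library, whose ElementTree parser expands internal entities declared in the internal DTD subset. The sketch below wraps the declarations from the chart in a DOCTYPE, which is an assumption made so that the example is a complete document.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE response [
<!ENTITY xmlExpert "Ron Smith">
<!ENTITY topic "XML Documents">
]>
<response>For additional help with &topic;, Please contact &xmlExpert;.</response>"""

response = ET.fromstring(DOC)

# By the time the application sees the text, the references are already replaced.
print(response.text)
```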

Declaring ENTITYs: an External, Parsed ENTITYs Example


Syntax:
<!ENTITY entityName SYSTEM "systemURI">
<!ENTITY entityName PUBLIC "publicURI" "systemURI">*
*Refer to the Notes.

Declaration:
<!ENTITY copyrightInfo SYSTEM "file:///c:/legal/boilerplate.txt">

boilerplate.txt file:
Copyright 2003, IBM. All rights reserved.

Valid XML:
<notices>This application was developed using WebSphere Studio. &copyrightInfo;</notices>

Processed XML:
This application was developed using WebSphere Studio. Copyright 2003, IBM. All rights reserved.
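Many parsers do not fetch external entities unless the application asks for it. The Python sketch below uses the standard library's expat binding and supplies the entity text through an external entity handler. It is an illustration under stated assumptions: the STORAGE dictionary stands in for the file system, the DOCTYPE wrapper is added to make a complete document, and the handler name load_external_entity is hypothetical.

```python
from xml.parsers import expat

# Stand-in for the file system: system identifier -> file contents (an assumption).
STORAGE = {
    "file:///c:/legal/boilerplate.txt": "Copyright 2003, IBM. All rights reserved.",
}

DOC = """<!DOCTYPE notices [
<!ENTITY copyrightInfo SYSTEM "file:///c:/legal/boilerplate.txt">
]>
<notices>This application was developed using WebSphere Studio. &copyrightInfo;</notices>"""

chunks = []
parser = expat.ParserCreate()
parser.CharacterDataHandler = chunks.append

def load_external_entity(context, base, system_id, public_id):
    # Parse the entity's replacement text with a child parser so that its
    # character data is reported in place of the reference.
    child = parser.ExternalEntityParserCreate(context)
    child.CharacterDataHandler = chunks.append
    child.Parse(STORAGE[system_id], True)
    return 1  # a non-zero result tells expat the entity was handled

parser.ExternalEntityRefHandler = load_external_entity
parser.Parse(DOC, True)

processed = "".join(chunks)
print(processed)
```

Resolving system identifiers from a fixed table, as here, also avoids fetching arbitrary URIs named by an untrusted document.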
Figure 5-28. Declaring ENTITYs: an External, Parsed ENTITYs Example

Notes:
In the case where the entity defines a public URI, the parser must understand how to handle the "publicURI" identifier. This is traditionally only used when the parser provided was hard-coded to handle it, or if you will be creating your own parser to handle entity replacement.
According to 4.2.2 (External Entities) of the XML 1.0 specification: "Definition: In addition to a system identifier, an external identifier may [emphasis added] include a public identifier. An XML processor attempting to retrieve the entity's content may [emphasis added] use the public identifier to try to generate an alternative URI reference. If the processor is unable to do so, it must [emphasis added] use the URI reference specified in the system literal...."
Here is their example:
<!ENTITY open-hatch SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY open-hatch PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN" "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY hatch-pic SYSTEM "../grafix/OpenHatch.gif" NDATA gif >
Find out more at: http://www.w3.org/TR/2003/PER-xml-20031030/
Be aware that an external entity may not recursively reference itself, either directly or indirectly. More examples follow.

Unparsed Entity Declarations: a Review


Syntax:
<!ENTITY entityName SYSTEM "URI" NDATA notationName>

Declaration:
<!NOTATION jpeg SYSTEM "file:///c:/Program Files/Photoshop/photoshop.exe">
<!ENTITY prod17792 SYSTEM "prod17792.jpg" NDATA jpeg>
<!ELEMENT item EMPTY>
<!ATTLIST item picture ENTITY #REQUIRED>

Valid XML:
<item picture='prod17792'/>

Rules:
Unparsed entities can only be external entities.
To declare an unparsed entity, start with a regular external entity declaration and, before the closing angle bracket, insert NDATA and the name of a notation. This associates a notation name with the unparsed entity.
To reference an unparsed entity, use its name in an ENTITY or ENTITIES valued attribute. You cannot reference an unparsed entity by &name;.
Figure 5-29. Unparsed Entity Declarations: a Review

Notes:
Here's an example of unparsed entity use: First we declare a notation called jpeg and associate it with a photoshop.exe somewhere on the local machine. Then we declare an external unparsed entity called prod17792 and add the NDATA jpeg clause to specify the notation. The rest of the DTD declares an empty element item with an ENTITY valued attribute called picture. You can see in the XML instance document that we supply prod17792 (the name of the entity) as the value of the picture attribute of item. This is how you can associate a piece of unparsed/binary data with a portion of an XML document.
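The parser does not read the unparsed data itself; it only reports the declarations, and the application decides what to do with them. The Python sketch below shows that hand-off using the standard library's expat binding. The DOCTYPE wrapper and the handler and dictionary names are assumptions made for illustration.

```python
from xml.parsers import expat

DOC = """<!DOCTYPE item [
<!NOTATION jpeg SYSTEM "file:///c:/Program Files/Photoshop/photoshop.exe">
<!ENTITY prod17792 SYSTEM "prod17792.jpg" NDATA jpeg>
<!ELEMENT item EMPTY>
<!ATTLIST item picture ENTITY #REQUIRED>
]>
<item picture='prod17792'/>"""

notations = {}   # notation name -> system identifier
unparsed = {}    # entity name -> (system identifier, notation name)
pictures = []    # entity names used in picture attributes

def on_notation(name, base, system_id, public_id):
    notations[name] = system_id

def on_entity(name, is_parameter_entity, value, base, system_id, public_id, notation_name):
    # An unparsed entity is the only kind declared with a notation name.
    if notation_name is not None:
        unparsed[name] = (system_id, notation_name)

def on_start(tag, attrs):
    if tag == "item":
        pictures.append(attrs["picture"])

parser = expat.ParserCreate()
parser.NotationDeclHandler = on_notation
parser.EntityDeclHandler = on_entity
parser.StartElementHandler = on_start
parser.Parse(DOC, True)

# The application follows the chain: attribute value -> entity -> notation.
for entity_name in pictures:
    system_id, notation_name = unparsed[entity_name]
    print(entity_name, system_id, notation_name, notations[notation_name])
```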

Parameter ENTITYs
Parameter entities:
Can only be used in the DTD
Allow reuse of attribute lists and complex type definitions
Syntax:
<!ENTITY % parameterEntityName "replacementText">

Usage:
%parameterEntityName;

Declaration:
<!ENTITY % commonAtts "make CDATA #IMPLIED
                       model CDATA #IMPLIED">
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone %commonAtts;
                type (rotary | touch-tone) #IMPLIED>

Processed DTD:
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone make CDATA #IMPLIED
                model CDATA #IMPLIED
                type (rotary | touch-tone) #IMPLIED>
Figure 5-30. Parameter ENTITYs

Notes:
The parameter entity replacement works like regular entity replacement. The parser will substitute the replacement text, and then continue evaluating the DTD from the point of replacement. Parameter entities are entities that are meant to be used in the DTD. Parameter entities are very useful if you want to reuse portions of an attribute list declaration or if you want to reuse parts of a complex content model specification. Parameter entities are the primary tool that is available to help you structure a complex DTD.
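The substitution step can be imitated with plain text replacement. The Python sketch below is a toy illustration of the idea, not a DTD parser: it handles only double-quoted replacement text and a single level of references, and the function name expand_parameter_entities is hypothetical.

```python
import re

DTD = '''<!ENTITY % commonAtts "make CDATA #IMPLIED
                       model CDATA #IMPLIED">
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone %commonAtts;
                type (rotary | touch-tone) #IMPLIED>'''

def expand_parameter_entities(dtd_text):
    """Substitute %name; references with their declared replacement text."""
    entities = {}

    def record(match):
        # Keep the first declaration of a name and drop the declaration itself.
        entities.setdefault(match.group(1), match.group(2))
        return ""

    body = re.sub(r'<!ENTITY\s+%\s+([\w.\-]+)\s+"([^"]*)"\s*>', record, dtd_text)
    return re.sub(r"%([\w.\-]+);", lambda m: entities[m.group(1)], body)

processed = expand_parameter_entities(DTD).strip()
print(processed)
```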

Parameter ENTITYs - Another Example


<!ENTITY % commonAtts "typeID ID #REQUIRED
                       make CDATA #IMPLIED
                       model CDATA #IMPLIED">
<!ELEMENT car (#PCDATA)>
<!ATTLIST car %commonAtts;>
<!ELEMENT computer (#PCDATA)>
<!ATTLIST computer %commonAtts;>
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone %commonAtts;
                type (rotary|digital) #IMPLIED>

Figure 5-31. Parameter ENTITYs - Another Example

Notes:
In this example the commonAtts parameter entity is used to represent common attributes for the three different elements: car, computer and phone.

What Is Allowed. . . Declaring Comments


Use comments to clarify the semantics of elements and attributes for those who are using the DTD to define conforming XML documents.
Syntax:
<!-- This is a comment -->

Whitespace may be used to format the comment:
<!--
  This is also a comment
-->

Comments may not contain the character sequence "--". Therefore, they may not be nested.

Figure 5-32. What Is Allowed. . . Declaring Comments

Notes:
To insert a comment in a DTD (or an XML document for that matter), place the comment text inside <!-- and -->. Comments cannot be nested. A space after <!-- and before --> is not required by XML, but it improves readability. The characters "--" may not be used within the comment. This form of declaration is also usable within HTML, XML and XSL documents.

Joining a DTD to an XML Instance


Three ways to inform an XML instance that there is an associated DTD:
1. Embed the DTD content inside the XML instance.
2. Provide the URI where the DTD file resides.
3. Use a combination of 1 and 2.
Best practice: if the DTD will override one or more attribute values (not advised), set the 'standalone' attribute in the XML declaration to 'no' as a warning to users that they need to be aware.
Include a comment in the XML for each attribute whose value may be changed by the DTD file.
If the DTD file is large, include a comment near the beginning for each element that overrides a value in the associated XML instance.

Figure 5-33. Joining a DTD to an XML Instance

Notes:
Overriding (changing) the data contained in an XML instance may cause confusion for other users of the instance. The application of an XSL transform or a processor program (for example, DOM, SAX, or similar) may be a better alternative.

External DTD Subset


DTD and XML as separate files:
Filename: hello.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE message SYSTEM "message.dtd">
<message>
  <greeting>Hello, World!</greeting>
  <farewell>Goodbye, World!</farewell>
</message>

Filename: message.dtd
<!ELEMENT message (greeting,farewell)>
<!ELEMENT greeting (#PCDATA)>
<!ELEMENT farewell (#PCDATA)>

Figure 5-34. External DTD Subset

Notes:
Up until now we've described some of the contents of a DTD without showing how to actually place those declarations in a file so that they can be used to validate a document. Recall that the DTD may be in an external file, embedded directly in an XML file, or split across an external file and the XML file. Let's look at placing the DTD declarations in an external file.
The part of the DTD that goes into the external file is called the external DTD subset. The external DTD subset is an entity even though DTD declarations are not elements. Therefore you should supply a text declaration at the beginning of the external DTD subset. This is especially important if the document and the DTD are going to be using different character encodings.
In the example above, the file message.dtd contains the declarations of three elements: message, greeting, and farewell. The DTD may have its own encoding declaration (which may be different from the encoding of documents that reference the DTD file). The file hello.xml references this DTD using a DOCTYPE declaration. The DOCTYPE declaration specifies the name of the root element of the document, message in the example. Note that the DOCTYPE declaration is what specifies the root element, not the DTD subset. This means that potentially any element declared in the DTD can serve as the root element. It is up to the DOCTYPE writer to specify this.
Following the name of the root element is the keyword SYSTEM followed by a URI reference that the local machine can use to locate the actual file containing the external DTD. Using an external file allows you to easily use the same DTD to validate many documents.

Internal DTD Subset


DTD and XML as a combined file:
Filename: hello.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE message [
  <!ELEMENT message (greeting,farewell)>
  <!ELEMENT greeting (#PCDATA)>
  <!ELEMENT farewell (#PCDATA)>
]>
<message>
  <greeting>Hello, World!</greeting>
  <farewell>Goodbye, World!</farewell>
</message>

Figure 5-35. Internal DTD Subset

Notes:
Here the same three element declarations (message, greeting and farewell) are embedded directly in hello.xml. The part of the DTD that appears between the square brackets of the DOCTYPE declaration is called the internal DTD subset. The DOCTYPE declaration still specifies the name of the root element of the document, message in the example, but no SYSTEM keyword or URI reference is needed because no external file is referenced. Embedding the DTD keeps the document self-contained, but the declarations cannot then be shared to validate many documents.

Split DTD Subsets


Embedding DOCTYPE declarations and the DTD within the XML file:
Filename: hello.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE message SYSTEM "message.dtd" [
  <!ENTITY destination "cruel world">
  <!-- overrides destination in message.dtd -->
]>
<message>
  <greeting>Hello, &destination;</greeting>
  <farewell>Goodbye, &destination;</farewell>
</message>

Filename: message.dtd
<!ELEMENT message (greeting, farewell)>
<!ELEMENT greeting (#PCDATA)>
<!ELEMENT farewell (#PCDATA)>
<!ENTITY destination "World">
Figure 5-36. Split DTD Subsets

Notes:
The example on this foil shows a DTD with an entity called destination in both the internal and external subsets. The internal subset is processed before the external subset, and the first declaration of an entity is the one that counts, so the declaration for destination in the internal subset overrides the declaration in the external subset. After entity expansion, the messages read "Hello, cruel world" and "Goodbye, cruel world". This allows local entity declarations in the internal subset to override entity declarations in the external subset. A best practice is to include a comment drawing attention to the intent of this internal subset to override a value set in the external subset.
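The "first declaration wins" rule behind this override can be observed with the Python standard library. ElementTree does not load the external subset, so the sketch below makes an assumption: the declaration from message.dtd is placed after the local one inside a single internal subset, which mimics the order in which a parser would encounter the two declarations.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE message [
<!ENTITY destination "cruel world">
<!ENTITY destination "World">
]>
<message><greeting>Hello, &destination;</greeting><farewell>Goodbye, &destination;</farewell></message>"""

message = ET.fromstring(DOC)

# The second declaration of destination is ignored; the first one is binding.
greeting = message.findtext("greeting")
farewell = message.findtext("farewell")
print(greeting)
print(farewell)
```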

Whitespace and DTDs


Whitespace is white space, isn't it? Not if you are a validating XML processor.
Whitespace in #PCDATA element content (between the same start and end tag pair)
  You only know this if you have a DTD
Whitespace in non-character data content
  Whitespace not in #PCDATA element content is ignorable.

Figure 5-37. Whitespace and DTDs

Notes:
Whitespace is white space, isn't it? Not if you are a validating XML processor. There are two kinds of white space:
Whitespace in #PCDATA element content (between the same start and end tag pair). You only know this if you have a DTD.
Whitespace in non-character data content. Whitespace not in #PCDATA element content is ignorable.
Parsers report whitespace and ignorable whitespace differently. The parser does not actually discard the ignorable white space; this is the application's job. But the parser can use different data structures or callback routines to report ignorable versus non-ignorable whitespace.

Ignorable Whitespace Example


<?xml version='1.0'?>
<!DOCTYPE example [
<!ELEMENT example (source-code)>
<!ELEMENT source-code (#PCDATA)>
]>
<example>              <-- ignorable
  <source-code>        <-- not ignorable
    int i;             <-- not ignorable
    i = 0;             <-- not ignorable
  </source-code>       <-- ignorable
</example>

Figure 5-38. Ignorable Whitespace Example

Notes:
This slide shows an example XML document and DTD, and shows which whitespace is ignorable and which whitespace is not. Again, it is up to the application to decide what to do about ignorable whitespace. An XML processor will report all of the whitespace and indicate whether or not it is ignorable.
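The classification can be sketched in a few lines of Python. The standard library parser used here is non-validating and does not flag ignorable whitespace itself, so this sketch makes an assumption: the set of elements declared with element-only content is written out by hand from the DTD. The function name classify_whitespace is hypothetical.

```python
import xml.etree.ElementTree as ET

# Elements whose DTD content model contains only child elements (from the
# DTD above: example). source-code is declared (#PCDATA), so it is not listed.
ELEMENT_CONTENT = {"example"}

DOC = """<example>
  <source-code>
    int i;
    i = 0;
  </source-code>
</example>"""

def classify_whitespace(root):
    """Split character data into ignorable whitespace and significant text."""
    ignorable, significant = [], []
    for element in root.iter():
        # Text directly inside element: before its first child, then after each child.
        chunks = [element.text] + [child.tail for child in element]
        for chunk in chunks:
            if chunk is None:
                continue
            if element.tag in ELEMENT_CONTENT and chunk.isspace():
                ignorable.append((element.tag, chunk))
            else:
                significant.append((element.tag, chunk))
    return ignorable, significant

ignorable, significant = classify_whitespace(ET.fromstring(DOC))
print(ignorable)
print(significant)
```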


Validating versus Non-validating Processors


Validating processors:
  Validate an XML document using the DTD.
  Report validity errors.
Non-validating processors:
  Some behavior is up to implementors. Parsers have options.
  They check the document entity, including the internal subset.
  They report well-formedness errors.
  If there is an external DTD subset, they may or may not:
    Normalize attribute values.
    Replace internal entity text.
    Supply attribute defaults.

Figure 5-39. Validating versus Non-validating Processors

Notes:
Validating processors are straightforward. The XML specification tells implementors exactly what a validating processor must do (in fact, it must do everything listed here). Non-validating processors have options, because the XML specification says that a non-validating processor may do certain things but is not required to do them. Unfortunately, parser implementors have chosen different subsets of these optional items, so non-validating parsers behave slightly differently from one another. A non-validating processor must check the document entity, including the internal subset. If there is an external DTD subset, it may or may not:
normalize attribute values using declarations from the external subset
replace internal entity text declared in the external subset
supply attribute defaults declared in the external subset
Since this behavior is up to implementors, you need to be careful when working with a non-validating processor if you have complicated attribute values or use entities.
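As a small illustration of the notes above, the following sketch uses Python's ElementTree, which is built on the non-validating Expat parser. The pet document, the nickname element and the owner entity are made up for this example. The declarations live in the internal subset, so the parser reads them; whether the same would happen for an external subset depends on the parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the DTD allows only a <name> child,
# but the instance uses an undeclared <nickname> element.
DOC = """<!DOCTYPE pet [
  <!ELEMENT pet (name)>
  <!ELEMENT name (#PCDATA)>
  <!ATTLIST pet type CDATA "dog">
  <!ENTITY owner "Iman Expert">
]>
<pet><nickname>&owner;</nickname></pet>"""

# No validity error is raised: the parser checks well-formedness only.
pet = ET.fromstring(DOC)

default_type = pet.get("type")   # attribute default supplied from the ATTLIST
child = pet[0]                   # the undeclared element is accepted
owner_text = child.text          # internal entity text has been replaced
```

The document is invalid against its own DTD, yet it parses cleanly, the type attribute receives its declared default, and the entity reference is expanded.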

Example DTDs
W3C XHTML
cXML: B2B between procurement applications, e-commerce hubs and suppliers.
RosettaNet: Business processes between trading partners and properties for defining products.
RDF Site Summary (RSS): Syndicating news articles.
DocBook: Production of documentation which can be rendered into multiple output formats.
Open Financial Exchange (OFX): Electronic exchange of financial data.

Figure 5-40. Example DTDs

Notes:
Many organizations are producing DTDs for various applications. Here are some examples:
cXML - http://www.cxml.org - cXML is a streamlined protocol intended for consistent communication of business documents between procurement applications, e-commerce hubs and suppliers. The current standard includes documents for setup (company details and transaction profiles), catalogue content, application integration (including the widely-used PunchOut feature), original, change and delete purchase orders and responses to all of these requests, as well as new order confirmation and ship notice documents (cXML analogues of EDI 855 and 856 transactions).
RosettaNet - http://www.rosettanet.org - RosettaNet Partner Interface Processes (PIPs) define business processes between trading partners. RosettaNet dictionaries provide a common set of properties for PIPs. The RosettaNet Business Dictionary designates the properties used in basic business activities. RosettaNet Technical Dictionaries provide properties for defining products. Product and partner codes in RosettaNet standards expedite the alignment of business processes between trading partners.
RSS - http://www.purl.org/rss/1.0 - The RDF Site Summary format was originally developed by Netscape and is widely used across the World Wide Web for the purpose of syndicating news articles.
DocBook - http://www.docbook.org - DocBook is an XML version of the SGML DocBook DTDs that are widely used in the production of documentation which can be rendered into multiple output formats.
OFX - http://www.ofx.net - Open Financial Exchange is a unified specification for the electronic exchange of financial data between financial institutions, business and consumers via the Internet. Created by CheckFree, Intuit and Microsoft in early 1997, Open Financial Exchange supports a wide range of financial activities including consumer and small business banking; consumer and small business bill payment; bill presentment and investments, including stocks, bonds and mutual funds.


What's Wrong with DTDs?


No type support.
  #PCDATA can be any string of characters (except tags):
  <!ELEMENT zip (#PCDATA)>
DTD syntax is different from XML syntax.
There are some constraints DTDs cannot easily express:
  Element x can occur from 4 to 17 times
Governs the grammar of the entire document.
  Can't localize grammars to specific fragments.
XML Schema addresses many of the limitations of DTDs.
  XML Schema is now a W3C recommendation.
  Support for W3C Schema is new.
  Features include: XML syntax, strong typing, constraints

Figure 5-41. What's Wrong with DTDs?

Notes:
There are a number of problems with DTDs, which are listed on the chart. These problems have led to the creation of several alternative languages for defining the structure of XML grammars. The two leading contenders are the W3C's XML Schema and OASIS's RELAX NG.
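Without a schema language that supports types and occurrence ranges, checks like the two on the chart end up in application code. The sketch below shows what that might look like; the order and item element names, the function name, and the rule that a zip is exactly five digits are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET


def check_order(xml_text, min_items=4, max_items=17):
    """Application-level checks that a DTD cannot easily express."""
    order = ET.fromstring(xml_text)
    problems = []

    # Occurrence range: a DTD offers ?, * and +, but not "4 to 17 times".
    items = order.findall("item")
    if not min_items <= len(items) <= max_items:
        problems.append(
            f"expected {min_items}-{max_items} item elements, found {len(items)}"
        )

    # Typing: to a DTD, <zip> is #PCDATA, so any string would be accepted.
    zip_code = (order.findtext("zip") or "").strip()
    if not (zip_code.isdigit() and len(zip_code) == 5):
        problems.append(f"zip is not a 5-digit number: {zip_code!r}")

    return problems


bad = check_order("<order><zip>abcde</zip><item/></order>")
good = check_order("<order><zip>12345</zip><item/><item/><item/><item/></order>")
```

Both documents could be valid against a DTD that declares zip as #PCDATA and item as repeatable; only the application-level check tells them apart.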


Status of DTDs
Part of XML 1.0
Widely adopted
XML processors vary in how they interpret "non-validating"

Figure 5-42. Status of DTDs

Notes:
DTDs are part of the XML 1.0 recommendation. They are a stable technology and widely adopted. As we noted earlier, XML processors vary in behavior because the specification leaves part of what a non-validating processor does up to the implementor. Most XML parsers available today can use DTDs to validate documents. XML Schema is the W3C-approved replacement for DTDs, but it is a new technology and has not reached broad usage at the time of this writing.


Tooling
Can use any text editor
  As long as the editor supports Unicode or the chosen encoding.
WebSphere Studio Application Developer
  Provides guided editing for DTDs and documents that reference them
  Can generate a DTD from sample XML.
    Write sample XML that illustrates all the ways you'll use the data
  Supports document validation
Free IBM alphaWorks tools to help you
  http://www.alphaworks.ibm.com/tech/xmlsqc
Many validating parsers:
  Apache's Xerces for Java, C++, Perl
  JAXP, Java API for XML Processing

Figure 5-43. Tooling

Notes:
The tooling for DTDs is pretty simple at the base. You can use the same editor that you use to edit an XML file to edit a DTD; both are plain text. There are also many tools for working with DTDs. IBM's alphaWorks has a number of useful tools. The commercially available XML Spy is a popular graphical tool for working with XML, DTDs and XML Schema. There are many parsers that perform validation using a DTD, including the XML parsers available from the Apache Software Foundation.


Checkpoint Questions (1 of 2)
1. Which DTD entry correctly depicts a phone number with an optional area code?
   a. <!ELEMENT phone ((areaCode)*, prefix, body)>
   b. <!ELEMENT phone (areaCode?, prefix, body )>
   c. <!ELEMENT phone?(areaCode, prefix, body )>
   d. <!ELEMENT phone (areaCode, (prefix, body)+)>
2. Which of the following is a limitation of DTDs?
   a. Non-XML syntax.
   b. Does not easily allow a range of occurrences (for example, 5 to 1000 elements).
   c. Does not provide proper typing of values (for example, integer versus string).
   d. Does not permit parameter entity references.
   e. All of the above.

Figure 5-44. Checkpoint Questions (1 of 2)

Checkpoint Questions (2 of 2)
3. Which DTD entry correctly depicts an optional attribute named type for a pet element, that defaults to the value "dog"?
   a. <!ATTLIST pet type CDATA #IMPLIED>
   b. <!ATTLIST type dog CDATA #FIXED "dog">
   c. <!ATTLIST pet type CDATA "dog">
   d. <!ATTLIST pet (dog)? CDATA #REQUIRED>

Figure 5-45. Checkpoint Questions (2 of 2)

Unit Summary
In this section you have learned:
XML 1.0 DTDs
Element declarations
Attribute declarations
Entity declarations
  General
  Parameter
Notation declarations
Comments
The difference between validating and non-validating processors
Example DTDs
Best Practices

Figure 5-46. Unit Summary

Notes:
In this section you have learned about XML 1.0 DTDs: element declarations, attribute declarations, comments, entity declarations (general and parameter), and notation declarations. You have also learned the difference between validating and non-validating processors, seen example DTDs, and reviewed best practices.

Unit 6. XML Namespaces


What This Unit is About
This unit describes the XML Namespaces Facility.

What You Should Be Able to Do


After completing this unit, you should be able to:
Describe the reasons for using namespaces
Describe the syntax used in namespaces
Define and illustrate an example using namespaces
Define myths about namespaces
Define problems with namespaces
List and define the best practices to use when using namespaces
Describe the status of namespaces in the industry

How You Will Check Your Progress


Accountability: Checkpoint Machine exercises


Unit Objectives
After completing this unit, you should be able to:
Describe the reasons for using namespaces
Describe the syntax used in namespaces
Define and illustrate an example using namespaces
Define problems with namespaces
List and define the best practices to use when using namespaces
Describe the status of namespaces in the industry

Figure 6-1. Unit Objectives

Problem: Element and Attribute Names Can Be Ambiguous


Consider the following XML document:
<catalogEntry>
  <book>
    <title>this book</title>
    <isbn>0001</isbn>
    <author>
      <title>Dr.</title>
      <lastName>Expert</lastName>
      <firstName>Iman</firstName>
    </author>
  </book>
</catalogEntry>

How does an application know that: The first occurrence of title is a book title. The second occurrence of title is a person's title. Need a way to eliminate the ambiguity for the purpose of processing.
Figure 6-2. Problem: Element and Attribute Names can be Ambiguous

Notes:
The double use of title in the example illustrates the need for a namespace solution in XML. We need to be able to tell that the two title elements in this document are not the same element. Even though the elements have the same name, they have different meanings to the application. Using the context to disambiguate the two uses is not a generally applicable solution.
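The ambiguity is easy to reproduce. In this sketch, which uses Python's ElementTree purely as an illustration, a search by name returns both title elements, and the only way to tell them apart is to hard-code the path to each one:

```python
import xml.etree.ElementTree as ET

DOC = """<catalogEntry>
  <book>
    <title>this book</title>
    <isbn>0001</isbn>
    <author>
      <title>Dr.</title>
      <lastName>Expert</lastName>
      <firstName>Iman</firstName>
    </author>
  </book>
</catalogEntry>"""

root = ET.fromstring(DOC)

# Searching by name alone cannot distinguish the two meanings of "title".
all_titles = [t.text for t in root.iter("title")]

# Disambiguating by context works only if the application knows this exact layout.
book_title = root.findtext("book/title")
courtesy_title = root.findtext("book/author/title")
```

The path-based lookups break as soon as the same elements appear in a document with a different structure, which is why context is not a generally applicable solution.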


Elaboration
Some possibilities:
Adopt industry standard document formats and naming conventions
  This approach works at the document level; a good example is ebXML, refer to http://www.openapplications.org
  Problems:
    No industry is an island; industries interact: who decides?
    Naming standards down to the element/attribute level are too brittle
Use verbose element names, for example, bookTitle, courtesyTitle
  Problem: naming becomes fundamentally difficult; there is no way to know if a name is already in use. Further, the data and/or its model may not belong to the consuming application.
Solution
  Use some name qualifier that is already established as unique, for example, a domain-name-qualified URI (uniform resource identifier).
  Domain names are already managed and maintained as unique.
  This approach was developed into XML Namespaces.

Figure 6-3. Elaboration

Notes:
URIs are not actually used for lookup, only as identifiers. Their only purpose is to give the namespace a unique name. Sometimes the URI points to a web page that provides information about the namespace, but this is not required. The URI is not looked up as part of XML parsing or processing. The application is responsible for deciding what to do with the names.


Namespaces: The Big Idea


In concept, each element name and attribute name could be expressed as URI+name. For example, <title> might become:
  <http://www.library.com/books:title>
There are two problems with this format:
  1. It is not well-formed XML under the 1.0 specification.
  2. It is a lot of typing.
If it were possible to create a synonym for the URI and replace occurrences of the URI with that synonym, the amount of typing would be reduced and, if handled correctly, the result would be compatible with XML 1.0.
For example, specify books="http://www.library.com/books", and code the element as <books:title>.
This concept forms the basis of the XML Namespace specification.

These URI qualifiers are called Namespaces


Figure 6-4. Namespaces: The Big Idea

Notes:
Recall that URI stands for Uniform Resource Identifier.


XML Namespaces
For the purposes of XML namespaces, URIs are considered identical when they match character for character.
If URIs are different, they represent different namespaces.
Note: There is no network lookup associated with the use of URIs in this specification; it is a lexical convention only. URIs are not checked by the processor to ensure they exist.
The Namespace specification deals with the mechanics of associating a URI qualifier (that is, a namespace) with element and attribute names to create two-part names that are unique and free of ambiguity.

The Namespace specification refers to these two-part names as Qualified Names or QNames
Figure 6-5. XML Namespaces

Notes:
As noted for the previous slides, namespace URIs are not used for lookup, only as identifiers. Their only purpose is to give the namespace a unique name. Sometimes the URI points to a web page that provides information about the namespace, but this is not required. The URI is not looked up as part of XML parsing or processing. The application is responsible for deciding what to do with the names.
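The character-for-character rule can be seen in a short Python sketch. The two variant URIs (different capitalization, trailing slash) are made up for this example, and the helper name is an assumption; no network access takes place when the documents are parsed.

```python
import xml.etree.ElementTree as ET


def expanded_name(xml_text):
    """Return the root element's name with its namespace URI attached."""
    return ET.fromstring(xml_text).tag


a = expanded_name("<b:title xmlns:b='http://www.library.com/books'/>")
b = expanded_name("<b:title xmlns:b='http://www.library.com/Books'/>")   # capital B
c = expanded_name("<b:title xmlns:b='http://www.library.com/books/'/>")  # trailing slash
```

A browser might treat some of these URIs as pointing to the same resource, but as namespace names they are three different strings and therefore three different namespaces.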


Qualified Names (QNames)


QNames are used in place of element and attribute names.
QNames have a prefix and a local part. They look like this: prefix:localPart

  <books:title>
  prefix = books, local part = title

At all times, the prefix should be thought of as shorthand for the actual URI/namespace. That is, the above is really <http://www.library.com/books:title>

Figure 6-6. Qualified Names (QNames)

Notes:
You can think of a QName like <books:title> as being equivalent to the following Clark Notation: {http://www.library.com/books}title
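Python's ElementTree happens to expose element names in exactly this Clark Notation, which makes it a convenient way to see the equivalence. The sketch below is illustrative only; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

title = ET.fromstring(
    "<books:title xmlns:books='http://www.library.com/books'>Tom Sawyer</books:title>"
)

# The prefix is gone; the name is reported as {URI}localPart.
clark = title.tag

# Split the Clark name back into its two parts.
uri, local_part = clark[1:].split("}", 1)

# ET.QName builds the same form from a URI and a local part.
rebuilt = ET.QName("http://www.library.com/books", "title").text
```

The prefix books does not appear anywhere in the result, which matches the idea that the prefix is only shorthand for the URI.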


Declaring Namespaces (1 of 2)
The syntax of a namespace declaration is:

<prefix:elementName xmlns:prefix='URI'/>
The following example declares the namespace http://www.library.com/books, assigns it a prefix of 'books' and identifies the book element as a member of that namespace.

<books:book xmlns:books='http://www.library.com/books'/>
Attributes may also be assigned to a namespace. As with elements, attributes are prefixed as follows:

<books:book xmlns:books='http://www.library.com/books' books:hardcover='true'/>


Attributes are not automatically in a namespace
Figure 6-7. Declaring Namespaces (1 of 2)

Notes:
Note that you can declare a namespace on any element that you like, not just the root element.


Declaring Namespaces (2 of 2)
Suppose a document without namespaces looked like:
<book hardcover='true'>
  <title>Tom Sawyer</title>
</book>

One way to use a namespace is:

<books:book xmlns:books='http://www.library.com/books'
            books:hardcover='true'>
  <books:title xmlns:books='http://www.library.com/books'>
    Tom Sawyer
  </books:title>
</books:book>

Note that a prefix can be re-used; it will be redefined if the second URI is different.

It is clear that declaring the namespace on every single element becomes unwieldy (and error prone).

Figure 6-8. Declaring Namespaces (2 of 2)

Notes:
Now let's look at an example with nested elements.


Namespace Scope
When a namespace prefix is declared, it remains in scope for:
  Attributes of the element where it is declared.
  Child elements (and their attributes) of the element where it is declared.
  Unless the prefix is redefined on a nested element.
QNames are still required; the namespace is not assumed.
Applying this technique, the previous example becomes:
<books:book xmlns:books='http://www.library.com/books'
            books:hardcover='true'>
  <books:title>
    Tom Sawyer
  </books:title>
</books:book>

Figure 6-9. Namespace Scope

Notes:
Note that every element or attribute name that is in the namespace has the appropriate namespace prefix in front of it.
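The scoping rules, including redefinition of a prefix on a nested element, can be checked with a short sketch. The review and stars elements and the http://www.example.com/reviews URI are invented for this example; they are not part of the course's book document.

```python
import xml.etree.ElementTree as ET

DOC = """<books:book xmlns:books='http://www.library.com/books' books:hardcover='true'>
  <books:title>Tom Sawyer</books:title>
  <books:review xmlns:books='http://www.example.com/reviews'>
    <books:stars>5</books:stars>
  </books:review>
  <books:isbn>0140390839</books:isbn>
</books:book>"""

# Element names in document order, each with the URI its prefix resolved to.
names = [element.tag for element in ET.fromstring(DOC).iter()]
```

Inside review the same prefix resolves to the redefined URI, and after review ends the outer binding applies again, as the isbn element shows. Redefining a prefix like this is legal but confusing, which is one reason to collect declarations in one place.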


Default Namespaces
For situations where a majority of elements are associated with the same Namespace, a default namespace may be declared.

Syntax: <elementName xmlns='URI'/>


QNames are used to identify nested elements that are from a different namespace.
The default may be respecified for each element scope.
  Nesting is respected; that is, respecification does not influence the outer scope containing the nested elements.
Default namespaces don't apply to attributes.
  Attributes have no namespace unless their names are qualified.

Figure 6-10. Default Namespaces

Notes:
Once you have specified the default namespace, all unprefixed elements in the scope of the default declaration are assumed to be in the namespace specified as the default. It is very important to note that default namespace declarations only apply to element names, not attribute names. In our example, we set the books namespace to the default and get rid of all the prefixes on element names. We still need the prefix on the attribute names because default namespaces don't apply to attributes.


Example - Default Namespaces


<book xmlns='http://www.library.com/books'
      xmlns:books='http://www.library.com/books'
      books:hardcover='true'>
  <title>Tom Sawyer</title>
</book>

Attribute requires a QName despite the default namespace.

Figure 6-11. Example - Default Namespaces

Notes:
The result of these apparent duplications is to put the hardcover attribute inside a namespace.
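To see why the second declaration is needed, compare what a namespace-aware parser reports with and without the prefix on the attribute. This is a sketch using Python's ElementTree; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

BOOKS = "http://www.library.com/books"

# Default namespace only: the attribute name is left unqualified.
unqualified = ET.fromstring(
    f"<book xmlns='{BOOKS}' hardcover='true'><title>Tom Sawyer</title></book>"
)

# Default namespace plus a prefix for the same URI, used on the attribute.
qualified = ET.fromstring(
    f"<book xmlns='{BOOKS}' xmlns:books='{BOOKS}' books:hardcover='true'>"
    "<title>Tom Sawyer</title></book>"
)

element_name = unqualified.tag
plain_attributes = list(unqualified.attrib)
qualified_attributes = list(qualified.attrib)
```

In both documents the book element is in the books namespace, but only the prefixed attribute is. The unprefixed hardcover attribute is reported with no namespace at all.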


Documents with Multiple Namespaces


Document with three namespaces:
<book xmlns='http://www.library.com/books'
      xmlns:amazon='http://www.amazon.com/products'>
  <title>Tom Sawyer</title>
  <isbn xmlns='http://www.loc.gov/isbn'>
    0140390839
  </isbn>
  <amazon:skuNo>A25</amazon:skuNo>
</book>

Note the variations in the form of the namespace declaration.

Figure 6-12. Documents with Multiple Namespaces

Notes:
All that we did to enable this was add two more namespace declarations, and then add the new elements and use the appropriate namespace prefix. In the case of the isbn element, we declared the namespace that it needed on the element itself, as a new default namespace -- you can declare namespaces on any element that you like, not just the root element. When you do this, the declaration is in scope only for that element and its content. You can also change the default namespace for a particular element by redefining the default namespace on that element, which is what the isbn element does here. Again, the scope is the element that the declaration is attached to, together with its content.
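A quick way to confirm which namespace each element ends up in is to parse the document and split each reported name into URI and local part. The following is a sketch; split_name is a helper written for this example, not a library function.

```python
import xml.etree.ElementTree as ET

DOC = """<book xmlns='http://www.library.com/books'
      xmlns:amazon='http://www.amazon.com/products'>
  <title>Tom Sawyer</title>
  <isbn xmlns='http://www.loc.gov/isbn'>0140390839</isbn>
  <amazon:skuNo>A25</amazon:skuNo>
</book>"""


def split_name(tag):
    """Split '{uri}local' into (uri, local); uri is None when there is no namespace."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag


names = [split_name(element.tag) for element in ET.fromstring(DOC).iter()]
namespaces = {uri for uri, _local in names}
```

The result pairs book and title with the library namespace, isbn with the namespace redeclared on that element, and skuNo with the amazon namespace: three namespaces in one document.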


Elements with No Namespace


What happens to the previous example with no default namespace?
<book xmlns='http://www.library.com/books'
      xmlns:amazon='http://www.amazon.com/products'>
  <title>Tom Sawyer</title>
  <isbn xmlns="">
    0140390839
  </isbn>
  <amazon:skuNo>A25</amazon:skuNo>
</book>

The xmlns="" syntax resets the default namespace for the scope in which it occurs. The <isbn> element is not in a namespace.

There is no default null namespace


Figure 6-13. Elements with No Namespace

Notes:
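The unprefixed <title> element is still in the http://www.library.com/books namespace, because the default namespace declared on <book> remains in scope for it. Only the <isbn> element, which carries xmlns="", is in no namespace. An unprefixed element is in no namespace only when no default namespace declaration is in scope, or when the default has been reset with xmlns="" as it is here.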
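Parsing the slide's document confirms which elements are affected by xmlns="". This sketch uses Python's ElementTree, which reports a namespaced name as {URI}localPart and an un-namespaced name as the bare local part.

```python
import xml.etree.ElementTree as ET

DOC = """<book xmlns='http://www.library.com/books'
      xmlns:amazon='http://www.amazon.com/products'>
  <title>Tom Sawyer</title>
  <isbn xmlns="">0140390839</isbn>
  <amazon:skuNo>A25</amazon:skuNo>
</book>"""

book = ET.fromstring(DOC)
child_names = [child.tag for child in book]
```

Only isbn comes back as a bare name. The title element keeps the default namespace inherited from book, and skuNo keeps the namespace bound to its prefix.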


Attributes and Namespaces


Attributes are not affected by a default namespace declaration.
Attributes on a single element must be unique.
<bad xmlns:ns1="http://www.w3.org"
     xmlns:ns2="http://www.w3.org" >
  <invalid att="1" att="2" />
  <invalid ns1:att="1" ns2:att="2" />
</bad>

<good xmlns:ns1="http://www.w3.org"
      xmlns="http://www.w3.org" >
  <valid a="1" b="2" />
  <valid a="1" ns1:a="2" />
</good>

Figure 6-14. Attributes and Namespaces

Notes:
There are two interacting rules that affect attributes and namespaces:
Attributes are not affected by a default namespace declaration.
Attributes on a single element must be unique.
In the example above, the first <invalid> element is in error because it has two unprefixed att attributes. In the second <invalid> element the two attributes are the same because ns1 and ns2 are two prefixes for the same namespace URI; therefore, the two attribute names are identical. It should be obvious that the first <valid> element is valid -- a and b are unprefixed, and a is not the same as b. The second <valid> element is valid because the unprefixed attribute a is in no namespace (remember that default namespace declarations don't affect attributes), while the ns1:a attribute is in the http://www.w3.org namespace -- they are in different namespaces.
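The four cases can be tried against a namespace-aware parser. The sketch below uses Python's ElementTree, whose underlying Expat parser is run with namespace processing enabled; the parses helper and the variable names are written for this example, and each <invalid> element is tested in its own document so the two errors can be seen separately.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the namespace-aware parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


BAD_OPEN = '<bad xmlns:ns1="http://www.w3.org" xmlns:ns2="http://www.w3.org">'

same_plain_name = parses(BAD_OPEN + '<invalid att="1" att="2"/></bad>')
same_expanded_name = parses(BAD_OPEN + '<invalid ns1:att="1" ns2:att="2"/></bad>')

GOOD = (
    '<good xmlns:ns1="http://www.w3.org" xmlns="http://www.w3.org">'
    '<valid a="1" b="2"/><valid a="1" ns1:a="2"/></good>'
)
good_ok = parses(GOOD)
second_valid_attributes = sorted(ET.fromstring(GOOD)[1].attrib)
```

Both <invalid> variants are rejected, while the second <valid> element is accepted with one attribute in no namespace and one in the http://www.w3.org namespace.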


Namespace Processing
How does an XML parser deal with namespaces?
Needs the right API:
  SAX2
  DOM Level 2
The parser simply reports the prefix, localName, and URI associated with the element or attribute.
It's up to your application to decide what to do.
There are no validation rules associated with namespaces; validation depends on XML Schema, DTD, or whatever grammar description language you are using.
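As an illustration of the SAX2 style of reporting, here is a sketch using Python's xml.sax module with namespace processing switched on. The NamespaceReporter class and the report function are names made up for this example. In this API the prefix binding arrives through startPrefixMapping, and each element arrives as a (URI, localName) pair.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NamespaceReporter(ContentHandler):
    """Records what the parser reports; deciding what to do is left to the application."""

    def __init__(self):
        super().__init__()
        self.prefixes = []
        self.elements = []

    def startPrefixMapping(self, prefix, uri):
        self.prefixes.append((prefix, uri))

    def startElementNS(self, name, qname, attrs):
        uri, local_name = name
        self.elements.append((uri, local_name))


def report(xml_bytes):
    handler = NamespaceReporter()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)  # ask for namespace-aware callbacks
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler


handler = report(
    b"<books:book xmlns:books='http://www.library.com/books'>"
    b"<books:title>Tom Sawyer</books:title></books:book>"
)
```

The handler only collects names. Nothing is validated and nothing is fetched from the URI.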

Figure 6-15. Namespace Processing

Example: Use of Namespaces


Composition of a particular airplane in an airline fleet:
<united:airplanes xmlns="http://www.wingco.com"
                  xmlns:united="http://www.ual.com"
                  xmlns:boeing="http://www.boeing.com"
                  xmlns:pratt="http://www.prattandwhitney.com"
                  xmlns:goodyear="http://www.goodyear.com"
                  xmlns:airbus="http://www.airbus.com"
                  xmlns:rolls="http://www.rollsroyce.com"
                  xmlns:pirelli="http://www.pirelli.com">
  <boeing:airplane >
    <wing/>
    <pratt:engine/>
    <goodyear:tire/>
  </boeing:airplane>
  <airbus:airplane >
    <wing/>
    <rolls:engine/>
    <pirelli:tire/>
  </airbus:airplane>
</united:airplanes>
Figure 6-16. Example: Use of Namespaces

Notes:
Here's an example of namespaces in use: Here we have an imaginary record that might be used in an airline's airplane fleet inventory. For each airplane, we want to know which manufacturer provided each major part of the airplane. This example shows how we could use namespaces to identify which components came from which manufacturers. An application that processed this document could then use the namespaces to determine which manufacturer's diagnostic equipment would be needed to perform a full maintenance cycle on a particular airplane. While not required, it is a best practice to collect all the namespace definitions in one place; especially in large, composite files.
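An application along the lines described above might group each airplane's parts by the namespace they belong to. The following sketch shows one way to do that in Python; the suppliers dictionary and the namespace_of helper are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<united:airplanes xmlns="http://www.wingco.com"
    xmlns:united="http://www.ual.com"
    xmlns:boeing="http://www.boeing.com"
    xmlns:pratt="http://www.prattandwhitney.com"
    xmlns:goodyear="http://www.goodyear.com"
    xmlns:airbus="http://www.airbus.com"
    xmlns:rolls="http://www.rollsroyce.com"
    xmlns:pirelli="http://www.pirelli.com">
  <boeing:airplane><wing/><pratt:engine/><goodyear:tire/></boeing:airplane>
  <airbus:airplane><wing/><rolls:engine/><pirelli:tire/></airbus:airplane>
</united:airplanes>"""


def namespace_of(tag):
    """Return the namespace URI of a '{uri}local' name, or None if there is none."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


# Map each airplane maker to the namespaces of the parts fitted to that airplane.
suppliers = {}
for airplane in ET.fromstring(DOC):
    maker = namespace_of(airplane.tag)
    suppliers[maker] = sorted({namespace_of(part.tag) for part in airplane})
```

The unprefixed wing elements fall into the default namespace, http://www.wingco.com, so they are attributed to that supplier on both airplanes.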


Problems with Namespaces


Namespace recommendation came after XML 1.0.
DTDs don't integrate well.
  Must use QNames as element names in DTDs (remember, ":" is legal in an element name)
  If the prefix changes, the DTD must also change
Testing the equality of namespaces is not handled by the parser.
  You have to test equality of URIs; you cannot just test equality of prefixes.

Figure 6-17. Problems with Namespaces

Notes:
Namespace recommendation after XML 1.0 - because the namespace recommendation came after XML 1.0, it's not really part of the spec. This means there are places where namespaces and XML 1.0 don't fit together. DTDs don't really integrate well - We've shown you an ad hoc solution for using a fixed set of namespaces with a DTD, but that solution doesn't really satisfy a lot of desires that users have for namespaces. Testing equality of namespaces is a pain - there's no easy way to test equality of two namespaces except to get the two namespace URIs and compare them character by character.
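The point about comparing URIs rather than prefixes can be sketched with the DOM Level 2 style properties that Python's minidom exposes (prefix, localName, namespaceURI). The lib prefix, the http://www.example.com/other URI and the helper functions are invented for this example.

```python
from xml.dom import minidom


def root_of(xml_text):
    return minidom.parseString(xml_text).documentElement


def same_name(a, b):
    """Compare two elements by namespace URI and local part, ignoring prefixes."""
    return (a.namespaceURI, a.localName) == (b.namespaceURI, b.localName)


a = root_of("<books:title xmlns:books='http://www.library.com/books'/>")
b = root_of("<lib:title xmlns:lib='http://www.library.com/books'/>")
c = root_of("<books:title xmlns:books='http://www.example.com/other'/>")

different_prefix_same_uri = same_name(a, b)
same_prefix_different_uri = same_name(a, c)
```

Elements a and b use different prefixes but name the same thing, while a and c share a prefix and are unrelated. A comparison based on prefixes would get both cases wrong.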


Best Practices
When to use namespaces
  When the data requires uniqueness for application processing.
  When you need to combine a schema [TBD] with other grammars.
Performance implications
  Namespace processing may slow down the parser and/or increase memory use.
Don't use relative URIs for namespace identifiers.
Pick the default namespace carefully.
Don't declare more than one prefix for a namespace URI.
Be careful with attributes when using namespaces.
Collect the namespace declarations in one place, preferably near the top of the document.

Figure 6-18. Best Practices

Notes:
When to use namespaces: When you think your DTD/Schema will be used outside your organization. When you think you will need to combine your DTD/Schema with other grammars. As a practical note, this means that anybody doing serious grammar work really ought to be using namespaces. Performance implications: Namespace processing slows down the parser and increases memory usage. The parser needs to look at all the namespace declarations and QNames. Even if you turn off namespace processing in your parser, there will still be a performance impact because your input document will still be larger (because of namespace declarations and QNames) than if you were not using namespaces. Don't use relative URIs for namespace identifiers; they are deprecated post the namespaces recommendation.


Pick the default namespace for an element carefully: a good choice can save a lot of work.
Don't declare more than one prefix for a namespace URI: there's no reason to do it, and it will confuse other readers.
Be careful with attributes when using namespaces: remember that default namespaces do not apply to attributes.
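The last point is easy to check with a namespace-aware parser. The following is a small sketch using Python's standard xml.etree.ElementTree module; the document and its namespace URI are hypothetical. ElementTree reports names in the expanded form {URI}localname, so an unqualified name shows up without braces.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a default namespace plus one unprefixed attribute.
DOC = '<order xmlns="http://example.com/ns/orders" status="open"><item/></order>'

root = ET.fromstring(DOC)

# Elements pick up the default namespace ...
print(root.tag)      # {http://example.com/ns/orders}order
print(root[0].tag)   # {http://example.com/ns/orders}item

# ... but the unprefixed attribute stays in no namespace.
print(list(root.attrib))  # ['status']
```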


Status of Namespaces
XML namespaces became a recommendation of the W3C on January 14, 1999. Supported by SAX2 and DOM2 parsers relative to DTDs. Much better support with XML Schema.

Figure 6-19. Status of Namespaces

Notes:
Namespaces in XML Recommendation 1/1999 - it is a stable recommendation. Supported by most parsers relative to DTDs. Much better support with XML Schema. Namespaces are ready for use, especially now that XML Schema has reached recommendation status.


More Information

Reference                                                       Description
http://www.rpbourret.com/xml/NamespacesFAQ.htm                  XML Namespaces FAQ
http://www.xml.com/pub/a/2000/03/08/namespaces/index.html       XML.com article about Namespace Myths
http://www.jclark.com/xml/xmlns.htm                             James Clark's notes on XML Namespaces
http://www.w3.org/TR/REC-xml-names/                             The XML Namespaces specification

Figure 6-20. More Information


Checkpoint Questions
1. Which is true of XML namespaces? (Select all that apply)
   a. They are stored in an Internet-based registry.
   b. They are associated with URIs.
   c. They are integrated with DTDs.
   d. They are integrated with XML Schema.
2. An XML namespace prefix (Select all that apply):
   a. Links to a schema definition.
   b. Is scoped to the element where it is defined.
   c. Is shorthand for a URI.
   d. Can stand for more than one namespace.
3. Default namespaces apply to:
   a. Elements
   b. Attributes
   c. Elements and attributes
   d. Neither elements nor attributes

Figure 6-21. Checkpoint Questions


Unit Summary
Having completed this unit, you should understand:
  - The reasons for using namespaces
  - The syntax used in namespaces
  - The use of default namespaces
  - The interaction between namespaces and attributes
  - Problems with namespaces
  - Best practices regarding namespaces
  - Status of namespace technology

Figure 6-22. Unit Summary


Unit 7. XML Schema


What This Unit is About
This unit presents an introduction to the essential features of the W3C XML Schema language.

What You Should Be Able to Do


After completing this unit, you should be able to:
  - List and describe the reasons for using XML Schemas
  - List the key new features of Schemas
  - Define the grammar rules of an XML document using the syntax of XML Schemas
  - List and define the best practices to use when using XML Schemas
  - Describe the status of XML Schemas in the industry

How You Will Check Your Progress


Accountability: Machine exercises


Unit Objectives
After completing this unit, you should be able to:
  - Understand what an XML Schema represents
  - List and describe the reasons for using XML Schema
  - List the key features of the XML Schema definition language
  - Define the grammar rules of an XML document using the syntax of the XML Schema definition language
  - List and define the best practices to use when using XML Schema
  - Describe the status of XML Schema in the industry

Figure 7-1. Unit Objectives


Approach
We will break this unit into three parts.
Part 1 (this part) will:
  - Introduce the philosophy behind the XML Schema definition language
  - Provide an introductory example using the more common constructs
  - Motivate the need for more sophisticated additions
Part 2 will:
  - Introduce the semantics of the more common constructs, including the more common options
  - Provide examples of how Studio will help
  - Continue the refinement of the introductory example as a practical example of how to employ the power of a schema
Part 3 will address the issues of namespaces, schemas, and qualification.


Figure 7-2. Approach


What Is an XML Schema?


An XML Schema is:
  - A document* created using the XML Schema definition language
  - A document that conforms to the XML Schema specification: http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/
The XML Schema definition language is:
  - Often abbreviated XSD language, after the file extension .xsd by which XML schemas are identified
  - An XML-based markup language
      - XSD elements and attributes are part of a namespace
      - A prefix is used to reference these elements and attributes
      - It is customary (but not mandatory) to use xsd: for this prefix
* Recall that although "document" or "instance document" is commonly used to describe XML products, neither instances nor schemas need to exist as "documents". They may be:
  - Byte streams sent between applications
  - Fields in a database record
  - Collections of XML Infoset "information items"
For simplicity they will be referred to as though they are documents and files.

Figure 7-3. What Is an XML Schema?

Notes:
The key points are: an XML schema is something constructed according to specific rules and semantics; a schema itself complies with the rules governing the construction of a well-formed XML instance; and all the useful concepts developed to increase the utility of XML can be applied to schemas as well. These XML-compliant things are not necessarily persistent, touchable things: they can be created, produce their effects, and then be gone. They are like the wind in a wind tunnel: we can create a wind, feel (and sometimes see) its effects, and then turn it off, whereupon it disappears but leaves things in a quite different state. An XML Infoset is an abstract data set describing the information available from an XML document. Its purpose is to provide a consistent set of definitions for use in other specifications that need to refer to the information in a well-formed XML document.

For additional information refer to the Infoset specification at: http://www.w3.org/TR/2001/WD-xml-infoset-20010316


Why Do We Need XML Schema?


Business
  - Desire for increased expressive power for data.
  - Desire to combine vocabularies from different organizations.
  - Lower cost of integration through improved communication.
Technical
  - Validate the XML document against a detailed specification.
  - XML instance syntax.
  - Data typing and constraints.
  - Namespaces.
  - Represent relationships among data elements.

Figure 7-4. Why Do We Need XML Schema?

Notes:
Business
The arrival of XML as a language for data interchange between applications created the need to specify richer semantics for XML documents. Similarly, loosely bound data interchange between applications requires a way to combine rich grammars from different organizations. A well-defined XML vocabulary can improve communication among organizations that integrate using XML, causing integrations to proceed more smoothly, more quickly, and at a lower cost.
Technical
We need a way to validate the structure of incoming documents against a detailed specification. The cost of integration can be lowered as certain validations are moved to a validating parser and are not perpetuated into the backend systems.


The non-XML syntax used for DTDs makes it harder to write applications which manipulate grammar data. The symmetry of using XML syntax to represent information about the grammar allows all of the XML technologies to be applied to the grammar information itself.
Data typing makes it possible for XML processors to verify more semantic constraints on the contents of XML documents. This will also allow future versions of XML processors to deal directly with typed data. Constraints allow additional validations to occur against the specification.
The original XML 1.0 recommendation was followed by a recommendation for a namespace facility. Unfortunately, this meant that namespaces were not directly integrated with DTDs. XML Schema provides a way to reconcile namespaces and grammars.
We also need a way to represent relationships among data elements, similar to what we do with database tables or objects.


DTD versus XML Schema (1 of 2)


Consider an element called quantity that some application expects to receive as a non-negative integer. The DTD declaration of quantity is:

    <!ELEMENT quantity (#PCDATA)>

This declaration permits quantity to have any string value. It simply states that the element exists in the document. In a schema, the declaration for quantity might look like:

    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="quantity" type="xsd:nonNegativeInteger"/>
    </xsd:schema>

The DTD declaration would find all of the following XML to be valid:

    <quantity>4</quantity>
    <quantity>-2</quantity>
    <quantity>lots</quantity>

The schema declaration would find only the first example valid.

Figure 7-5. DTD versus XML Schema (1 of 2)

Notes:
This chart shows how to declare an element that has a simple type. A schema begins with a <schema> tag. This tag also carries a namespace declaration that binds the xsd prefix to the namespace for XML Schema (denoted by the URI http://www.w3.org/2001/XMLSchema). Much more will be presented on this later: this is only an introduction. An element is declared using the <element> tag. The type attribute on the <element> tag specifies the type that is used for the element. In this example, the type is nonNegativeInteger, which is one of the built-in simple types. We will cover most of the key aspects of the XSD language in this chapter, so do not focus on details at this time.
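To make the difference concrete, here is a rough Python sketch of the check that the nonNegativeInteger type adds on top of the DTD's "any string" rule. It is not a schema validator: the function name is hypothetical and the logic only approximates the type's lexical rule (an optional sign followed by decimal digits, with a value of zero or more).

```python
import re

# Lexical form of an integer: optional sign followed by decimal digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def is_non_negative_integer(text: str) -> bool:
    """Approximate the xsd:nonNegativeInteger check for element content."""
    value = text.strip(" \t\r\n")  # surrounding whitespace is ignored
    if not _INTEGER.fullmatch(value):
        return False               # "lots" fails here
    return int(value) >= 0         # "-2" fails here


for sample in ["4", "-2", "lots"]:
    print(sample, is_non_negative_integer(sample))
```

A DTD-based parser would accept all three samples; with the schema, this kind of check happens in the validating parser rather than in application code.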


DTD versus XML Schema (2 of 2)


Schemas are orders of magnitude more powerful than DTDs, which means:
  - Their associated constructs are much more complicated;
  - An effective way to master them is by example and experience;
  - The XML Schema specification is considerably more lengthy.
A list of the XML Schema requirements specified to create the XML schema language follows.

Figure 7-6. DTD versus XML Schema (2 of 2)

Notes:
It is difficult, but not impossible, to master schemas by reading and interpreting the W3C specification. It is much easier, for most of us, to master enough concepts to get started, often from tested examples, and then add complexities as we need them. Most of what you want to implement can be accomplished -- it is usually just a matter of searching the very large body of schema information until you find something that does what you want to do.


Requirements Applied to the XSD Language (1 of 3)


Three categories
  - Structural
  - Datatype
  - Conformance
Structural requirements
The XML schema language must define:
  - Mechanisms for constraining document structure (namespaces, elements, attributes) and content (datatypes, entities, notations);
  - Mechanisms to enable inheritance for element, attribute, and datatype definitions;
  - Mechanism for URI reference to standard semantic understanding of a construct;
  - Mechanism for embedded documentation;
  - Mechanism for application-specific constraints and descriptions;
  - Mechanisms for addressing the evolution of schemata;
  - Mechanisms to enable integration of structural schemas with primitive data types.

Figure 7-7. Requirements Applied to the XSD Language (1 of 3)

Notes:
The information on the charts comes from the XML Schema Requirements document, which is found at http://www.w3.org/TR/1999/NOTE-xml-schema-req-19990215#Requirements.
XML Schema uses XML instance syntax to describe the rules for the grammar of an XML Schema document.
XML Schema documents allow a rich variety of data types to be used to constrain both element and attribute content.
XML Schema documents allow us to specify which namespace declarations and definitions belong to, importation of declarations and definitions from another namespace, and wildcarding of declarations from other namespaces.
In XML Schema, type definitions and element and attribute declarations are separated from each other, allowing reuse of type definitions.
XML Schema documents allow us to specify constraints on the uniqueness of values of a particular type, as well as relationships between those unique values and values of other types.

The terms XML Schema document, XML Schema, and simply schema are used interchangeably. Expect also to see XML Schema definition, XML schema definition document, XML Schema description, and XSD or xsd. The key idea is simply that an XML instance, in order to be valid, has an associated DTD or schema. Without opening the files for inspection, the XML instance may be recognized by its .xml extension, the DTD by its .dtd extension, and the schema by its .xsd extension.


Requirements Applied to the XSD Language (2 of 3)


Datatype requirements
The XML schema language must:
  - Provide for primitive data typing, including byte, date, integer, sequence, SQL and Java primitive data types, and so forth;
  - Define a type system that is adequate for import/export from database systems (for example, relational, object, OLAP);
  - Distinguish requirements relating to lexical data representation versus those governing an underlying information set;
  - Allow creation of user-defined datatypes, such as datatypes that are derived from existing datatypes and which may constrain certain of their properties (for example, range, precision, length, mask).

Figure 7-8. Requirements Applied to the XSD Language (2 of 3)


Requirements Applied to the XSD Language (3 of 3)


Conformance
The XML schema language must:
  - Describe the responsibilities of conforming processors;
  - Define the relationship between schemas and XML documents;
  - Define the relationship between schema validity and XML validity;
  - Define the relationship between schemas and XML DTDs, and their information sets;
  - Define the relationship among schemas, namespaces, and validity;
  - Define a useful XML schema for XML schemas.

Figure 7-9. Requirements Applied to the XSD Language (3 of 3)


Anatomy of an XML Schema


The schema below declares this structure:

    <elem1 attr1="hello">
      <elem2>some text here</elem2>
    </elem1>

    <xsd:schema
        targetNamespace="http://some.nameSpace/URI"
        xmlns="http://some.nameSpace/URI"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema">

      <xsd:element name="elem2" type="xsd:string"/>

      <xsd:element name="elem1">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element ref="elem2"/>
          </xsd:sequence>
          <xsd:attribute name="attr1" type="xsd:string"/>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>

Callouts on the chart:
  - xsd:schema: the root node is always 'schema'
  - targetNamespace: the namespace that instance docs should use when referencing this schema
  - xmlns: declaration namespace
  - xmlns:xsd: schema namespace declaration
  - elem2: element with built-in 'simple' type
  - elem1: complex element
  - attr1: attribute declaration with built-in 'simple' type

Figure 7-10. Anatomy of an XML Schema


Notes:
This chart identifies the most common ideas we will encounter inside an XML schema. Once more, do not focus on the details at this point. We only wish to present the overall picture in these first few introductory charts.
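Because a schema is itself XML, the pieces identified on this chart can be picked out with any ordinary XML parser. The following is a minimal sketch using Python's standard library; the schema text is a transcription of the chart (with the attribute declaration closed as an empty element), and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

SCHEMA = """\
<xsd:schema targetNamespace="http://some.nameSpace/URI"
            xmlns="http://some.nameSpace/URI"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="elem2" type="xsd:string"/>
  <xsd:element name="elem1">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="elem2"/>
      </xsd:sequence>
      <xsd:attribute name="attr1" type="xsd:string"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

root = ET.fromstring(SCHEMA)

# The root node is 'schema' in the XML Schema namespace.
print(root.tag)  # {http://www.w3.org/2001/XMLSchema}schema

# The namespace that instance documents are expected to use.
print(root.get("targetNamespace"))

# Global element declarations are the direct xsd:element children of the root.
global_elements = [e.get("name") for e in root.findall(f"{{{XSD_NS}}}element")]
print(global_elements)  # ['elem2', 'elem1']

# Attribute declarations can appear anywhere below the root.
attribute_names = [a.get("name") for a in root.iter(f"{{{XSD_NS}}}attribute")]
print(attribute_names)  # ['attr1']
```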


A Simple XML Document - Return to Basics


Without going into detail (yet), consider this XML example:

    <?xml version="1.0"?>
    <purchaseOrder orderDate="2003-10-20">
      <shipTo country="US">
        <name>Jane Doe</name>
        <street>123 Main Street</street>
        <city>Newark</city>
        <state>DE</state>
        <zip>21721</zip>
      </shipTo>
      <billTo country="US">
        <name>John Doe</name>
        <street>8 Oak Avenue</street>
        <street2>Apt. 2B</street2>
        <city>Elkton</city>
        <state>MD</state>
        <zip>21921</zip>
      </billTo>
      <comment>Please rush!</comment>
      <items>
        <item partNum="87654-DOA">
          <productName>Lawnmower</productName>
          <quantity>1</quantity>
          <USPrice>249.99</USPrice>
          <comment>Confirm this is low-pollution</comment>
        </item>
        <item partNum="926-78">
          <productName>Wetness Monitor</productName>
          <quantity>1</quantity>
          <USPrice>39.98</USPrice>
          <shipDate>2003-11-21</shipDate>
        </item>
      </items>
    </purchaseOrder>

Let us review the basic nomenclature that applies:


Figure 7-11. A Simple XML Document - Return to Basics

Notes:
The po.xml source file is included in the XM301 Lectures folder inside Unit 7. The vertical bars were added to identify the various scopes. Color and thickness are used to differentiate the three principal levels involved. A more detailed description follows on the next chart.


A Simple XML Document Basic Nomenclature


The XML instance on the previous page consists of . . .
  - One main element, purchaseOrder, also known as the root element
  - purchaseOrder has the subelements shipTo, billTo, comment, and items
  - shipTo, billTo, and items contain subelements
  - item, a subelement of items, in turn contains its own subelements
  - purchaseOrder, shipTo, billTo, and item carry attributes (orderDate, country, and partNum) as well as subelements
  - At some level (name, for example) an element carries only a number, string, date, and so forth,* but NO subelements (or attributes)
      - Such elements are said to have simple types
  - Elements that contain subelements and/or carry attributes are said to have complex types
  - Attributes always have simple types (that is, they are numbers, strings, dates, and so forth)
* See the chart "Simple Types Built-in to XML Schema" on a subsequent page for a complete list of their names (for this version of the XML Schema specification).

Figure 7-12. A Simple XML Document - Basic Nomenclature

Notes:
We first saw these definitions in chapter 3 on XML basics. Keep in mind that here we are talking about the XML instance and not the schema that defines it.
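The simple/complex distinction can be applied mechanically to an instance. The sketch below uses Python's standard library on an abbreviated copy of the purchase order (one address and one item only, to keep it short); the helper name kind is hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the purchase order instance (one address, one item).
PO = """\
<purchaseOrder orderDate="2003-10-20">
  <shipTo country="US">
    <name>Jane Doe</name>
    <zip>21721</zip>
  </shipTo>
  <comment>Please rush!</comment>
  <items>
    <item partNum="87654-DOA">
      <productName>Lawnmower</productName>
      <quantity>1</quantity>
    </item>
  </items>
</purchaseOrder>
"""


def kind(element):
    """Return 'complex' if the element has subelements or attributes."""
    return "complex" if len(element) > 0 or element.attrib else "simple"


# Walk every element in document order and record its classification.
kinds = {el.tag: kind(el) for el in ET.fromstring(PO).iter()}
for name, value in kinds.items():
    print(f"{name}: {value}")
```

Note that items is classified as complex even though it carries no attributes, because it contains subelements.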


A Simple XML Document Basic Schema Concepts


The [elements with] complex types in the instance document are defined in the schema for the instance document.
  - The XML instance document is po.xml.
  - The XML Schema Definition document, or schema, is po.xsd.
Some of the [elements with] simple types are also defined in the schema.
The other simple types, namely the types carried by the attributes, are defined from the repertoire of simple types that are built in to the XML schema definition language.
Very shortly we will show you the result of using Studio to generate one possible XSD document from the XML instance po.xml.
At this stage note that there is nothing in the po.xml file that ties it to the po.xsd file generated by Studio. Later on we will introduce explicit mechanisms for associating instances and schemas.

Figure 7-13. A Simple XML Document - Basic Schema Concepts

Notes:
It is common usage to not include the words in brackets. Some of us find the absence of these words leads to confusion. In an effort to avoid this confusion some of us speak in terms of the content of the element: An element that contains subelements and/or attributes is referred to as an element with complex content; an element that contains only one of the predefined XSD types is referred to as an element with simple content. An attribute is then referred to as a simple type, which is still shorthand that really refers to the value that the attribute carries. Although it is a challenge, try not to get too wrapped up in the details of the nomenclature; instead, focus on the patterns that are involved. The po.xml file in the XM301 Lectures project file has the necessary information to form the association to its associated schema, po.xsd so that when you select po.xml Studio will report it as valid.


Simple Types Built-in to XML Schema


This is a list of the names of the simple types predefined for XSDs.
Before employing any of them refer to the W3C specifications at www.w3.org/TR/2001/REC-xmlschema-{1 / 2}-20010502
A primer may also be found at .../REC-xmlschema-0-20010502

    string, normalizedString, token
    byte, unsignedByte, base64Binary, hexBinary
    integer, positiveInteger, negativeInteger, nonNegativeInteger, nonPositiveInteger
    int, unsignedInt, long, unsignedLong, short, unsignedShort
    decimal, float, double, boolean
    time, dateTime, duration, date, gMonth, gYear, gYearMonth, gDay, gMonthDay
    Name, QName, NCName, anyURI, language
    ID, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, NMTOKEN, NMTOKENS

Figure 7-14. Simple Types Built-in to XML Schema

Notes:
There is considerably more to the use of simple types than is implied by their name. We shall examine the specifics of this statement later on as we parse the .xsd that a tool like WebSphere Studio can automatically generate from an XML instance. Appendix B to this course has a snippet from the actual table found in the 20010502 version of the Primer. Again, recognize that any/all URLs are subject to change. Always check the basic W3C Web site for the latest version.


A Simple XML Document - How Studio Sees It


If the XSD language really is a language it must have:
  - Regular syntax
  - Grammar rules
Which means we can create tools that automatically create associated .xsd files from (well-formed) .xml files.
Here is the start of the .xsd file Studio 5.1 created based on the po.xml file:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="USPrice" type="xsd:string"/>
      <xsd:element name="billTo">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element ref="name"/>
            <xsd:element ref="street"/>
            <xsd:element ref="street2"/>
            <xsd:element ref="city"/>
            <xsd:element ref="state"/>
            <xsd:element ref="zip"/>
          </xsd:sequence>
          <xsd:attribute name="country" type="xsd:string" use="optional"/>
        </xsd:complexType>
      </xsd:element>
      <xsd:element name="city" type="xsd:string"/>
      <xsd:element name="comment" type="xsd:string"/>

Figure 7-15. A Simple XML Document - How Studio Sees It

Notes:
Well-formed XML is governed by rules; XSDs are governed by rules. So it should be possible to create software that is aware of the rules that govern an XSD such that we can infer a schema that would validate an XML instance. This is not new: we saw this same capability with DTDs: given a well-formed XML it was possible to use a tool such as Studio that knows the DTD rules to generate a DTD file that would validate the XML source on which it was based. We also saw that the generated DTD only reflected what it could understand: we had to edit the DTD file to capture all that we knew about the XML source. This example carries on over a dozen or more pages. Again, focus on patterns not details -- they will come later. Try to keep the big picture in mind to avoid getting lost in the details.
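As a rough illustration of how such a tool might work, the Python sketch below infers one declaration per distinct element name from an instance. It is an assumption-laden toy, not Studio's algorithm: the function name is hypothetical, every leaf is typed xsd:string, occurrence counts are ignored, and elements that mix text with attributes are not handled. Declarations are emitted in sorted name order, which is simply a choice made here.

```python
import xml.etree.ElementTree as ET


def infer_declarations(xml_text):
    """Return one naive xsd:element declaration per distinct element name."""
    children = {}    # element name -> ordered list of child element names
    attributes = {}  # element name -> ordered list of attribute names
    for el in ET.fromstring(xml_text).iter():
        kids = children.setdefault(el.tag, [])
        attrs = attributes.setdefault(el.tag, [])
        kids.extend(c.tag for c in el if c.tag not in kids)
        attrs.extend(a for a in el.attrib if a not in attrs)

    lines = []
    for name in sorted(children):
        if not children[name] and not attributes[name]:
            # A leaf with no attributes: fall back to the most general type.
            lines.append(f'<xsd:element name="{name}" type="xsd:string"/>')
            continue
        lines.append(f'<xsd:element name="{name}">')
        lines.append('  <xsd:complexType>')
        if children[name]:
            lines.append('    <xsd:sequence>')
            lines.extend(f'      <xsd:element ref="{kid}"/>' for kid in children[name])
            lines.append('    </xsd:sequence>')
        lines.extend(
            f'    <xsd:attribute name="{attr}" type="xsd:string" use="optional"/>'
            for attr in attributes[name]
        )
        lines.append('  </xsd:complexType>')
        lines.append('</xsd:element>')
    return lines


# A fragment of the purchase order, used as input.
SAMPLE = '<billTo country="US"><name>John Doe</name><city>Elkton</city></billTo>'
for line in infer_declarations(SAMPLE):
    print(line)
```

Like the generated DTDs seen earlier, the result reflects only what the tool can see in one instance; it still has to be edited by someone who knows the data.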


A Simple XML Document - XSD Part 1 (1 of 7)


The first line of the purchase order schema is the XML declaration
  - Of course! This is an XML file.
The second line declares the root element to be xsd:schema
  - Schema [xsd:schema] is a complex type
  - The prefix is xsd by convention; a different prefix could be used
  - By convention the prefix is associated with the XML Schema namespace: xmlns:xsd="http://www.w3.org/2001/XMLSchema"
Schema is special; here is the typical structure of a schema:

    <schema
      attributeFormDefault = (qualified | unqualified) : unqualified
      blockDefault = (#all | List of (extension | restriction | substitution)) : ''
      elementFormDefault = (qualified | unqualified) : unqualified
      finalDefault = (#all | List of (extension | restriction)) : ''
      id = ID
      targetNamespace = anyURI
      version = token
      xml:lang = language
      {any attributes with non-schema namespace . . .}>
      Content: ((include | import | redefine | annotation)*,
                (((simpleType | complexType | group | attributeGroup) | element | attribute | notation), annotation*)*)
    </schema>

Figure 7-16. A Simple XML Document - XSD Part 1 (1 of 7)

Notes:
Since an XSD is itself XML, best practice dictates that it begin with an XML declaration. Because we built the shell in Studio, it automatically included the UTF-8 value for the encoding attribute. "[A] different prefix could be used. . ." There is an example within Studio in a "help" file where simply 'x' is used as a prefix: there is no normative statement that "xsd" must be used as the prefix. Out of kindness to our peers, it is a good idea to stick with "xsd" since it is easily recognized. The key point is that associating the prefix with the URI stated on the chart identifies the elements and simple types as belonging to the vocabulary of the XML Schema language and not to the vocabulary of the schema author. If the latter were the case, the specification of what it means to be a decimal (for example) could differ from what the XML Schema language defines it to be in its specification. Henceforth we will refer simply to a complex type or a simple type element and leave it to you to do the translation: "element of complex type. . ." and so forth.


The keyword schema is very special; see the XML Schema specification or the W3C associated Primer, which has hot links to the appropriate part(s) of the specification. This snippet is directly from the 5/2/2001 Schema specification Part 1. One pair of parentheses is larger than the others to help you keep track because of the nesting. Consider the snippet in the last bullet to be but a peek into the depth a simple specification can be taken!
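The remark that a different prefix could be used can be checked with a namespace-aware parser: what matters is the URI the prefix is bound to, not the prefix itself. The following sketch uses Python's standard library; the two one-line schemas are made up for illustration and differ only in their prefix.

```python
import xml.etree.ElementTree as ET

XSD_URI = "http://www.w3.org/2001/XMLSchema"

# The same schema written twice, once with the customary prefix and once with 'x'.
WITH_XSD = f'<xsd:schema xmlns:xsd="{XSD_URI}"><xsd:element name="quantity"/></xsd:schema>'
WITH_X = f'<x:schema xmlns:x="{XSD_URI}"><x:element name="quantity"/></x:schema>'


def expanded_names(xml_text):
    """Return every element name in {URI}localname form, in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


print(expanded_names(WITH_XSD))
print(expanded_names(WITH_XSD) == expanded_names(WITH_X))  # True
```

Both documents yield the same expanded names, which is why sticking with xsd is a courtesy to readers rather than a requirement of the language.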


A Simple XML Document - XSD Part 1 (2 of 7)


Everything to the left of the "=" sign is an attribute name. On the right side:
  - | still means or
  - * means 0 or more
  - ? means optional
  - :something means something is the default
  - { } means you can have one (or some) of these, too
Question: Which item in the previous list corresponds to the piece xmlns:xsd="http://www.w3.org/2001/XMLSchema" (shown in red on the chart) in the schema element <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">?
Answer: "any attributes with non-schema namespace"
  - "xmlns:xsd" is an attribute name; its value, a URI, is a built-in simple type (anyURI) in the list previously included
  - xmlns: is certainly a syntactic device but it is not given keyword status
The first four attributes + targetNamespace are a few of the possible attributes reserved specifically for the XML Schema element itself.
  - The Primer lists 34
  - We will describe some of them as necessary for our introduction

Figure 7-17. A Simple XML Document - XSD Part 1 (2 of 7)

Notes:
Refer to the previous notes for direction to additional information. Part of the difficulty in mastering XML schemas is knowing when something is special as opposed to looking like something that should be special. Fortunately, we can rely on Studio to provide quite a lot of syntactic guidance.


A Simple XML Document - XSD Part 1 (3 of 7)


The next element, a simple type, defines USPrice
  This may seem odd since USPrice doesn't occur until almost the end of the XML instance
  It simply reflects a natural result of parsing: Studio parsed the stream represented by the po.xml file; apparently it found a token that caused Studio to pause and perform some processing
  The order in the schema is not determined by the order in the XML file
A better type would be decimal -- or, because of the namespace, xsd:decimal
Less-than-optimal type choices are often the result of "robotic" applications of parsing rules.
If you rely on the automated features of any similar tool, expect to edit the result to apply:
  Your knowledge of the XML file;
  Your knowledge of the problem domain;
  And the knowledge of your domain experts.

Figure 7-18. A Simple XML Document - XSD Part 1 (3 of 7)

Notes:
The order in which definitions occur in the schema for the .xml instance is generally not bound by the order in which the elements occur in the instance itself. One can only surmise that Studio looks only at the first occurrence of an element and picks the most general candidate type, inasmuch as USPrice occurs twice and both times its value is clearly a decimal. Every numeric literal is also a valid string, which is why string is the most general choice. A tool like Studio can make our XSD construction much faster and easier, but it does not do all of our work. Hence the admonition that we must edit the machine output to reflect our knowledge of the problem together with the knowledge of our domain/subject matter experts. As a last step we will apply the three subbullets of the last bullet.
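
The edit suggested on this chart can be sketched as a one-line change. The two declarations below are alternatives, not lines that belong in the same schema, and the sample values in the comments are invented for illustration.

```xml
<!-- As generated by the tool: any text is accepted, even "cheap". -->
<xsd:element name="USPrice" type="xsd:string"/>

<!-- As hand-edited: only decimal literals such as 9.99 are accepted. -->
<xsd:element name="USPrice" type="xsd:decimal"/>
```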


A Simple XML Document - XSD Part 1 (4 of 7)


The next element is an element, of complex type, that defines the billTo element, which should be of complex type, in po.xml
The pattern to recognize is:

<xsd:element name="billTo">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="name"/>
      <xsd:element ref="street"/>
      <xsd:element ref="street2"/>
      <xsd:element ref="city"/>
      <xsd:element ref="state"/>
      <xsd:element ref="zip"/>
    </xsd:sequence>
    <xsd:attribute name="country" type="xsd:string" use="optional"/>
  </xsd:complexType>
</xsd:element>

There are four scopes involved here
  1. The element whose name is billTo (associating it to the .xml file)
  2. The complexType declaration element
  . . .see next chart. . .

Figure 7-19. A Simple XML Document - XSD Part 1 (4 of 7)

Notes:
As you can see the level of complexity is rapidly increasing. As we proceed through this example always keep in mind that we have an XML instance with its own simple/complex/attribute structure and now, in the associated XML Schema, we have another XML-like instance with its own simple/complex/attribute structure plus (as you will see) a growing collection of baggage to be employed to bring precision to what could be myriad instances of the XML structure of which po.xml is but one instantiation.


A Simple XML Document - XSD Part 1 (5 of 7)


Continuing. . .
The complex element definition typically consists of
  A set of element declarations
  Element references
  Attribute declarations

The declarations are not themselves types; they represent an association between
  A name and
  Constraints that govern the appearance of that name in the associated .xml
  <xsd:element ref="street2"/>, for example, could have been written as <xsd:element name="street2" type="xsd:string"/>
  More details follow on a subsequent chart

The sequence element; this comes under the category of Model Group Schema Component in the specification
For our purposes we need three choices:
  sequence, which requires that the subelements (which could themselves be complex) all appear, in the order in which they are listed as subelements of the sequence element;
  all, which requires that all the subelements appear, but in no particular order;
  choice, which requires that exactly one of the subelements appear.
. . .see next chart . .

Figure 7-20. A Simple XML Document - XSD Part 1 (5 of 7)

Notes:
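The three model groups named on the chart can be sketched side by side. The element names ordered, unordered, either, a, and b are hypothetical and do not appear in po.xsd; they exist only to show the shape of each group.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- sequence: a must come first, then b -->
  <xsd:element name="ordered">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="a" type="xsd:string"/>
        <xsd:element name="b" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- all: both a and b must appear, in either order -->
  <xsd:element name="unordered">
    <xsd:complexType>
      <xsd:all>
        <xsd:element name="a" type="xsd:string"/>
        <xsd:element name="b" type="xsd:string"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>

  <!-- choice: exactly one of a or b may appear -->
  <xsd:element name="either">
    <xsd:complexType>
      <xsd:choice>
        <xsd:element name="a" type="xsd:string"/>
        <xsd:element name="b" type="xsd:string"/>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```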


A Simple XML Document - XSD Part 1 (6 of 7)


Continuing. . .
The country attribute is part of the same scope as the sequence element
  Its appearance is similar to that in a DTD
  We will cover its declaration in detail later

Six occurrences of an element element:
  The subelements of billTo are the simple elements name, street, street2, city, state, zip
    They have built-in type content only
    The built-in types will need to be specified
    They contain references to global elements
  Studio chose to use a more general construct that uses a reference rather than an inline definition for the elements
    That choice requires another statement to associate some built-in type with the ref= name, which here is either name, street, street2, city, state, or zip
    On the other hand, if we wanted to create some lengthy/complicated value, we would only have to do it once and then we could ref= it
    Notice that the rest of the declarations come immediately before the end of the schema, </xsd:schema>.
  The value of the ref attribute must reference a global element
    That is, one that is a subelement of the schema (root) element and not a part of a complex type definition.

Figure 7-21. A Simple XML Document - XSD Part 1 (6 of 7)

Notes:
Regarding the country attribute: its appearance is similar enough to a DTD's to be confusing. The po.xsd file represents but one of many possible schemas that would find our po.xml file valid. Refer to the Primer associated with XML Schema 1.0 for many other possibilities. The alternatives, while interesting, would require at least one full week to examine. We should reorder Studio's output into something that is more intuitive to us. We should also change the types if we have better knowledge of what is required. We will perform both of these functions before we declare we are finished with the po.xsd file. Last bullet: note the indentation level (assuming your schema is properly formatted) of the element that defines any ref=: it is a child of the root element, schema or xsd:schema.
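
The difference between a reference and an inline definition can be sketched as follows. The fragment is deliberately cut down to the single subelement city so that the two styles are easy to compare; it is an illustration, not a copy of po.xsd.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Style 1 (what Studio generated): a global declaration, a child of
       xsd:schema, that other declarations point at with ref=. -->
  <xsd:element name="city" type="xsd:string"/>

  <xsd:element name="shipTo">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="city"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Style 2: an inline (local) declaration; name and type are given
       in place and no separate global declaration is needed. -->
  <xsd:element name="billTo">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="city" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Both styles accept the same city content here; the reference style pays off when the referenced declaration is lengthy and used in several places.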


A Simple XML Document - XSD Part 1 (7 of 7)


The final statement is
  <xsd:element name="comment" type="xsd:string"/>
This statement only defines what it means to be a "comment" in the purchase order.
The actual placement of a comment is defined much later using a ref=
Note that comment is an element in the purchase order: it is not an XML comment <!-- -->

Figure 7-22. A Simple XML Document - XSD Part 1 (7 of 7)

Notes:
Gotcha!?


A Simple XML Document - XSD Part 2 (1 of 4)


<xsd:element name="comment" type="xsd:string"/>
<xsd:element name="item">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="productName"/>
      <xsd:element ref="quantity"/>
      <xsd:element ref="USPrice"/>
      <xsd:choice>
        <xsd:element ref="comment"/>
        <xsd:element ref="shipDate"/>
      </xsd:choice>
    </xsd:sequence>
    <xsd:attribute name="partNum" type="xsd:string" use="optional"/>
  </xsd:complexType>
</xsd:element>
<xsd:element name="items">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element maxOccurs="unbounded" minOccurs="1" ref="item"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

Figure 7-23. A Simple XML Document - XSD Part 2 (1 of 4)

Notes:
The first line here is the last line from several pages ago.


A Simple XML Document - XSD Part 2 (2 of 4)


Looking back at the purchase order XML instance for the element item:
  It is complex for certain since it carries both subelements and an attribute
  There are two occurrences of it
  Both occurrences contain these same simple type elements in the same order
    productName
    quantity
    USPrice
  The first occurrence also contains a simple type element comment
  The second occurrence instead contains a simple type element shipDate
  All these elements use ref= so expect to see definitions
Now look at the schema definitions on the preceding page:
  Studio again uses a sequence construct
  However, when Studio digested the comment-or-shipDate elements it chose to model this idea as a choice element

Figure 7-24. A Simple XML Document - XSD Part 2 (2 of 4)

Notes:
Recall, we can have sequence, choice, or all. We will summarize these choices after we complete our walkthrough of the purchase order example.
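
To see why Studio arrived at a choice, it helps to picture the two item occurrences next to each other. In the sketch below only the element and attribute names come from the schema; every value, including both part numbers, is invented for illustration and is not a copy of po.xml.

```xml
<items>
  <!-- First occurrence: the common three elements, then comment -->
  <item partNum="00000-AAA">
    <productName>Sample product one</productName>
    <quantity>1</quantity>
    <USPrice>9.99</USPrice>
    <comment>Sample comment</comment>
  </item>
  <!-- Second occurrence: the same three elements, then shipDate -->
  <item partNum="00000-BBB">
    <productName>Sample product two</productName>
    <quantity>2</quantity>
    <USPrice>19.99</USPrice>
    <shipDate>2003-11-01</shipDate>
  </item>
</items>
```

A sequence of three fixed elements followed by a choice between comment and shipDate is the smallest model that accepts both occurrences.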


A Simple XML Document - XSD Part 2 (3 of 4)


The last declaration within item is for the attribute partNum
Let's look at the attribute defined in the po.xml file:
  <item partNum="87654-DOA">
Studio handled this as
  <xsd:attribute name="partNum" type="xsd:string" use="optional"/>
Studio now was able to "look" one scope-level up; that is, now that an item is defined, how are items to appear in the po.xml file?
The next chart continues this thought and concludes this piece of the schema. . .

Figure 7-25. A Simple XML Document - XSD Part 2 (3 of 4)

Notes:
Later we will also present the various alternatives to "optional."
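
As a preview of those alternatives, the sketch below gathers the common attribute-use variations in one place. The type name attributeUseDemo and the attributes currency and internalCode are hypothetical; partNum and country are borrowed from the purchase order only as familiar names.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="attributeUseDemo">
    <!-- optional (the default): the attribute may be omitted -->
    <xsd:attribute name="note" type="xsd:string" use="optional"/>
    <!-- required: every instance must carry the attribute -->
    <xsd:attribute name="partNum" type="xsd:string" use="required"/>
    <!-- fixed: if present, the value must be exactly US -->
    <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
    <!-- default: if absent, a processor supplies the value USD -->
    <xsd:attribute name="currency" type="xsd:string" default="USD"/>
    <!-- prohibited: the attribute must not appear -->
    <xsd:attribute name="internalCode" type="xsd:string" use="prohibited"/>
  </xsd:complexType>
</xsd:schema>
```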


A Simple XML Document - XSD Part 2 (4 of 4)


Next, Studio chose to define the complex type element items, which consists of item subelements
In the po.xml file items consists of two occurrences of item
Studio cannot know from this limited information how many occurrences of item might occur, so it models it as a complex type:

<xsd:element name="items">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element maxOccurs="unbounded" minOccurs="1" ref="item"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

Studio is guessing that there may be any number of occurrences of item (that is, unbounded) and it is guessing that there will always be at least one.
ref="item" associates this declaration with the specification of item directly above this specification of items
We are beginning to see that there are constraints (here, the occurrence attributes minOccurs and maxOccurs) that can be applied to elements. The constraints available are context sensitive.

Figure 7-26. A Simple XML Document - XSD Part 2 (4 of 4)

Notes:
The specification of an element is really quite complicated. Unless you wish to become the expert on XML schema, you need only master a "reasonable" subset of all the possible choices. We will add detail later.
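
A small part of that "reasonable" subset is the pair of occurrence attributes. The sketch below shows the combinations we expect to need most often; the particular limits, such as at most three comments, are assumptions chosen for illustration and are not taken from po.xsd.

```xml
<xsd:sequence>
  <!-- No occurrence attributes: exactly once (minOccurs and maxOccurs both default to 1) -->
  <xsd:element ref="street"/>
  <!-- Optional: zero or one occurrence -->
  <xsd:element ref="street2" minOccurs="0"/>
  <!-- One or more, with no upper limit (Studio's guess for item) -->
  <xsd:element ref="item" minOccurs="1" maxOccurs="unbounded"/>
  <!-- Zero to three occurrences (hypothetical limit) -->
  <xsd:element ref="comment" minOccurs="0" maxOccurs="3"/>
</xsd:sequence>
```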


A Simple XML Document - XSD Part 3 (1 of 2)


<xsd:element name="items">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element maxOccurs="unbounded" minOccurs="1" ref="item"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:element name="name" type="xsd:string"/>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="purchaseOrder">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="shipTo"/>
      <xsd:element ref="billTo"/>
      <xsd:element ref="comment"/>
      <xsd:element ref="items"/>
    </xsd:sequence>
    <xsd:attribute name="orderDate" type="xsd:string" use="optional"/>
  </xsd:complexType>
</xsd:element>
<xsd:element name="quantity" type="xsd:string"/>
<xsd:element name="shipDate" type="xsd:string"/>

Figure 7-27. A Simple XML Document - XSD Part 3 (1 of 2)

Notes:
The first element is the last element from several pages ago. It wouldn't make sense to just copy the last line because we would lack the context it is in. By now you have noticed that element is used quite often in creating a schema. Hence the need to provide enough lines to be able to identify which element in the XML instance we mean. Notice, too, the relationship of these elements to the main element <xsd:schema...>


A Simple XML Document - XSD Part 3 (2 of 2)


The first thing to note is the definitions of the elements previously referenced with ref=
  name
  productName
Studio is now able to define the main (root) element purchaseOrder
Based on the purchase order instance, we expect to see the elements
  shipTo
  billTo
  comment
  items

We also expect to see the attribute orderDate
items, as we have seen, had its own definition.
The remainder of the schema provides the definitions of elements referred to previously
The last two lines define the elements
  quantity
  shipDate

Figure 7-28. A Simple XML Document - XSD Part 3 (2 of 2)

Notes:
We still need to validate the type choices. We will also want to arrange the definitions in an order more meaningful to us.


A Simple XML Document - XSD Part 4 (1 of 2)


<xsd:element name="shipDate" type="xsd:string"/>
<xsd:element name="shipTo">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="name"/>
      <xsd:element ref="street"/>
      <xsd:element ref="city"/>
      <xsd:element ref="state"/>
      <xsd:element ref="zip"/>
    </xsd:sequence>
    <xsd:attribute name="country" type="xsd:string" use="optional"/>
  </xsd:complexType>
</xsd:element>
<xsd:element name="state" type="xsd:string"/>
<xsd:element name="street" type="xsd:string"/>
<xsd:element name="street2" type="xsd:string"/>
<xsd:element name="zip" type="xsd:string"/>
</xsd:schema>

Figure 7-29. A Simple XML Document - XSD Part 4 (1 of 2)

Notes:
Again, the first line here is the last line from several pages ago.


A Simple XML Document - XSD Part 4 (2 of 2)


billTo was defined much earlier; shipTo has now also been defined
Studio again chose to use ref= for the actual definitions
  name
  street
  city
  state
  zip

name and city were previously defined
The remaining elements provide definitions/declarations for the remaining referenced elements
Studio uses string for all these definitions
Clearly, there is room for our inputs and those of our domain experts.
The schema file poBetter.xsd in Unit 7 of XM301 Lectures contains our views of how it should be organized and how the simple element types should be defined
  All the references are organized at the top of the file.
What we see are the benefits of our knowing the big picture.

Figure 7-30. A Simple XML Document - XSD Part 4 (2 of 2)

Notes:
Another way of looking at it is that we have the benefit of seeing the instance document in its totality: we can apply global optimization. If this were a huge instance document, we might be looking at only a small part of the total picture. In that case, a CASE tool might be able to do better than we can, except that we may have access to the thought process / specification used to create the document.
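
The kind of hand editing we have in mind can be sketched as follows. These declarations are our guess at sensible types given the purchase order instance; they are an assumption for illustration and are not a listing of poBetter.xsd.

```xml
<!-- A quantity is a whole number greater than zero -->
<xsd:element name="quantity" type="xsd:positiveInteger"/>

<!-- A ship date is a calendar date such as 2003-11-01 -->
<xsd:element name="shipDate" type="xsd:date"/>

<!-- Inside the purchaseOrder complex type: orderDate="2003-10-20"
     is a valid xsd:date literal -->
<xsd:attribute name="orderDate" type="xsd:date" use="optional"/>
```

With these types a validator rejects values such as quantity 0 or shipDate "next week", which the generated xsd:string declarations would accept.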


A Simple XML Document - Connecting the Schema to the Instance


We will explain the philosophy behind how we make .xml files aware of the schema we wish to apply to them in later charts.
Here are the mechanics of how we do this using Studio:
  Select the po.xml file in Navigator
  Right-click and select Assign from the context-sensitive menu that opens (it is near the bottom of the very long list)
  Select XML Schema. . . to open the New Namespace Information window shown below
  Since the XML instance used neither a namespace nor a prefix, leave these blank; browse down to po.xsd and select it
  Click OK and Studio will add the appropriate information to po.xml

Figure 7-31. A Simple XML Document - Connecting the Schema to the Instance

Notes:
You will find this process necessary should you wish to test your own files before we have a chance to describe the theory behind the assignment process. Studio -> Help Search "assign xml schema" will take you to the topic "Assigning an XML file to an XML schema." You will find complete instructions there; the key is to follow the directions literally: our XML file did not use a "namespace" so there is also no "prefix." It is easy to mistake the xsd prefix in the po.xsd file for the "Prefix" requested by the pop-up window. That is incorrect: the "Namespace" and "Prefix" described in the pop-up window refer to the po.xml file. Most of the time, your .xml file will use a namespace and a prefix; in our effort to create a "simple" example, we may have made it inadvertently complex! The additions inside the .xml file are xsi:noNamespaceSchemaLocation="po.xsd" and xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance", inserted into what was <purchaseOrder orderDate="2003-10-20">

The vertical line in the "Namespace Name:" fill-in is the cursor that was visible when the screenshot was made.
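
Put together, the root element of po.xml after the assignment would look roughly like the sketch below. The two xsi-related attributes and the orderDate value are the ones quoted in these notes; the attribute order and the elided children are our own simplification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<purchaseOrder orderDate="2003-10-20"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="po.xsd">
  <!-- shipTo, billTo, comment, and items are unchanged -->
</purchaseOrder>
```

When an instance does use a namespace, the xsi:schemaLocation attribute is used instead; its value is a list of pairs, each a namespace URI followed by the location of the schema for that namespace.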


What's Next?
That's it for an introduction to XML Schema via a simple example
There is still a lot of additional specificity to be added to the poBetter.xsd file before we would want to use it in the real world; for example
  How many street/street2 entries should there be?
  Can we define some expression to force telephone numbers to have an area code, an exchange, and a four digit number?
  . . .others???

Our next step is to introduce the syntax and nomenclature for the majority of constructs you will find useful in actually creating and using schemas in most common situations
Again, for emphasis, the W3C Web site is the only normative source
  There are two normative parts:
    http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/
    http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/
  And one non-normative primer:
    http://www.w3.org/TR/2001/REC-xmlschema-0-20010502/

Figure 7-32. What's Next?

Notes:
There are many ways of writing a valid schema. The primer on the W3C website provides a considerably different schema from the one we have shown you. The XSD language is as complex as any spoken or written language. ...and like any other language, it is not necessary to know everything about a language in order to use it to communicate. Most of the more advanced concepts will become part of your vocabulary through necessity and practice. Normative implies compliance is required; it defines usage, for example. Non-normative implies that, while the usage is correct not all possibilities are covered, for example. There is also the URL for the requirements we quoted at the beginning of this part.
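
The telephone number question on this chart is one we can already answer in outline with the pattern facet. The type name USPhoneNumber, the element name phone, and the ddd-ddd-dddd layout are assumptions for this sketch; the purchase order example contains no telephone element.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Area code, exchange, and four digit number, separated by hyphens.
       An XSD pattern must match the whole value, so no anchors are needed. -->
  <xsd:simpleType name="USPhoneNumber">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{3}-\d{3}-\d{4}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="phone" type="USPhoneNumber"/>
</xsd:schema>
```

A value such as 914-555-1234 satisfies the pattern; 555-1234 does not, because the area code is missing.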


. . .but first. . .

Break!

Figure 7-33. . . .but first. . .


Notes:


XML Schema Part II


Introduce the semantics of the more common constructs
  Including the more common options
Provide examples of how Studio will help
We will continue the refinement of the introductory example as a practical example of how to employ the power of a schema

Figure 7-34. XML Schema Part II

Notes:
This chart repeats some information stated at the beginning of the unit. Again we caution you against expecting detailed treatment of these topics. Do not confuse common with basic: some quite advanced concepts are part of many constructs. The XSD language is as extensive as the English language; even though you may know the difference between an appositive and a noun, you must realize the average person does not. Similarly, it is not necessary to master every nuance of the XSD language in order to create useful XSD documents. In creating these notes we make every effort to use the language of the W3C specification. Studio can be of inestimable value to us on our journey to mastering XSD documents, as you shall see both in the lecture and the accompanying lab. So relax, and let us proceed.

Before We Begin: Some Notes about Studio


We have already seen evidence of the capabilities of Studio with regard to XML documents
The same capabilities can be applied to XSD documents
  Plus additional capabilities are available
Example: open poBetter.xsd and po.xml
  Notice po.xml offers the design and source views
  poBetter.xsd offers the source and graph views
When viewing XSD documents with Studio what you see is context sensitive
  That is, it depends on the placement of your cursor
When adding new information it may be done in either view
  The menu format differs depending on which view you are in
  The result is the same. . .it becomes a matter of your preference
Studio will prompt you with a context-sensitive set of legitimate choices

Figure 7-35. Before We Begin: Some Notes about Studio

Notes:
We could spend hours trying different combinations in class. This is one of those instances where you are encouraged to explore on your own. Note, too, that some very complex concepts are part of many of the choices Studio suggests. We recommend you examine the examples provided here to see which constructs we typically need.


Complex Type Definitions, Element and Attribute Declarations


Two basic categories:
First category contains types that we must distinguish
  Complex types
    Allow elements
    May carry attributes
  Simple types
    Do not allow elements
    May not carry attributes

Second category that requires distinction
  Definitions that create new types
    Both simple and complex
  Declarations that enable elements and attributes to appear in document instances
    Both simple and complex

Figure 7-36. Complex Type Definitions, Element and Attribute Declarations

Notes:
It's not as confusing as it sounds. Definitions allow us to define our own types that may apply to our unique situations. For example, consider things that differ from country to country, such as phone numbers and postal codes: we can define a USPostalCode that demands ddddd-dddd, where d = digit, and we could even make the last four digits optional; and we could define a CNPostalCode that would accept only Canadian postal codes. Declarations, on the other hand, define what is legal in the document instances; as you can guess, definitions play a key role in defining the declarations to apply!
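
The postal code illustration above can be sketched directly in XSD. The type name USPostalCode comes from these notes; the exact pattern, with the hyphen and last four digits optional, is our reading of "ddddd-dddd" and should be treated as an assumption.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Definition: creates a new type. By itself it allows nothing
       to appear in an instance. -->
  <xsd:simpleType name="USPostalCode">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{5}(-\d{4})?"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Declaration: allows an element named zip to appear in an
       instance, with content governed by the type defined above. -->
  <xsd:element name="zip" type="USPostalCode"/>
</xsd:schema>
```

Both 12345 and 12345-6789 are valid zip values under this definition; 1234 and 12345-67 are not.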


Parts of XSD Speech (1 of 2)


Schema component is the generic term for the building blocks that comprise the abstract data model of the schema.
An XML Schema is a set of schema components.
There are 13 kinds of schema components, falling into three groups:
Primary components, which may (type definitions) or must (element declarations and attribute declarations) have names, are as follows:
  1.1 Simple type definitions
  1.2 Complex type definitions
  1.3 Element declarations
  1.4 Attribute declarations

Secondary components, which must have names, are as follows:
  2.1 Attribute group definitions
  2.2 Identity-constraint definitions
  2.3 Model group definitions
  2.4 Notation declarations

"Helper" components . . .continued. . .

Figure 7-37. Parts of XSD Speech (1 of 2)

Notes:
We will use this structure to introduce the details of the 13 kinds of schema component. This will be slightly confusing because different perspectives are best dealt with using different organizations. This is a classic example of the difficulty with a language: there is no beginning and no end -- only middle!


Parts of XSD Speech (2 of 2)


Helper components provide small parts of other components; they are not independent of their context:
  3.1 Annotations
  3.2 Model groups
  3.3 Particles
  3.4 Wildcards
  3.5 Attribute Uses

Beware: the term schema component is used in two ways. It may be treated like a reserved word and refer specifically to an element in a .xsd file that is a child of the schema element (what we have previously called the root element). In other contexts it may refer to any of the things that can appear in a schema document.

Figure 7-38. Parts of XSD Speech (2 of 2)

Notes:


Resetting Expectations
The XML Schema Specification represents several hundred pages
  Every one is important for some situation
  Not every one will be encountered in routine operations
This is an "Overview" course
  We will describe common usages here
  Additional, less common constructs are in the Appendix
  Refer to the three-part official specification for details.
The chart that follows is an example
  Only the 1st of the 3 charts is presented in this unit
  The remaining two specification-level charts have been moved to the Appendix

Figure 7-39. Resetting Expectations

Notes:


1.1 Simple Type (simpleType) Definition


Simple type definitions provide for constraining character information item [children] of element and attribute information items.
Example. The XML representation of a simple type definition:

<xsd:simpleType name="farenheitWaterTemp">
  <xsd:restriction base="xsd:decimal">
    <xsd:fractionDigits value="2"/>
    <xsd:minExclusive value="0.00"/>
    <xsd:maxExclusive value="100.01"/>
  </xsd:restriction>
</xsd:simpleType>

Figure 7-40. 1.1 Simple Type (simpleType) Definition

Notes:
For additional information: http://www.w3.org/TR/xmlschema-1/#Simple_Type_Definitions


All the Built-in Simple Types


Primitive Types
  string, boolean, decimal, float, double, duration, dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth, hexBinary, base64Binary, anyURI, QName, NOTATION
Derived Types (and base type)
  normalizedString (string), token (normalizedString), language (token), NMTOKEN (token), NMTOKENS (list of NMTOKEN), Name (token), NCName (Name), ID (NCName), IDREF (NCName), IDREFS (list of IDREF), ENTITY (NCName), ENTITIES (list of ENTITY), integer (decimal), nonPositiveInteger (integer), negativeInteger (nonPositiveInteger), long (integer), int (long), short (int), byte (short), nonNegativeInteger (integer), unsignedLong (nonNegativeInteger), unsignedInt (unsignedLong), unsignedShort (unsignedInt), unsignedByte (unsignedShort), positiveInteger (nonNegativeInteger)
There is another list in the first part of this lecture.

Figure 7-41. All the Built-in Simple Types

Notes:
This foil shows all the built-in datatypes in XML Schema. They are divided into two categories: primitive types, and types which are derived from those primitive types. Let's look at the primitive types first. The blue types are the types required for the XML 1.0 specification. Numeric types are shown in red, and time/date types are shown in green. In the first row we have string and three numeric types. Decimal is an arbitrary-precision decimal number. Float and double correspond to the IEEE floating point types of the same name. In the next row there is the boolean type, followed by duration and the date and time types (called timeDuration and recurringDuration in earlier drafts of the specification). The third row covers types from XML 1.0 as well as the QName type from the XML Namespaces recommendation. In the fourth row, we see anyURI (formerly uriReference), which corresponds to URIs, and hexBinary and base64Binary (formerly binary), which correspond to encoded binary data.

Now let's look at the derived types. The first two rows of the table show types from XML 1.0 and the Namespaces recommendation. In the third row, token is a tokenized string, and language must be a language identifier string as defined in the XML 1.0 (Second Edition) Recommendation. The next four rows include a variety of convenient numeric types that are restrictions of one of the primitive numeric types. Likewise, the next three rows are convenient date and time types which are restrictions of the primitive date/time types.


Creating New Simple Types


New simple types may be defined from built-in types, adding constraints via facets.
Accomplished by:
- Restriction
- Extension
Examples:
- A string that has a minimum and maximum length.
- An integer that has minimum and maximum values.
- A string with an enumerated list of allowed values.
- A type based on patterns.

Figure 7-42. Creating New Simple Types

Notes:
New simple types can be created by constraining an existing simple type.
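As a minimal sketch of two of the examples listed on the foil (a length-limited string and a pattern-based type), the following schema may help. The type and element names (userNameType, zipCodeType, userName, zipCode) and the chosen limits are assumptions made for illustration; they do not come from the course material.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: a string with a minimum and maximum length -->
  <xsd:simpleType name="userNameType">
    <xsd:restriction base="xsd:string">
      <xsd:minLength value="3"/>
      <xsd:maxLength value="12"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Hypothetical type: a string that must match a pattern (five digits) -->
  <xsd:simpleType name="zipCodeType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{5}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="userName" type="userNameType"/>
  <xsd:element name="zipCode" type="zipCodeType"/>

  <!-- Valid:   <userName>alice</userName>   <zipCode>27709</zipCode>  -->
  <!-- Invalid: <userName>al</userName>      (shorter than minLength)  -->
  <!-- Invalid: <zipCode>2770</zipCode>      (does not match pattern)  -->

</xsd:schema>
```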


Facets
Facets characterize properties of a simple type's value space or lexical space.
- Value Space: abstract set of values in the type
- Lexical Space: concrete set of literals that you can write down
Fundamental Facets
- Equal, ordered, bounded, cardinality, numeric
Constraining (non-fundamental) Facets
- length, minLength, maxLength
- pattern, enumeration
- whiteSpace
- maxInclusive, maxExclusive, minInclusive, minExclusive
- totalDigits, fractionDigits

Figure 7-43. Facets

Notes:
Facets characterize properties of a simple type's value space or lexical space. The value space of a simple type contains the values that the type represents (for example, the set of rationals between 0 and 1 is a value space). Simple types also have a lexical space, which is the set of literals that make up the written/printed representation of the type (for example, base 16 notation). Facets are like attributes on a type. They always exist for a type but may not have been set to any specific value. The facets, which can be set as part of the definition of a type, are listed below.

Fundamental Facets (these facets are abstract and are there for completeness in modeling datatypes; their values cannot be changed):
equal, ordered, bounded, cardinality, numeric

Constraining Facets (you can specify values for these facets to constrain an existing type):


- length, minLength, maxLength (these apply to types where the notion of length is meaningful, such as string and the binary types)
- pattern, enumeration (these apply to almost all data types; pattern constrains the lexical space)
- whiteSpace (controls whitespace processing behavior; it can only be changed for string and types derived from it)
- maxInclusive, maxExclusive, minInclusive, minExclusive (these apply to all the numeric and date/time types)
- totalDigits, fractionDigits (these apply to decimal and its derived types; totalDigits specifies the total number of digits, and fractionDigits specifies the number of digits in the fractional part)
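The digit and whitespace facets described above can be sketched as follows. The names priceType, codeType, price, and code, and the specific facet values, are hypothetical and chosen only to illustrate the facets.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: at most 6 digits in total, at most 2 after the decimal point -->
  <xsd:simpleType name="priceType">
    <xsd:restriction base="xsd:decimal">
      <xsd:totalDigits value="6"/>
      <xsd:fractionDigits value="2"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Hypothetical type: leading/trailing whitespace is trimmed and inner runs are collapsed -->
  <xsd:simpleType name="codeType">
    <xsd:restriction base="xsd:string">
      <xsd:whiteSpace value="collapse"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="price" type="priceType"/>
  <xsd:element name="code" type="codeType"/>

  <!-- Valid:   <price>1234.56</price>                                -->
  <!-- Invalid: <price>1234.567</price>  (three fraction digits)      -->
  <!-- Invalid: <price>12345.67</price>  (seven digits in total)      -->

</xsd:schema>
```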


Example: Constraining Facets


Declaration:
<xsd:simpleType name="quantityType">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="2"/>
    <xsd:maxInclusive value="5"/>
  </xsd:restriction>
</xsd:simpleType>
<xsd:element name="quantity" type="quantityType"/>

Notice the separate declaration of the new simpleType named quantityType.

Valid XML fragment:


<quantity>4</quantity>

Invalid XML fragments:


<quantity>-6</quantity>
<quantity>many</quantity>

The new type quantityType is now used to define the type for the element quantity

Figure 7-44. Example: Constraining Facets

Notes:
In this schema, we show a <simpleType> tag in addition to an <element> tag. This tag defines a new simple type called 'quantityType'. 'quantityType' is a restriction of the built-in type 'integer'. Integer is the value of the base attribute on the <restriction> sub-tag of the <simpleType> tag. The base attribute specifies the base type for the restriction. Inside the <restriction> tag are the tags <minInclusive> and <maxInclusive>. This means that 'quantityType' is giving new values for the minInclusive and maxInclusive facets of 'integer': a 'quantityType' value can range from a minimum of 2 up to a maximum of 5. The simple type being restricted can either be a built-in type or a previously defined simple type. The <element> tag declares a new element called 'quantity' whose content must match the rules for 'quantityType'.
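The notes mention that the base of a restriction can itself be a previously defined simple type. A minimal sketch of that case follows; quantityType is taken from the foil, while smallQuantityType and sampleQuantity are hypothetical names introduced here.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- From the foil: integers from 2 to 5 -->
  <xsd:simpleType name="quantityType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="2"/>
      <xsd:maxInclusive value="5"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Hypothetical: restricts quantityType further, leaving only 2 and 3.
       The minimum of 2 is inherited from the base type. -->
  <xsd:simpleType name="smallQuantityType">
    <xsd:restriction base="quantityType">
      <xsd:maxInclusive value="3"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="sampleQuantity" type="smallQuantityType"/>

</xsd:schema>
```

A restriction can only narrow its base, so a maxInclusive above 5 would not be allowed in smallQuantityType.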


Example: Enumeration Facet


<xsd:element name="color">
  <xsd:simpleType>
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="red"/>
      <xsd:enumeration value="green"/>
      <xsd:enumeration value="blue"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>

Valid XML:
<color>red</color>

Invalid XML:
<color>mauve</color>
<color>10</color>

Figure 7-45. Example: Enumeration Facet

Notes:
Nearly all simple types have the enumeration facet (boolean is the notable exception). The enumeration facet allows the definer of a simple type to write out exactly which values of the base type are allowed in the restriction. In this example, the restriction has enumerated a set of color names. The value attribute of the enumeration tag specifies the value to be included in the enumeration.


simpleContent and Empty Complex Types


Simple: declare using <xsd:simpleContent> and extension/restriction:
<xsd:complexType name="simpleContentType">
  <xsd:simpleContent>
    <xsd:extension base="xsd:...">
      ...
    </xsd:extension>
  </xsd:simpleContent>
</xsd:complexType>

Empty: declare by just writing attributes:
<xsd:complexType name="emptyType">
  <xsd:attribute name="flag" type="xsd:boolean"/>
</xsd:complexType>

Figure 7-46. simpleContent and Empty Complex Types

Notes:
A complex type needs to indicate what kind of content it is going to contain. This description is called a content model. The content model only describes the content of the complexType, but the complex type may include other information such as attribute definitions. There are four kinds of content models available in XML Schema: simple, empty, element-only, and mixed. This foil covers the first two, along with the method for specifying them in a complexType definition. The simple content model contains typed character data. It is always specified by extension or restriction of a base type, most commonly by extending a simple type. The empty content model means that no content at all is allowed. Usually elements with an empty content model carry their data as attributes.


simpleContent Example
Declaration:
<xsd:element name="quantity">
  <xsd:complexType>
    <xsd:simpleContent>
      <xsd:extension base="xsd:nonNegativeInteger">
        <xsd:attribute name="backorderable" type="xsd:boolean" default="false"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:element>

Valid XML fragment:
<quantity backorderable="true">1</quantity>

Invalid XML fragments:
<quantity orderable="true">2</quantity>

The extension is the addition of the attribute.

Figure 7-47. simpleContent Example

Notes:
This visual shows the definition of a complex type. It only uses the most basic features of complex types. The schema declares an element called quantity. The definition of the complex type for quantity is embedded in the declaration of quantity. In this case, the complex type has the simpleContent content model, which says that the complex type only allows character data to be present in the element content. That simpleContent is based on the simple type 'nonNegativeInteger', and is going to be extended, which for simple content means adding attributes. We could instead use a restriction, which narrows the type (for example, by setting facets). Here the simpleContent content model is extending 'nonNegativeInteger' by adding an attribute named 'backorderable'. There is no other extension being performed in this example. In the coming visuals we are going to introduce some concepts that we need in order to show more complicated complex type definitions.
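A few more instance fragments may make the behavior of this declaration clearer. The schema below repeats the declaration from the foil; the instance fragments in the comments are additional illustrations, not part of the original foil.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="quantity">
    <xsd:complexType>
      <xsd:simpleContent>
        <!-- The character content must be a nonNegativeInteger -->
        <xsd:extension base="xsd:nonNegativeInteger">
          <!-- The extension adds one optional attribute with a default -->
          <xsd:attribute name="backorderable" type="xsd:boolean" default="false"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>

  <!-- Valid:   <quantity>3</quantity>
                (attribute omitted, so backorderable takes its default, false) -->
  <!-- Invalid: <quantity backorderable="maybe">3</quantity>
                (attribute value is not a boolean) -->
  <!-- Invalid: <quantity backorderable="true">many</quantity>
                (content is not a nonNegativeInteger) -->

</xsd:schema>
```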


1.2 Complex Type Definition (1 of 2)


Complex Type Definitions provide for:
- Constraining element information items by providing Attribute Declarations (2.2.2.3) governing the appearance and content of [attributes].
- Constraining element information item [children] to be empty, or to conform to a specified element-only or mixed content model, or else constraining the character information item [children] to conform to a specified simple type definition.
- Using the mechanisms of Type Definition Hierarchy (2.2.1.1) to derive a complex type from another simple or complex type.
- Specifying post-schema-validation infoset contributions for elements.
- Limiting the ability to derive additional types from a given complex type.
- Controlling the permission to substitute, in an instance, elements of a derived type for elements declared in a content model to be of a given complex type.
An example follows:

Figure 7-48. 1.2 Complex Type Definition (1 of 2)

Notes:
For additional information: http://www.w3.org/TR/xmlschema-1/#Complex_Type_Definitions
As you can see, complex type definitions provide a primary means of controlling the content of XML instances.


1.2 Complex Type Definition (2 of 2)


Example. The XML representation of a complex type definition.
<xsd:complexType name="PurchaseOrderType">
  <xsd:sequence>
    <xsd:element name="shipTo" type="USAddress"/>
    <xsd:element name="billTo" type="USAddress"/>
    <xsd:element ref="comment" minOccurs="0"/>
    <xsd:element name="items" type="Items"/>
  </xsd:sequence>
  <xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>

The example above includes a name (PurchaseOrderType); it is also possible to define an anonymous complexType by dropping the "name='PurchaseOrderType'" and including the result as a child of an element:
<xsd:element name="PurchaseOrderType">
  ...
</xsd:element>
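One way the anonymous form might look in full is sketched below. The element name purchaseOrder and the placeholder definitions of USAddress, Items, and comment are assumptions added only so that the sketch is self-contained; the course material does not define them here.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Placeholder definitions (assumptions) so that the references below resolve -->
  <xsd:complexType name="USAddress">
    <xsd:sequence>
      <xsd:element name="street" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="Items">
    <xsd:sequence>
      <xsd:element name="item" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="comment" type="xsd:string"/>

  <!-- Anonymous form: the complexType has no name attribute and is nested
       inside the element declaration that uses it -->
  <xsd:element name="purchaseOrder">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="shipTo" type="USAddress"/>
        <xsd:element name="billTo" type="USAddress"/>
        <xsd:element ref="comment" minOccurs="0"/>
        <xsd:element name="items" type="Items"/>
      </xsd:sequence>
      <xsd:attribute name="orderDate" type="xsd:date"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```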

Figure 7-49. 1.2 Complex Type Definition (2 of 2)

Notes:


Named versus Anonymous Types (1 of 2)


When the definition of a new type (complex or simple) is nested inside an element declaration, it is called an anonymous type. These are also known as inline definitions.
<xsd:element name='color'>
  <xsd:simpleType>
    <xsd:restriction base='xsd:string'>
      <xsd:enumeration value='red'/>
      <xsd:enumeration value='green'/>
      <xsd:enumeration value='blue'/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>

This type has no 'name'. It is Anonymous!

Anonymous types are not reusable in other parts of the Schema.

Figure 7-50. Named versus Anonymous Types (1 of 2)

Notes:
We're now ready to officially look at element declarations. There are two kinds of declaration to compare. Element declaration with a simple type that occurs exactly once: an element called quantity whose type is the simple type nonNegativeInteger and which is allowed to appear exactly once. We'll talk more about the minOccurs and maxOccurs attributes in a few minutes. Element declaration with a local simpleType definition: the declaration on this foil is for an element called color whose type is a local simple type with the names of three colors as its values.


Named versus Anonymous Types (2 of 2)


This accomplishes the same thing but it's reusable.
<xsd:simpleType name="colorType">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="red"/>
    <xsd:enumeration value="green"/>
    <xsd:enumeration value="blue"/>
  </xsd:restriction>
</xsd:simpleType>
<xsd:element name="color" type="colorType"/>

Named types ARE reusable in other parts of the Schema, provided that they are declared as direct children of the schema element.
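To illustrate the reuse, here is a minimal sketch in which the named type colorType is referenced from several places. The element names backgroundColor and pen and the attribute name ink are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Named type, declared as a direct child of the schema element -->
  <xsd:simpleType name="colorType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="red"/>
      <xsd:enumeration value="green"/>
      <xsd:enumeration value="blue"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- The same type is reused by two elements ... -->
  <xsd:element name="color" type="colorType"/>
  <xsd:element name="backgroundColor" type="colorType"/>

  <!-- ... and by an attribute, since attribute types must be simple types -->
  <xsd:element name="pen">
    <xsd:complexType>
      <xsd:attribute name="ink" type="colorType"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```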

Figure 7-51. Named versus Anonymous Types (2 of 2)

Notes:


Declaring Child Elements in complexType Elements


Child elements are declared by using a 'compositor' in an element declaration.
<xsd:element name=...>
  <xsd:complexType>
    <xsd:someCompositor>
      <xsd:element.../>
      <xsd:element.../>
    </xsd:someCompositor>
  </xsd:complexType>
</xsd:element>

An element that declares a complexType inline cannot also have a type attribute (such as type='typeName'); the nested complexType element is required instead. More on this in a moment.

There are three different compositors:
- xsd:sequence: the elements may occur only in the order specified, according to their min/maxOccurs values
- xsd:choice: only one of the elements declared may occur, but its min/maxOccurs values dictate how many times it may occur (very hard to do in a DTD)
- xsd:all: the elements must all occur (in accordance with their min/maxOccurs values) but order doesn't matter (also very hard to do in a DTD)

Figure 7-52. Declaring Child Elements in complexType Elements

Notes:
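A concrete sketch of the xsd:choice and xsd:all compositors described on the foil follows. The element names (contact, phone, email, settings, font, size) are hypothetical and serve only to show the effect of each compositor.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- xsd:choice: exactly one of phone or email may appear -->
  <xsd:element name="contact">
    <xsd:complexType>
      <xsd:choice>
        <xsd:element name="phone" type="xsd:string"/>
        <xsd:element name="email" type="xsd:string"/>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>

  <!-- xsd:all: font and size must both appear, in either order -->
  <xsd:element name="settings">
    <xsd:complexType>
      <xsd:all>
        <xsd:element name="font" type="xsd:string"/>
        <xsd:element name="size" type="xsd:positiveInteger"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>

  <!-- Valid:   <contact><email>a@example.com</email></contact>            -->
  <!-- Invalid: <contact><phone>555</phone><email>a@example.com</email></contact> -->
  <!-- Valid:   <settings><size>12</size><font>Courier</font></settings>   -->
  <!-- Invalid: <settings><font>Courier</font></settings>  (size missing)  -->

</xsd:schema>
```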


Element Declaration: Common Usage


Use this declaration format for situations where a type definition exists (built-in or otherwise) that suits the nature of the element you are declaring. Here's a declaration for an element with a predefined type. It is declared to occur exactly once:
<xsd:element name='quantity' type='xsd:nonNegativeInteger' minOccurs='1' maxOccurs='1'/>

minOccurs and maxOccurs are used to indicate the number of elements required/permitted. Note: The typename must be namespace qualified. If the type is not associated with the namespace of the schema element it must be qualified with a namespace prefix representing the proper namespace.

Figure 7-53. Element Declaration: Common Usage

Notes:
The declaration is for an element called quantity whose type is the simple type nonNegativeInteger and which is allowed to appear exactly once. We'll talk more about the minOccurs and maxOccurs attributes in a few minutes. If we haven't already so stated, the xsd: prefix is bound, through a namespace declaration on the schema element, to the W3C 2001 XML Schema namespace. That binding tells the processor that names such as xsd:element and xsd:nonNegativeInteger come from the XML Schema specification, which makes that specification normative for the purposes of validation. Remember, the schema element is the root or main element of a schema document.
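The namespace qualification of type names can be sketched as follows. The target namespace URI http://example.com/orders, the prefix ord, and the names orderQuantityType and orderQuantity are assumptions made for illustration only.

```xml
<!-- xsd: is bound to the XML Schema namespace; ord: is bound to this
     schema's own (hypothetical) target namespace -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ord="http://example.com/orders"
            targetNamespace="http://example.com/orders">

  <!-- A user-defined type; it belongs to the target namespace -->
  <xsd:simpleType name="orderQuantityType">
    <xsd:restriction base="xsd:nonNegativeInteger">
      <xsd:maxInclusive value="100"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- A built-in type is qualified with the xsd: prefix -->
  <xsd:element name="quantity" type="xsd:nonNegativeInteger"/>

  <!-- A user-defined type is qualified with the prefix of its own namespace -->
  <xsd:element name="orderQuantity" type="ord:orderQuantityType"/>

</xsd:schema>
```

Note that minOccurs and maxOccurs are omitted here because they are not allowed on element declarations that are direct children of the schema element.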


minOccurs and maxOccurs


The minOccurs attribute specifies the minimum number of times that a component may occur.
The maxOccurs attribute specifies the maximum number of times that a component may occur.

Components that can have a minOccurs or maxOccurs attribute are:
- Elements.
- Groups.
- xsd:all, xsd:sequence, and xsd:choice compositors.
- Wildcards.
Both values default to 1.
It is an error if minOccurs is greater than maxOccurs; that is, maxOccurs must always be greater than or equal to minOccurs.
maxOccurs may have the non-numeric value unbounded (no upper limit).

Figure 7-54. minOccurs and maxOccurs

Notes:
XML Schema provides the minOccurs and maxOccurs attributes as a way of controlling the number of times that a particular component may appear. The minOccurs attribute specifies the minimum number of times that a component may occur. If the value of minOccurs is greater than zero, then the component is required. If the value of minOccurs is zero, then the component is optional. The default value for minOccurs is 1. The maxOccurs attribute specifies the maximum number of times that a component may occur. The special value 'unbounded' means that the component may appear an unlimited number of times. Combined with a minOccurs of zero, this corresponds to the use of the asterisk in DTDs; combined with a minOccurs of 1, it corresponds to the plus sign. The default value for maxOccurs is 1. minOccurs and maxOccurs apply only to the component that has the minOccurs or maxOccurs attribute.
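The correspondence with the DTD occurrence indicators can be sketched as follows. The element names (order, customer, note, coupon, item) are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="order">
    <xsd:complexType>
      <xsd:sequence>
        <!-- DTD "customer": exactly once (minOccurs and maxOccurs both default to 1) -->
        <xsd:element name="customer" type="xsd:string"/>
        <!-- DTD "note?": zero or one -->
        <xsd:element name="note" type="xsd:string" minOccurs="0"/>
        <!-- DTD "coupon*": zero or more -->
        <xsd:element name="coupon" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
        <!-- DTD "item+": one or more -->
        <xsd:element name="item" type="xsd:string" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Specific ranges, such as minOccurs='2' with maxOccurs='500', have no compact DTD equivalent.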


Example: minOccurs and maxOccurs


<xsd:element name='DNASample'>
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name='sample' type='dnaType' minOccurs='2' maxOccurs='500'/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
...

A definition for dnaType is assumed to exist in this schema

Valid XML:
<DNASample>
  <sample>GATCTATC</sample>
  <sample>ATAAACG</sample>
</DNASample>

Invalid XML:
<DNASample>
  <sample>ATGCAAT</sample>
</DNASample>

Figure 7-55. Example: minOccurs and maxOccurs

Notes:
Here we define a version of DNASample that can take between 2 and 500 samples. We use minOccurs and maxOccurs to enforce this rule.
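The foil assumes that a definition for dnaType exists elsewhere in the schema. One possible definition is sketched below; the pattern is an assumption (the course material does not show the actual definition) and simply allows one or more of the letters A, C, G, and T.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical definition of dnaType: a non-empty string of bases -->
  <xsd:simpleType name="dnaType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[ACGT]+"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Declaration from the foil: between 2 and 500 samples -->
  <xsd:element name="DNASample">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="sample" type="dnaType" minOccurs="2" maxOccurs="500"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

With this definition, both samples in the valid fragment on the foil (GATCTATC and ATAAACG) match the pattern, while the invalid fragment still fails because it contains only one sample.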


1.4 Attribute Declaration


Attribute declarations can appear at the top level of a schema document, or within complex type definitions, either as complete (local) declarations, or by reference to top-level declarations, or within attribute group definitions (TBD). For complete declarations, top-level or local, the type attribute is used when the declaration can use a built-in or pre-declared simple type definition. Otherwise an anonymous <simpleType> is provided inline.

This is the general form:
<attribute
  default = string
  fixed = string
  form = (qualified | unqualified)
  id = ID
  name = NCName
  ref = QName
  type = QName
  use = (optional | prohibited | required) : optional
  {any attributes with non-schema namespace . . .}>
  Content: (annotation?, (simpleType?))
</attribute>

Figure 7-56. 1.4 Attribute Declaration

Notes:


Declaring Attributes
Attributes are declared as children of a complexType element.
<xsd:element name=...>
  <xsd:complexType>
    <xsd:attribute name='attName' type='aSimpleType' fixed|default='value' use='...'/>
  </xsd:complexType>
</xsd:element>

Some useful (optional) attributes used in an xsd:attribute declaration are:
- use: values can be required, prohibited, and optional; optional is the default
- default: provides a default value for the attribute when it is absent
- fixed: fixes the attribute value to the value specified
Some rules:
- The attribute type must be a simpleType (that is, non-element).
- fixed and default may not be present together on xsd:attribute.
- The use attribute must be "optional" or absent when a default is provided.

Figure 7-57. Declaring Attributes

Notes:
Attributes are declared by placing an <attribute> element inside of the <complexType> element. The name attribute specifies the name of the attribute, the type attribute specifies its type. The use attribute tells whether the attribute is 'optional' or 'required'. The default attribute tells what the default value will be if the attribute is omitted in the XML document. Here are examples of common attribute declarations. Optional attribute with default. The first declaration shows how to declare an optional attribute whose value must be an integer. If the attribute does not appear in the instance document, it will be given a default value of 10.
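The optional attribute with a default that the notes describe might be declared along these lines. The element name widget and the attribute name size are hypothetical; only the integer type and the default value of 10 come from the notes.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="widget">
    <xsd:complexType>
      <!-- use is omitted, so it takes its default value, optional -->
      <xsd:attribute name="size" type="xsd:integer" default="10"/>
    </xsd:complexType>
  </xsd:element>

  <!-- Valid:   <widget size="3"/>                                   -->
  <!-- Valid:   <widget/>           (size is treated as 10)          -->
  <!-- Invalid: <widget size="big"/> (value is not an integer)       -->

</xsd:schema>
```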


Example: Attribute Declaration


<xsd:element name='blank'>
  <xsd:complexType>
    <xsd:attribute name='temperature' type='xsd:decimal' fixed='32.0'/>
  </xsd:complexType>
</xsd:element>

Valid XML:
<blank temperature='32.0'/> (preferred)
<blank/> <!-- temperature='32.0' -->

Invalid XML:
<blank temperature='34.0'/>

Figure 7-58. Example: Attribute Declaration

Notes:
Optional fixed attribute (using default value for use). The fixed attribute specifies the fixed value that the attribute must have if a value is specified by a document. This declaration shows how to declare an optional attribute whose value must be a decimal. If the attribute does appear in the instance document, it must have the value 32.0.


Example: An Element with Attributes (1 of 2)


Element Declaration for an element with subelements and attribute.

<xsd:element name="person">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="id" type="xsd:integer"/>
    </xsd:sequence>
    <xsd:attribute name="criminal" type="xsd:boolean" default="false"/>
  </xsd:complexType>
</xsd:element>

"boolean" can be true/false (case sensitive) or 1/0.

Figure 7-59. Example: An Element with Attributes (1 of 2)

Notes:
In this example we declare an element named 'person' whose complex type is defined inline (anonymously). The type has the element-only content model, as shown by the use of the <sequence> compositor inside the <complexType> element. We also declare a single attribute named 'criminal' as part of the type. The same type could instead be given a name such as 'personType', and the element 'person' declared with the type 'personType'.
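A sketch of the named-type variant follows. The content is the same as the declaration on the foil; only the arrangement (a named personType referenced by the person element) differs.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Named complex type with element-only content and one attribute -->
  <xsd:complexType name="personType">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="id" type="xsd:integer"/>
    </xsd:sequence>
    <xsd:attribute name="criminal" type="xsd:boolean" default="false"/>
  </xsd:complexType>

  <!-- The element now refers to the type by name -->
  <xsd:element name="person" type="personType"/>

</xsd:schema>
```

The named form validates the same instance documents as the anonymous form, and personType can additionally be reused by other element declarations.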


Example: An Element with Attributes (2 of 2)


Valid XML fragments:
<person criminal="true">
  <name>John</name>
  <id>42</id>
</person>
<person>
  <name>Susan</name>
  <id>27</id>
</person>

Invalid XML fragments:


<person>
  <id>27</id>
  <name>Susan</name>
</person>
<person friend="true">
  <name>John</name>
  <id>42</id>
</person>

Figure 7-60. Example: An Element with Attributes (2 of 2)

Notes:
The first invalid example is invalid because <id> and <name> are in the wrong order. The second invalid example is invalid because there's no attribute called 'friend'.


2.1 Attribute Group Definitions


A schema can name a group of attribute declarations so that they may be incorporated as a group into complex type definitions. Attribute group definitions do not participate in validation as such, but the {attribute uses} and {attribute wildcard} of one or more complex type definitions may be constructed in whole or part by reference to an attribute group. Thus, attribute group definitions provide a replacement for some uses of XML's parameter entity facility. Attribute group definitions are provided primarily for reference from the XML representation of schema components (see <complexType> and <attributeGroup>). Example, next page. . .

Figure 7-61. 2.1 Attribute Group Definitions

Notes:


Anonymous Types in Attribute Declarations


Anonymous types can also be used in attribute declarations.
<xsd:element name="blank">
  <xsd:complexType>
    <xsd:attribute name="count" use="required">
      <xsd:simpleType>
        <xsd:restriction base="xsd:integer">
          <xsd:minInclusive value="0"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:attribute>
  </xsd:complexType>
</xsd:element>

Valid XML:
<blank count="1"/>

Invalid XML:
<blank/>
<blank count="-1"/>


This type has no 'name'. It is Anonymous!

Figure 7-62. Anonymous Types in Attribute Declarations

Notes:
Required attribute with local type definition. There is no type attribute for this <attribute> element because the simple type definition is embedded in the attribute declaration. This declaration shows how to declare a required attribute. This declaration also includes an embedded simple type definition that specifies a restriction of the integers to integers greater than or equal to zero.


2.1 Attribute Group Definitions


Example:
<xs:attributeGroup name="myAttrGroup">
  <xs:attribute . . ./>
  ...
</xs:attributeGroup>
<xs:complexType name="myelement">
  ...
  <xs:attributeGroup ref="myAttrGroup"/>
</xs:complexType>

XML representations for attribute group definitions. The effect is as if the attribute declarations in the group were present in the type definition.

Figure 7-63. 2.1 Attribute Group Definitions

Notes:


Attribute Groups
If a group of attributes are used together often, an attribute group can be created to formalize the relationship and avoid the need to declare the same attributes in several places.
<xsd:attributeGroup name="addressInfo">
  <xsd:attribute name="street" type="xsd:string" use="required"/>
  <xsd:attribute name="city" type="xsd:string" use="required"/>
  <xsd:attribute name="state" type="xsd:string" use="required"/>
  <xsd:attribute name="zip" type="xsd:string" use="required"/>
</xsd:attributeGroup>

<xsd:element name="mailingAddress">
  <xsd:complexType>
    <xsd:attributeGroup ref="addressInfo"/>
  </xsd:complexType>
</xsd:element>

<xsd:element name='homeAddress'>
  <xsd:complexType>
    <xsd:attributeGroup ref="addressInfo"/>
  </xsd:complexType>
</xsd:element>

Figure 7-64. Attribute Groups

Notes:
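As a sketch of how an attribute group combines with locally declared attributes, the schema below references the addressInfo group from the foil and adds one extra attribute. The attribute name preferred and the instance values in the comments are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:attributeGroup name="addressInfo">
    <xsd:attribute name="street" type="xsd:string" use="required"/>
    <xsd:attribute name="city" type="xsd:string" use="required"/>
    <xsd:attribute name="state" type="xsd:string" use="required"/>
    <xsd:attribute name="zip" type="xsd:string" use="required"/>
  </xsd:attributeGroup>

  <xsd:element name="mailingAddress">
    <xsd:complexType>
      <!-- All four attributes of the group are pulled in by reference ... -->
      <xsd:attributeGroup ref="addressInfo"/>
      <!-- ... and a local (hypothetical) attribute is declared alongside them -->
      <xsd:attribute name="preferred" type="xsd:boolean" default="false"/>
    </xsd:complexType>
  </xsd:element>

  <!-- Valid:   <mailingAddress street="1 Main St" city="Durham"
                                state="NC" zip="27709"/>             -->
  <!-- Invalid: <mailingAddress street="1 Main St" city="Durham"/>
                (required attributes state and zip are missing)      -->

</xsd:schema>
```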


2.3 Model Group Definitions (1 of 2)


A model group definition associates a name and optional annotations with a Model Group (2.2.3.1). By reference to the name, the entire model group can be incorporated by reference into a {term}. Model group definitions are provided primarily for reference from the XML Representation of Complex Type Definitions (3.4.2) (see <complexType> and <group>). Thus, model group definitions provide a replacement for some uses of XML's parameter entity facility. Example (next page). . .

Figure 7-65. 2.3 Model Group Definitions (1 of 2)

Notes:


2.3 Model Group Definitions (2 of 2)


Example
<xs:group name="myModelGroup">
  <xs:sequence>
    <xs:element ref="someThing"/>
    ...
  </xs:sequence>
</xs:group>

<xs:complexType name="trivial">
  <xs:group ref="myModelGroup"/>
  <xs:attribute .../>
</xs:complexType>

<xs:complexType name="moreSo">
  <xs:choice>
    <xs:element ref="anotherThing"/>
    <xs:group ref="myModelGroup"/>
  </xs:choice>
  <xs:attribute .../>
</xs:complexType>

A minimal model group is defined and used by reference, first as the whole content model, then as one alternative in a choice.

Figure 7-66. 2.3 Model Group Definitions (2 of 2)

Notes:


2.4 Notation Declarations (1 of 2)


Notation declarations reconstruct XML 1.0 NOTATION declarations.
Example. The XML representation of a notation declaration.
<xs:notation name="jpeg" public="image/jpeg" system="viewer.exe"/>

A second, longer example follows. . .

Figure 7-67. 2.4 Notation Declarations (1 of 2)

Notes:


2.4 Notation Declarations (2 of 2)


Example 2.
<xsd:notation name="jpeg" public="image/jpeg" system="viewer.exe" />
<xsd:element name="picture">
  <xsd:complexType>
    <xsd:simpleContent>
      <xsd:extension base="xsd:hexBinary">
        <xsd:attribute name="pictype">
          <xsd:simpleType>
            <xsd:restriction base="xsd:NOTATION">
              <xsd:enumeration value="jpeg"/>
              <xsd:enumeration value="png"/>
              ...
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:attribute>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:element>

<picture pictype="jpeg">...</picture>

Figure 7-68. 2.4 Notation Declarations (2 of 2)

Notes:


3.1 Annotations
Annotations provide for human- and machine-targeted annotations of schema components. Example. XML representations of three kinds of annotation:
<xsd:simpleType fn:note="special">
  <xsd:annotation>
    <xsd:documentation>A type for experts only</xsd:documentation>
    <xsd:appinfo>
      <fn:specialHandling>checkForPrimes</fn:specialHandling>
    </xsd:appinfo>
  </xsd:annotation>
  ...
</xsd:simpleType>

Using the wizard in Studio can make using annotation much more user-friendly.

Copyright IBM Corporation 2004

Figure 7-69. 3.1 Annotations


3.2 Model Groups (1 of 2)


When the children of element information items are not constrained to be empty or by reference to a simple type definition, the sequence of element information item [children] content may be specified in more detail with a model group. Because the term property of a particle can be a model group, and model groups contain particles, model groups can indirectly contain other model groups; the grammar for content models is therefore recursive.

A model group is a constraint in the form of a grammar fragment that applies to lists of element information items. It consists of a list of particles, that is, element declarations, wildcards and model groups.

There are three varieties of model group:
- Sequence (the element information items match the particles in sequential order)
- Conjunction (the element information items match the particles, in any order)
- Disjunction (the element information items match one of the particles)

Figure 7-70. 3.2 Model Groups (1 of 2)


3.2 Model Groups (2 of 2)


Example. XML representations for the three kinds of model group (all, sequence, choice), the third nested inside the second.
<xsd:all>
  <xsd:element ref="cats"/>
  <xsd:element ref="dogs"/>
</xsd:all>

<xsd:sequence>
  <xsd:choice>
    <xsd:element ref="left"/>
    <xsd:element ref="right"/>
  </xsd:choice>
  <xsd:element ref="landmark"/>
</xsd:sequence>

Figure 7-71. 3.2 Model Groups (2 of 2)


Example: Compositors (Model Groups)


firstName and lastName in the declared order:
<xsd:sequence>
  <xsd:element name='firstName' type='xsd:string'/>
  <xsd:element name='lastName' type='xsd:string'/>
</xsd:sequence>

maidenName or cityOfBirth but not both:
<xsd:choice>
  <xsd:element name='maidenName' type='xsd:string'/>
  <xsd:element name='cityOfBirth' type='xsd:string'/>
</xsd:choice>

height and weight in any order:
<xsd:all>
  <xsd:element name='height' type='xsd:float'/>
  <xsd:element name='weight' type='xsd:float'/>
</xsd:all>

Figure 7-72. Example: Compositors (Model Groups)


Notes:
Model groups allow you to group a set of element declarations together and use them in multiple places. There are three kinds of model groups, which are distinguished by a compositor element, as illustrated on this foil.

Model Group with sequence compositor
The most straightforward kind of model group uses the sequence compositor, which simply states that all the elements must appear in the same order as the element declarations in the model group.

Model Group with choice compositor
Also straightforward is the choice compositor, which states that one element matching one of the element declarations in the group may appear in the instance document. The sequence and choice model groups also appear in XML 1.0 DTDs.


Model Group with all compositor
New in XML Schema is the all compositor, which says that all the elements specified in the group must appear in the instance document, but the elements are allowed to appear in any order.

The compositor values correspond to the varieties described earlier: sequence, conjunction (all) and disjunction (choice). The term compositor occurs in Part 1 (Structures) of the 2001 XML Schema specification.
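The three compositors can be sketched in Python as simple checks over a list of child element names. This is an illustration only: it assumes every declared element occurs exactly once (the default occurrence), ignores nesting and wildcards, and the function names are hypothetical rather than part of any XML Schema API.

```python
def matches_sequence(children, declared):
    """sequence: every declared element, in the declared order."""
    return list(children) == list(declared)


def matches_choice(children, declared):
    """choice: exactly one of the declared elements."""
    return len(children) == 1 and children[0] in declared


def matches_all(children, declared):
    """all: every declared element exactly once, in any order."""
    return sorted(children) == sorted(declared)


# Child element names as they might appear in an instance document.
print(matches_sequence(["lastName", "firstName"], ["firstName", "lastName"]))
print(matches_choice(["maidenName"], ["maidenName", "cityOfBirth"]))
print(matches_all(["weight", "height"], ["height", "weight"]))
```

The first call fails the check because a sequence fixes the order, while the same reordering is acceptable for the all compositor.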


Model Groups and Compositors


Model groups allow you to group a set of element declarations together and use them in multiple places.

Model Group with sequence compositor:
<xsd:group name='orderedName'>
  <xsd:sequence>
    <xsd:element name="firstName" type="xsd:string"/>
    <xsd:element name="lastName" type="xsd:string"/>
  </xsd:sequence>
</xsd:group>

Model Group with choice compositor:
<xsd:group name='securityQuestion'>
  <xsd:choice>
    <xsd:element name="maidenName" type="xsd:string"/>
    <xsd:element name="cityOfBirth" type="xsd:string"/>
  </xsd:choice>
</xsd:group>

Model Group with all compositor:
<xsd:group name="heightAndWeight">
  <xsd:all>
    <xsd:element name="height" type="xsd:float"/>
    <xsd:element name="weight" type="xsd:float"/>
  </xsd:all>
</xsd:group>

Figure 7-73. Model Groups and Compositors


Notes:
Model groups allow you to group a set of element declarations together and use them in multiple places. There are three kinds of model groups, which are distinguished by a compositor element, as illustrated on this foil.

Model Group with sequence compositor
The most straightforward kind of model group uses the sequence compositor, which simply states that all the elements must appear in the same order as the element declarations in the model group.

Model Group with choice compositor
Also straightforward is the choice compositor, which states that one element matching one of the element declarations in the group may appear in the instance document. The sequence and choice model groups also appear in XML 1.0 DTDs.


Model Group with all compositor
New in XML Schema is the all compositor, which says that all the elements specified in the group must appear in the instance document, but the elements are allowed to appear in any order.


Example: Global Definitions and Declarations


An element declaration that uses a global complexType, which itself uses a global model group:
<xsd:group name="vitals">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="id" type="xsd:integer"/>
  </xsd:sequence>
</xsd:group>

<xsd:complexType name="personType">
  <xsd:group ref="vitals"/>
  <xsd:attribute name="criminal" type="xsd:boolean" use="optional"/>
</xsd:complexType>

<xsd:element name="person" type="personType"/>

Figure 7-74. Example: Global Definitions and Declarations


Notes:
This foil shows an element declaration that uses a global complexType, which itself uses a global model group. We build up the components used by the element declaration at the bottom of the foil. First we define a model group called 'vitals', which contains two element declarations and uses a sequence compositor. Next we define the complexType called 'personType', which references the model group 'vitals' to pick up its content model. It also adds an optional Boolean attribute. Finally, we declare the element 'person' and have it reference the global complexType 'personType' that we defined earlier.
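The chain of references described above (element to complexType to model group) can be followed mechanically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; it treats the schema as plain XML rather than validating anything, assumes unprefixed type and ref names with no target namespace, and child_names is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:group name="vitals">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="id" type="xsd:integer"/>
    </xsd:sequence>
  </xsd:group>
  <xsd:complexType name="personType">
    <xsd:group ref="vitals"/>
    <xsd:attribute name="criminal" type="xsd:boolean" use="optional"/>
  </xsd:complexType>
  <xsd:element name="person" type="personType"/>
</xsd:schema>
"""


def child_names(schema_root, element_name):
    """Follow element -> complexType -> group and list the child element names."""
    element = schema_root.find(f"{XSD}element[@name='{element_name}']")
    ctype = schema_root.find(f"{XSD}complexType[@name='{element.get('type')}']")
    names = []
    for group_ref in ctype.findall(f"{XSD}group"):
        group = schema_root.find(f"{XSD}group[@name='{group_ref.get('ref')}']")
        names += [e.get("name") for e in group.iter(f"{XSD}element")]
    return names


root = ET.fromstring(SCHEMA)
print(child_names(root, "person"))
```

A real schema processor also resolves namespace prefixes and nested groups; this sketch only shows the lookup order.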


3.5 Attribute Uses


An attribute use is a utility component which controls the occurrence and defaulting behavior of attribute declarations. It plays the same role for attribute declarations in complex types that particles play for element declarations.
Example. XML representations which all involve attribute uses, illustrating some of the possibilities for controlling occurrence.
<xs:complexType>
  ...
  <xs:attribute ref="xml:lang" use="required"/>
  <xs:attribute ref="xml:space" default="preserve"/>
  <xs:attribute name="version" type="xs:decimal" fixed="1.0"/>
</xs:complexType>

Figure 7-75. 3.5 Attribute Uses
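The effect of use, default and fixed on an element's attributes can be sketched as follows. This is a simplified model, not a validator: the rule dictionary and the function name apply_attribute_uses are hypothetical, and fixed values are compared as plain strings, whereas a real processor compares them in the value space of the attribute's type.

```python
def apply_attribute_uses(attributes, uses):
    """Return the attributes after applying required/default/fixed rules.

    `uses` maps an attribute name to a dict with optional keys
    'use', 'default' and 'fixed' (a simplified, hypothetical model).
    """
    result = dict(attributes)
    for name, rule in uses.items():
        present = name in result
        if rule.get("use") == "required" and not present:
            raise ValueError(f"missing required attribute: {name}")
        if "fixed" in rule:
            if present and result[name] != rule["fixed"]:
                raise ValueError(f"attribute {name} must be {rule['fixed']!r}")
            result.setdefault(name, rule["fixed"])  # absent: supplied like a default
        elif "default" in rule:
            result.setdefault(name, rule["default"])
    return result


# The three attribute uses from the example above.
USES = {
    "xml:lang": {"use": "required"},
    "xml:space": {"default": "preserve"},
    "version": {"fixed": "1.0"},
}

completed = apply_attribute_uses({"xml:lang": "en"}, USES)
print(completed)
```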


Part III Associating a .xsd with a .xml


This concludes our high-level discussion of schema components! It remains to describe how the XML instance and an XSD vocabulary are made aware of each other. But first . . .

Figure 7-76. Part III Associating a .xsd with a .xml


. . .but first. . .

Break!

Figure 7-77. . . .but first. . .


Namespaces, Schemas and Qualification


Recall the generic pattern for a schema:
<schema
  attributeFormDefault = (qualified | unqualified) : unqualified
  blockDefault = (#all | List of (extension | restriction | substitution)) : ''
  elementFormDefault = (qualified | unqualified) : unqualified
  finalDefault = (#all | List of (extension | restriction)) : ''
  id = ID
  targetNamespace = anyURI
  version = token
  xml:lang = language
  {any attributes with non-schema namespace . . .}>
  Content: ((include | import | redefine | annotation)*, (((simpleType | complexType | group | attributeGroup) | element | attribute | notation), annotation*)*)
</schema>

The first four attributes + targetNamespace are a few of the 34 possible attributes reserved specifically for the XML Schema element itself. Note that both the attributeFormDefault and the elementFormDefault are unqualified by default.

Figure 7-78. Namespaces, Schemas and Qualification


Notes:
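Because attributeFormDefault and elementFormDefault are both unqualified by default, locally declared elements and attributes do not have to be namespace-qualified (prefixed) in the instance document, unless there is some reason to do so. Globally declared elements and attributes are still qualified by the target namespace.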


Namespaces, Schemas and Qualification


A schema may be visualized as a collection or vocabulary of
- type definitions and
- element declarations
whose names belong to a particular namespace, called a target namespace.

We are permitted to create our own vocabulary:
- The XML Schema vocabulary is part of the http://www.w3.org/2001/XMLSchema target namespace.
- Ours would be some other, appropriate URI.

In order to validate that an instance document conforms to one or more schemas, we need to identify
- which element and attribute declarations and type definitions in the schemas should be used to check
- which elements and attributes in the instance document.

The schema author can also determine whether the elements and attributes in the instance need to be namespace qualified.

Figure 7-79. Namespaces, Schemas and Qualification


Putting a Schema in a Namespace


How do I associate a set of components with a particular namespace? The targetNamespace attribute.
<xsd:schema xmlns="http://www.ibm.com/Schemas/WD03/target"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.ibm.com/Schemas/WD03/target">
  <xsd:element name="quantity" type="xsd:integer"/>
</xsd:schema>

Now the <quantity> element will be "in" the namespace with the URI: http://www.ibm.com/Schemas/WD03/target
The value of the default namespace, xmlns, and the targetNamespace are the same.

Figure 7-80. Putting a Schema in a Namespace


Notes:
The first piece of XML Schema's namespace support allows us to declare that a set of schema components is associated with a particular namespace. This is done with the targetNamespace attribute.


XML Schemas and Namespaces


In the XML schema below, the default namespace for the schema is defined as the standard XML Schema namespace http://www.w3.org/2001/XMLSchema; there is also a schema-specific namespace, http://www.ibm.com.
<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.ibm.com"
        xmlns:TestSchema="http://www.ibm.com">
  <simpleType name="ZipCodeType">
    <restriction base="integer">
      <minInclusive value="10000"/>
      <maxInclusive value="99999"/>
    </restriction>
  </simpleType>
  <!--element definitions skipped -->
</schema>

. . .in file: C:\temp\TestSchema.xsd

Figure 7-81. XML Schemas and Namespaces


Target Namespace and Schema Location


The target namespace serves to identify the namespace within which the association between the element and its name exists. In the case of declarations, this association determines the namespace of the elements in XML files conforming to the schema.

An XML file that references a schema must name the schema's target namespace in the schemaLocation attribute. Any mismatches between the target namespace and the actual namespace of an element are reported as schema validation errors.

In our example, the target namespace is http://www.ibm.com; it is defined in the XML schema file and referenced twice in the XML file. Any mismatch between these three occurrences of the namespace leads to validation errors.

Figure 7-82. Target Namespace and Schema Location
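The namespace comparison described on this foil can be sketched with Python's standard xml.etree.ElementTree module. The sketch only compares the namespace of the instance's root element with the schema's targetNamespace; it does not perform schema validation. The zip element, the example.com namespace and the helper names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Schema header as in the TestSchema.xsd example; the <zip> element is hypothetical.
SCHEMA = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.ibm.com"
        xmlns:TestSchema="http://www.ibm.com">
  <element name="zip" type="TestSchema:ZipCodeType"/>
</schema>"""

GOOD = '<zip xmlns="http://www.ibm.com">12345</zip>'
BAD = '<zip xmlns="http://www.example.com">12345</zip>'


def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag  # ElementTree stores names as '{namespace}local'
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def root_matches_target(schema_text, instance_text):
    """True if the instance root is in the schema's target namespace."""
    target = ET.fromstring(schema_text).get("targetNamespace", "")
    return namespace_of(ET.fromstring(instance_text)) == target


print(root_matches_target(SCHEMA, GOOD))
print(root_matches_target(SCHEMA, BAD))
```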


Finding the Schema


Instance documents reference the schema using the following element syntax:
<elem xmlns="targetNamespaceURI"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="targetNamespaceURI schemaLocation">

where schemaLocation is a legal URI. The two values are separated by whitespace.

To reference the schema shown in Figure 7-80, use:
<quantity xmlns="http://www.ibm.com/Schemas/WD03/target"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.ibm.com/Schemas/WD03/target http://www.ibm.com/Schemas/WD03/target.xsd">25</quantity>

This assumes the schema is in the file named by the second URI.

xsi:schemaLocation and xsi:noNamespaceSchemaLocation are hints to the parser of the schema's location.

Figure 7-83. Finding the Schema


Notes:
schemaLocation attribute
The XML Schema recommendation does not provide a definitive mechanism for an XML processor to use to locate the schema components associated with a namespace. Instead, it provides the schemaLocation attribute in the XML Schema Instance namespace. We'll cover the XML Schema Instance namespace in greater detail in a few moments.

The schemaLocation attribute is a list of pairs of URIs. The first URI in each pair is a namespace URI. The second URI in each pair is a URI that can be resolved to find the schema for the namespace URI specified by the first element of the pair. Thus, the schemaLocation attribute can be used to specify the locations of the definitions for multiple namespaces in a single location. The schemaLocation attribute may appear on any element in the instance document, as long as it appears before any element in the namespaces that it is providing location hints for.

One more important thing to note is that the schemaLocation attribute provides a hint. A particular schema processor is permitted to ignore the schemaLocation hints, or provide its own method for locating the schema components associated with a particular namespace.

Lastly, there is a variant of schemaLocation, called noNamespaceSchemaLocation, which should be used when a set of schema components is not associated with a target namespace. This version does not use a list of pairs, since there is no namespace involved. It simply accepts a URI that can be used to locate the schema. We saw an example back in Part I.

The example on this foil shows a hint that says the schema components for the namespace http://www.ibm.com/Schemas/WD03/target can be found by resolving and processing the URI http://www.ibm.com/Schemas/WD03/target.xsd.
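Because schemaLocation is a whitespace-separated list of namespace/location pairs, it is easy to take apart. The following sketch uses Python's standard xml.etree.ElementTree module; schema_location_hints is a hypothetical helper name, and the instance document uses the namespace and schema URI named in the notes above.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

INSTANCE = """<quantity xmlns="http://www.ibm.com/Schemas/WD03/target"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.ibm.com/Schemas/WD03/target
                        http://www.ibm.com/Schemas/WD03/target.xsd">25</quantity>"""


def schema_location_hints(element):
    """Return {namespace URI: schema URI} parsed from xsi:schemaLocation."""
    tokens = element.get(f"{{{XSI}}}schemaLocation", "").split()
    if len(tokens) % 2:
        raise ValueError("xsi:schemaLocation must hold namespace/location pairs")
    # Even positions are namespaces, odd positions are the matching locations.
    return dict(zip(tokens[0::2], tokens[1::2]))


hints = schema_location_hints(ET.fromstring(INSTANCE))
print(hints)
```

The result is only a set of hints; as noted above, a processor is free to ignore them or to locate the schema some other way.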


Best Practices (1 of 2)
complexTypes versus elements
  Use types if you need to reuse both model and attribute group combinations.
complexTypes versus model groups
  Use model groups if you are only reusing content model fragments.
complexTypes versus attribute groups
  Use attribute groups if you are only reusing sets of attribute definitions.
Local versus global types
  Use global types if you need reuse, local types if you need to keep things nicely scoped.

Figure 7-84. Best Practices (1 of 2)


Notes:
complexTypes versus elements
Use types if you need to reuse model and attribute group combinations. Recall that a complex type has both a content model and a set of attributes associated with it.

complexTypes versus model groups
Use model groups if you just need to reuse content model fragments. Here we're following the principle of using the least powerful tool for getting the job done. This will also have performance implications because there is less processing during the schema validation.

complexTypes versus attribute groups
Use attribute groups if you just need to reuse sets of attribute definitions. The reasoning here is similar to the above.

Local versus global types
Use global types if you need reuse, local types if you need to keep things nicely scoped. Using global types is a requirement if you want to reuse the type across element declarations. If you want to keep related components under tighter control, then you should use local types, as that restricts the effect of your definitions.


Best Practices (2 of 2)
Namespaces
  Always put schemas in a namespace.
Import versus wildcards
  Use import when you want to use the imported types in your schema.
  Use wildcards when you just want to allow elements to appear.

Figure 7-85. Best Practices (2 of 2)


Notes:
Namespaces
Always put schemas in a namespace. There's no good reason not to put your schema in a namespace, and if you ever end up interacting with another company/entity, you're going to want to have your schema in its own namespace. This also lets you place your declarations in another file.

Import versus wildcards
Use import when you want to use the imported types in your schema. Although we didn't cover it, you can use XML Schema's inheritance features to derive new types from imported types. Use wildcards when you just want to allow elements to appear. Wildcards have a different effect because they allow you to control individual elements or subtrees of the document.


References
Resource                                 Description
http://www.w3c.org/XML/Schema            Information on tools, status of the spec., links to useful info
http://www.w3.org/TR/xmlschema-0         Readable schema primer by IBM's David Fallside
http://www.xfront.com/xml-schema.html    Schema tutorial
www.alphaworks.ibm.com                   Visual DTD - includes support for XML Schema

Figure 7-86. References


Unit Summary
Having completed this unit, you should understand:
- The reasons for using XML Schema
- The important features of XML Schema
- How to define the grammar rules for a document in XML Schema
- Best practices for using XML Schema
- Status of XML Schema at W3C

Figure 7-87. Unit Summary


Notes:
In this section, you have been exposed to:
- The basic functionality of XML Schema
- Simple type definitions
- Complex type definitions
- Attribute definitions
- Model Group definitions
- Attribute Group definitions
- Element declarations
- XML Schema namespace functionality
- Import
- any
- Best practices for using XML Schema
- Status of XML Schema at W3C
- Current state of tools for working with XML Schema


Unit 8. XPath - XML Path Language


What This Unit is About
This unit introduces the XML Path Language, better known as XPath. XPath provides a means to address specific sections of an XML document based on criteria.

What You Should Be Able to Do


After completing this unit, you should be able to:
- Describe the reasons for using XPath
- Define the components and constructs that make up the XML Path Language
- Write simple XPath expressions
- Identify abbreviated XPath expressions
- Describe how to partition the XPath document
- Describe how XPath can reference XML documents
- Define the current status of XPath in industry

How You Will Check Your Progress


Accountability:
- Checkpoint
- Machine exercises


Unit Objectives
After completing this unit, you should be able to:
- Describe the reasons for using XPath
- Define the components and constructs that make up the XML Path Language
- Write simple XPath expressions
- Identify abbreviated XPath expressions
- Describe how to partition the XPath document
- Define the current status of XPath in industry

Figure 8-1. Unit Objectives


What Is XPath?
- A specification for querying an XML document.
  - Does not have an implementation independent of other standards/technologies.
  - Used by XSLT, XPointer, and other emerging technologies, such as XQuery.
- Often when processing XML, we need to address (locate) a portion of or elements of the document which meet specified criteria.
  - Example: In XML for a book on Java, find the chapters with JDBC in the title.
- Provides the ability to address any slice of an XML document in any direction: forwards, backwards or sideways.
- W3C Recommendation (Nov. 16, 1999).

Figure 8-2. What Is XPath?


Notes:
XPath was defined during the development of XSLT (Extensible Stylesheet Language Transformations) and XPointer. It was designed to provide unambiguous traversal of XML documents. XPointer and XSLT both use XPath's functionality: XSLT uses only a subset of XPath, while XPointer uses additional syntax mechanisms to extend its functionality. XPointer allows forward and backward addressing to specific XML locations internal to a document and to locations in external XML documents. Think of this as a super-enhanced version of HTML's HREF linking. XQuery is an emerging technology that will provide standardized access to RDBMS data stores using XML.


Why Is It Called XPath?


- XML documents are frequently viewed as a tree of nodes.
- Expressions describe a path to a given node or set of nodes.
- Consider the DOS, UNIX, or URI syntax for addressing files in a directory structure:
    /publications/articles/Transformations.xml
  This is called a pathname to the file. It describes the path to follow, from the root, through a tree of directories (folders), to locate a given file.
- Similarly, XPath also uses a forward slash to separate the nodes of a path.

Figure 8-3. Why Is It Called XPath?


Notes:
Paths are a natural way to express a hierarchical structure. DOS and Windows actually use a backslash as the path separator. URIs, XPath, and most other path addressing schemes use a forward slash, as the backslash is used to escape special characters. Example: '\t' represents a TAB character.


Example Tree Representation of XML


<?xml version="1.0"?>
<book>
  <author>Tom Wolfe</author>
  <title>The Right Stuff</title>
  <price>$6.00</price>
</book>

ROOT                          address = "/"
  <book>                      address = "/book"
    <author>                  (all three children: address = "/book/*")
      "Tom Wolfe"
    <title>
      "The Right Stuff"
    <price>                   address = "/book/price"
      "$6.00"                 address = "/book/price/text()"

Figure 8-4. Example Tree Representation of XML


Notes:
This example shows a typical XML document and how it is represented as a tree of nodes. This conceptual depiction of XML is important to understand. As you can see in the tree diagram, there is a single root node that contains several other types of nodes. The XPath data model defines a total of seven node types:
- root nodes
- element nodes
- text nodes
- attribute nodes
- namespace nodes
- processing instruction nodes
- comment nodes
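The addresses in the diagram can be tried out with Python's standard xml.etree.ElementTree module. Note the assumptions: ElementTree implements only a subset of XPath, it has no text() node test (the text is read from the .text property instead), and its paths are written relative to an element, so the sketch starts from the book element rather than from "/".

```python
import xml.etree.ElementTree as ET

BOOK = """<?xml version="1.0"?>
<book>
  <author>Tom Wolfe</author>
  <title>The Right Stuff</title>
  <price>$6.00</price>
</book>"""

book = ET.fromstring(BOOK)                           # /book
children = [child.tag for child in book.findall("*")]  # /book/*
price = book.find("price")                           # /book/price
price_text = price.text                              # /book/price/text()

print(book.tag, children, price_text)
```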


XPath Expression Evaluation


- An XPath expression is a series of steps.
  - A step is a search criteria statement.
  - Example: find figures in the current chapter.
- An XPath expression has a current context.
  - A node in the tree that is the starting point for the step.
  - Example: current chapter in the book.
- Each step, except the last, must evaluate to a set of nodes in the XML tree.
  - Example: all the chapters in a book.
  - Steps are evaluated against one or more nodes.
  - The resulting set of nodes may be empty.
- The last step returns one of the following:
  - Number
  - Boolean
  - String
  - Node-set

Figure 8-5. XPath Expression Evaluation


Notes:
Think of the XPath expression as a series of steps through the XML tree. Each step is a rung in the ladder, or layer of the tree. Wildcards permit a single step to represent many layers, much like skipping several rungs when climbing down the ladder.
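The step-by-step evaluation can be sketched in Python with the standard xml.etree.ElementTree module. Each step is applied to every node in the current node-set, and the result becomes the context for the next step. The function name evaluate_steps and the sample book are assumptions made for illustration, and only simple child steps are handled.

```python
import xml.etree.ElementTree as ET


def evaluate_steps(context_nodes, steps):
    """Apply each child step to every node in the current node-set."""
    for step in steps:
        context_nodes = [
            match for node in context_nodes for match in node.findall(step)
        ]
    return context_nodes


BOOK = ET.fromstring(
    "<book>"
    "<chapter><figure id='f1'/><figure id='f2'/></chapter>"
    "<chapter/>"
    "<chapter><figure id='f3'/></chapter>"
    "</book>"
)

chapters = evaluate_steps([BOOK], ["chapter"])           # all chapters in the book
figures = evaluate_steps([BOOK], ["chapter", "figure"])  # figures in those chapters
tables = evaluate_steps([BOOK], ["chapter", "table"])    # the node-set may be empty

print(len(chapters), [f.get("id") for f in figures], tables)
```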


XPath Current Context


The active element within the XPath address step:
/Root/.../Ancestor/Parent/SELF/Child/Descendant

/ (Root)
  .../Ancestor
    /Parent
      preceding-sibling
      /Self (Context Node)
        /Child
          /Descendant/...
      following-sibling

Note: Self is always a single node. It can only have one parent and one root. It may have multiple children, ancestors and so forth.

Figure 8-6. XPath Current Context


Notes:
The current context is simply a "you are here" designation within a complete XPath address. As an XPath expression is evaluated, the current context is likely to shift. Relative paths don't make sense as stand-alone entities. They must be combined with some other context that is anchored at the document's root.


XPath Step Syntax


.../axis::nodetest[predicate]/...

An XPath location path is made up of one or more steps separated by a forward slash ("/"). Each step within the path consists of:
- Axis: Branch of the node tree relative to the current context node.
- NodeTest: Tests node for inclusion.
- Predicate: Optional filter of matched nodes.

Example: Locate all chapter titles in the book that contain the string 'XPath':
/book/child::chapter/child::title[contains(text(),'XPath')]

Figure 8-7. XPath Step Syntax


Notes:
XPath uses a path notation similar to URLs. Location paths are specified using a forward slash ("/") separated list of steps. XPath provides a simple method to traverse an XML tree structure, and to select a slice of information in any direction as defined by the Axis. Paths starting with a forward slash are absolute paths from the root downward through the document tree; paths not beginning with a slash are relative to the current (context) node of the node list.
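The three parts of a step can be imitated in Python with the standard xml.etree.ElementTree module. This is an emulation, not a full XPath evaluation: ElementTree does not support explicit axis names or the contains() function, so the child steps are written in abbreviated form and the predicate is applied as an ordinary Python filter. The sample chapter titles are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

BOOK = ET.fromstring(
    "<book>"
    "<chapter><title>Why XPath?</title></chapter>"
    "<chapter><title>XML Schema</title></chapter>"
    "<chapter><title>XPath Axes</title></chapter>"
    "</book>"
)

# Axis and node test: child::chapter/child::title, relative to /book.
titles = BOOK.findall("chapter/title")

# Predicate: [contains(text(), 'XPath')], applied here as a Python filter.
xpath_titles = [t.text for t in titles if "XPath" in (t.text or "")]

print(xpath_titles)
```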


XPath Address Notation


- An address is a node or nodes in a tree that is your starting point for searching.
- Abbreviated (short-form) syntax is allowed for several different axes.
  - "child::" has an empty abbreviation, as it is the default axis.
  - "/child::catalog/child::tools" is the same as "/catalog/tools".
- A complete XPath expression may consist of only a location path.
  - Absolute location path: starts the search at the root of the tree; the path begins with a forward slash.
  - Relative location path: a sequence of one or more location steps, evaluated from the current context node.

Figure 8-8. XPath Address Notation

XM3014.1

Notes:
An XPath expression can be written in the full syntax or abbreviated. Several axes have an abbreviated syntax. "child::" is the most common axis and therefore has an empty abbreviation. All axes and abbreviations are discussed later.
An absolute path is addressed from the document's root.
/child::catalog/child::tools - The full expression that returns all tools element children of the catalog element under the document's root; short form: /catalog/tools
A relative path is based on the current context node.
child::tools/child::saw - The full expression of a path relative to the context node; short form: tools/saw


Example: Absolute Addressing


1. /paper/chapter[1]/section[2]/title - Title for first chapter, second section
2. /paper/chapter/title - Titles for all chapters
3. /paper/*/title - Any title that is a child of any element child of paper

[Diagram: document tree with root, paper, chapter, appendix, section, title, and @status nodes; numbered callouts 1 and 2 mark nodes selected by the expressions above.]

Figure 8-9. Example: Absolute Addressing

Notes:
Here are the results of running Studio's XPath Expression Wizard using the expressions shown against the #document root.
1. produced: <title>Sect.1.2 Title</title>
2. produced: <title>Ch. 1 Title</title> <title>Ch. 2 Title</title>
3. produced: <title>Ch. 1 Title</title> <title>Ch. 2 Title</title> <title>App.A.1 Title</title>
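The three absolute paths can also be tried outside Studio. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, which supports only a subset of XPath. The instance document is an assumption, reconstructed so that it agrees with the results quoted in these notes; the course's own instance file is not shown here. ElementTree does not accept paths that begin with "/", so the hypothetical helpers load and select evaluate each path relative to a wrapper element that stands in for the document root.

```python
import xml.etree.ElementTree as ET

# Assumed instance document, reconstructed from the results quoted in the notes.
SAMPLE = """
<paper>
  <chapter>Chapter 1
    <title>Ch. 1 Title</title>
    <section>Section 1.1<title>Sect.1.1 Title</title></section>
    <section>Section 1.2<title>Sect.1.2 Title</title></section>
  </chapter>
  <chapter>Chapter 2
    <title>Ch. 2 Title</title>
    <section>Section 2.1<title>Sect.2.1 Title</title></section>
    <section status="Sect.2.2 Status">Section 2.2<title>Sect.2.2 Title</title></section>
  </chapter>
  <appendix>Appendix A
    <title>App.A.1 Title</title>
    <section>Section A.1.1<title>App.A.1.1 Title</title></section>
  </appendix>
</paper>
"""


def load(xml_text):
    """Parse the text and hang the document element under a stand-in root node."""
    wrapper = ET.Element("document-root")
    wrapper.append(ET.fromstring(xml_text.strip()))
    return wrapper


def select(doc, absolute_path):
    """Evaluate an absolute path such as "/a/b" relative to the stand-in root."""
    return doc.findall("." + absolute_path)


doc = load(SAMPLE)
for path in ("/paper/chapter[1]/section[2]/title",
             "/paper/chapter/title",
             "/paper/*/title"):
    # Print the text of every title element the path selects.
    print(path, "->", [t.text for t in select(doc, path)])
```

With this assumed document, the three paths select the same titles as the wizard results listed above.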


Example: Absolute Addressing with Predicates


1. /paper/*/section[last()]/title - Titles for last sections
2. /paper/*/section[last()-1]/title - Titles for the second-to-last sections
3. /paper/chapter[1]/section[title='Sect.1.1 Title']/title - Select title by its content

[Diagram: the same document tree; numbered callouts 1 and 2 mark nodes selected by the expressions above, and the title "Sect.1.1 Title" is labeled.]
Figure 8-10. Example: Absolute Addressing with Predicates

Notes:
1. produces: <title>Sect.1.2 Title</title> <title>Sect.2.2 Title</title> <title>App.A.1.1 Title</title>
2. produces: <title>Sect.1.1 Title</title> <title>Sect.2.1 Title</title>
3. produces: <title>Sect.1.1 Title</title>
If you replace /title in 3 with /text() the answer is "Section 1.1".
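The predicate forms used on this slide, last(), last()-1, and a comparison against a child element's content, happen to be among the few that Python's standard xml.etree.ElementTree module supports. The sketch below is illustrative only: the instance document is an assumption chosen to agree with the results in these notes, and paths are written relative to the paper element because ElementTree does not accept a leading "/".

```python
import xml.etree.ElementTree as ET

# Assumed instance document (not the course's actual file).
SAMPLE = """
<paper>
  <chapter>Chapter 1
    <title>Ch. 1 Title</title>
    <section>Section 1.1<title>Sect.1.1 Title</title></section>
    <section>Section 1.2<title>Sect.1.2 Title</title></section>
  </chapter>
  <chapter>Chapter 2
    <title>Ch. 2 Title</title>
    <section>Section 2.1<title>Sect.2.1 Title</title></section>
    <section status="Sect.2.2 Status">Section 2.2<title>Sect.2.2 Title</title></section>
  </chapter>
  <appendix>Appendix A
    <title>App.A.1 Title</title>
    <section>Section A.1.1<title>App.A.1.1 Title</title></section>
  </appendix>
</paper>
"""

paper = ET.fromstring(SAMPLE.strip())

# /paper/*/section[last()]/title
last_titles = [t.text for t in paper.findall("./*/section[last()]/title")]

# /paper/*/section[last()-1]/title
# The appendix has only one section in this sample, so it contributes nothing.
second_to_last = [t.text for t in paper.findall("./*/section[last()-1]/title")]

# /paper/chapter[1]/section[title='Sect.1.1 Title']
matching_sections = paper.findall("./chapter[1]/section[title='Sect.1.1 Title']")

print(last_titles)
print(second_to_last)
# ElementTree has no text() node test; the element's leading text plays that role.
print([s.text for s in matching_sections])
```

The final line mirrors the remark about replacing /title with /text(): the text that precedes the title inside the matching section is what comes back.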


Relative Addressing using Studio


In order to be able to test the paths on the next page, you must first position Studio at the current context shown in the diagram. The screen capture shows the starting point as section 1 of the second chapter.

How do we know we really moved to /paper/chapter[2]/section[1]? Add text() as a node test; executing it produces "Section 2.1".
Figure 8-11. Relative Addressing using Studio

Notes:
Remember that we bring up the XPath Expression Wizard by selecting the appropriate XML instance and right-clicking to open the context-sensitive menu, from which we choose Generate (near the bottom of the menu) -> XPath... If you're unsure about your location, position the wizard at the root and prepend the context path to the relative path to create an absolute path. Test your results both ways and see whether the result is unchanged.


Example: Relative Addressing


/paper/chapter[2]/section[1] - Absolute path to "current context"
1. parent::node() or .. - Parent of current context
2. self::node() or . - Context node (self)
3. ../.. - Parent of parent of context node
4. child::* (default) - Children of the current context node
5. ./following-sibling::node()/@status or ./following-sibling::*/@status - Status attribute of any following sibling

[Diagram: the same document tree; numbered callouts 1 to 4 mark the nodes selected by expressions 1 to 4.]

Figure 8-12. Example: Relative Addressing

Notes:
1. produces: <chapter> Chapter 2 <title>Ch. 2 Title</title> <section> Section 2.1 <title>Sect.2.1 Title</title> </section> <section status="Sect.2.2 Status"> Section 2.2 <title>Sect.2.2 Title</title> </section> </chapter>
2. produces: <section> Section 2.1 <title>Sect.2.1 Title</title> </section>

3. produces: **everything in the instance file**
4. produces: <title>Sect.2.1 Title</title>
5. produces: Sect.2.2 Status
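The advice to prepend the context path to a relative path can be sketched in code. The example below uses Python's standard xml.etree.ElementTree module and an assumed instance document that agrees with the results in these notes. The helper names load and select_relative are hypothetical. Expression 5 is left out because ElementTree does not implement the following-sibling axis.

```python
import xml.etree.ElementTree as ET

# Assumed instance document (not the course's actual file).
SAMPLE = """
<paper>
  <chapter>Chapter 1
    <title>Ch. 1 Title</title>
    <section>Section 1.1<title>Sect.1.1 Title</title></section>
    <section>Section 1.2<title>Sect.1.2 Title</title></section>
  </chapter>
  <chapter>Chapter 2
    <title>Ch. 2 Title</title>
    <section>Section 2.1<title>Sect.2.1 Title</title></section>
    <section status="Sect.2.2 Status">Section 2.2<title>Sect.2.2 Title</title></section>
  </chapter>
  <appendix>Appendix A
    <title>App.A.1 Title</title>
    <section>Section A.1.1<title>App.A.1.1 Title</title></section>
  </appendix>
</paper>
"""

CONTEXT = "/paper/chapter[2]/section[1]"  # the "current context" from the slide


def load(xml_text):
    """Hang the document element under a stand-in root so /paper/... can be resolved."""
    wrapper = ET.Element("document-root")
    wrapper.append(ET.fromstring(xml_text.strip()))
    return wrapper


def select_relative(doc, context_path, relative_path):
    """Prepend the context path to a relative path and evaluate from the root."""
    return doc.findall("." + context_path + "/" + relative_path)


doc = load(SAMPLE)
for relative in ("..", ".", "../..", "*"):
    # Show which element(s) each relative path reaches from the context node.
    print(relative, "->", [e.tag for e in select_relative(doc, CONTEXT, relative)])
```

Evaluating the combined path from the root and evaluating the relative path from the context node should select the same nodes, which is the check the notes recommend.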


XPath - The Thirteen Axes


Axis Name            Description
ancestor             Ancestors of context node; parent, grandparent, and so forth.
ancestor-or-self     Context node and its ancestors
attribute            Attributes of the context node
child                Children of the context node
descendant           Descendants of the context node; child, grandchild, and so forth
descendant-or-self   Context node and its descendants
following            All nodes that follow the context node, not including descendants, attributes and namespaces
following-sibling    All siblings that follow the context node.
namespace            Namespace node of context node
parent               Parent of context node if it exists. Parent of attribute or namespace is the element that contains it.
preceding            All nodes that are before the context node, not including ancestors, attributes and namespaces
preceding-sibling    All siblings that precede the context node
self                 The context node

Figure 8-13. XPath - The Thirteen Axes

Notes:
There are 13 axes defined in XPath that enable searching of different parts of the XML document from the current context node or the root. The commonly used axes, such as attribute, child and descendant-or-self, have a shorthand syntax. If the shorthand syntax is used, the "::" separator that follows the axis name is omitted. child:: is the default axis if no axis is specified; all axes can be used in a relative or absolute path. Despite the singular form of axis names like ancestor or preceding-sibling, only parent and self are limited to a single node.


Abbreviated Step Notation


The following step abbreviations may be used to simplify XPath location paths.

Step Abbreviation    Abbreviation For
<blank>              child::
                     e.g. chapter/section expands to child::chapter/child::section (all the section children of all the chapter children of the context node)
.                    self::node()
                     e.g. ./attribute::name expands to self::node()/attribute::name (the name attribute of the context node)
..                   parent::node()
                     e.g. ../attribute::name expands to parent::node()/attribute::name (the name attribute of the parent of the context node)
@                    attribute::
                     e.g. ./@name expands to self::node()/attribute::name (the name attribute of the context node)
//                   /descendant-or-self::node()/
                     e.g. .//chapter expands to ./descendant-or-self::node()/chapter (all the chapter descendants of the context node)

Figure 8-14. Abbreviated Step Notation

Notes:
There are 13 axes defined in XPath that enable searching of different parts of the XML document from the current context node or the root. The commonly used axes, such as attribute, child and descendant-or-self, have a shorthand syntax. If the shorthand syntax is used, the "::" separator that follows the axis name is omitted. child:: is the default axis if no axis is specified; all axes can be used in a relative or absolute path. Despite the singular form of axis names like ancestor or preceding-sibling, only parent and self are limited to a single node.


XPath - Partitioning the Document


Self, ancestor, descendant, preceding and following partition the entire document.

[Diagram: the document tree with one node labeled Self and the remaining nodes grouped into regions labeled Ancestor, Preceding, Following, and Descendant.]

Figure 8-15. XPath - Partitioning the Document

Notes:
Self = Context node. For the node labeled Self, which is the current context node, the labels on the various nodes indicate their axis relationship to Self. These five axes (self, ancestor, descendant, preceding, and following) together contain all the nodes within the document other than attribute and namespace nodes, and they do not overlap.
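The partitioning claim can be checked mechanically. The sketch below computes the five axes by hand for element nodes only, using Python's standard xml.etree.ElementTree module to walk an assumed instance document. ElementTree does not model text nodes as separate nodes, so they are ignored here. This is an illustration of the axis definitions, not an XPath engine.

```python
import xml.etree.ElementTree as ET

# Assumed instance document (not the course's actual file).
SAMPLE = """
<paper>
  <chapter><title>Ch. 1 Title</title>
    <section><title>Sect.1.1 Title</title></section>
    <section><title>Sect.1.2 Title</title></section>
  </chapter>
  <chapter><title>Ch. 2 Title</title>
    <section><title>Sect.2.1 Title</title></section>
    <section status="Sect.2.2 Status"><title>Sect.2.2 Title</title></section>
  </chapter>
  <appendix><title>App.A.1 Title</title>
    <section><title>App.A.1.1 Title</title></section>
  </appendix>
</paper>
"""

root = ET.fromstring(SAMPLE.strip())
order = list(root.iter())                               # every element, in document order
parent_of = {child: p for p in order for child in p}    # child -> parent lookup
context = root.find("./chapter[2]/section[1]")          # the node labeled Self

# ancestor axis: follow parent links up to the document element.
ancestors = []
node = context
while node in parent_of:
    node = parent_of[node]
    ancestors.append(node)

# descendant axis: everything below the context node (iter() yields the node itself first).
descendants = list(context.iter())[1:]

# preceding / following axes: earlier or later in document order,
# excluding ancestors and descendants respectively.
here = order.index(context)
preceding = [n for n in order[:here] if n not in ancestors]
following = [n for n in order[here + 1:] if n not in descendants]

parts = [[context], ancestors, descendants, preceding, following]
ids = [id(n) for part in parts for n in part]
# True when no element appears twice and every element is covered.
print(len(ids) == len(set(ids)) == len(order))
```

Choosing a different context node changes the size of each group, but the five groups still cover every element exactly once.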


Example: Addressing with Axes


1. /paper/chapter[last()]/following::* - Everything after the last chapter.
2. /paper/chapter[2]/descendant::node()/title - Title children of every descendant of chapter 2, that is, its section titles. The chapter's own title is a child of chapter 2 itself and is not matched; /paper/chapter[2]/descendant::title selects all title descendants of chapter 2.
3. //*[attribute::status] also //*[@status] - All element nodes containing a status attribute.

[Diagram: the same document tree; numbered callouts 1 and 2 mark the nodes selected by expressions 1 and 2.]

Figure 8-16. Example: Addressing with Axes

Notes:


XPath Axis Node Type and Node Tests


Axis Type            Type of nodes returned
attribute            attribute node
namespace            namespace node
all other axes       element node

Node Test                     Result
* (Wildcard)                  Selects all nodes of the given axis type
Qualified Name (Namespace)    Selects node if it has the specified namespace-qualified name (if Namespace is null, then the name is not in any namespace)
NCName:*                      Selects node if it has the specified namespace
text()                        Returns text node children
processing-instruction()      Returns the processing instruction (for PI nodes). The processing-instruction node test can take an optional literal argument
comment()                     Returns the comment (for comment nodes)
node()                        Is true for any node of any type whatsoever
id("value")                   Returns the node containing an ID type attribute of the specified value
Figure 8-17. XPath Axis Node Type and Node Tests

Notes:
The first table lists the types of axes and the corresponding type of node returned. This list only indicates the principal node type. For example, an axis of child::* will return nodes of type element. The returned elements may themselves carry attribute nodes.
The second table lists the node tests and the resulting node (or node list). A node test follows the axis in the address step and determines whether a node is included in or excluded from the search. The most common form of node test is the QName, or actual element name. The wildcard ("*") node test selects all nodes of the given type. For example, attribute::* selects all attributes of the context node.


Sample Node Tests


//comment()
  Extract all comments from a document.
/book/*/title
  Extract all top-level titles regardless of parent type (that is, Chapter, Appendix, and so forth).
/processing-instruction()
  Extract all Processing Instructions that exist outside of the root element.
/book/chapter[2]//text()
  Extract the actual text from all elements inside the second chapter.
chapter/section[2][@status="Draft"]
  Extract the second section child of every chapter child of the context node where the section status attribute has a value of "Draft".
Figure 8-18. Sample Node Tests

Notes:
These samples are shown without a representative tree. The meaning of the samples can usually be conveyed from the XPath expression itself.


XPath - Predicates (1 of 2)
All comparisons or function calls are within the predicate, enclosed within [ ].
The predicate expression is evaluated for each node and yields one of:
- A new set of nodes
- A string
- A boolean
- A number
A number result is compared with the node's position; any other result is converted to a boolean.
Each node in the list of nodes is tested to see if the predicate is true.
If the predicate is true, the node is included in the resulting list of nodes.
If a predicate results in no matching nodes, an empty result set is returned.

Figure 8-19. XPath - Predicates (1 of 2)

Notes:
Predicates filter a list of nodes. Predicate expressions can be function calls, numbers, literals or location paths.


XPath - Predicates (2 of 2)
Predicate expression types:
- Function call
- Number
- Literal
- Location path
Logical operators can be used inside a predicate.
- The boolean operators and and or allow chaining of tests.
Math operators are allowed in predicates for numbers.
- Equality operator: =
- Less than (<) and greater than (>) operators
- Modulus test using the mod operator

Figure 8-20. XPath - Predicates (2 of 2)

Notes:
A predicate expression can contain logical operators:
and - Both conditions must be true for the test to be true.
or - Either condition may be true for the test to be true.
The pipe character ("|") is not a logical or. It is the union operator, which combines two node-sets into one.


Predicate Core Functions


last( )
  Return type: number
  Description: Returns the index of the last node in the current context, that is, the context size
position( )
  Return type: number
  Description: Returns the index of the current node within the context
count(node-set-expr)
  Return type: number
  Description: Returns the number of nodes in the node-set identified by the given expression
id(object)
  Return type: node-set
  Description: Returns a node-set containing the nodes that have the specified IDs. The object parameter can contain more than one node (in which case the node-set that is returned may contain more than one node)
local-name( ), local-name(node-set-expr)
namespace-uri( ), namespace-uri(node-set-expr)
  Return type: string
  Description: Split a fully qualified name (namespace:object). local-name() returns the object's name; namespace-uri() returns the namespace URI
name(node-set-expr)
  Return type: string
  Description: Returns the fully qualified name for the first node in the node-set

node-set-expr = a relative or absolute path


Figure 8-21. Predicate Core Functions

Notes:
The table lists the XPath predicate functions that are part of the core function library. The return type is shown in the second column. A few functions have optional arguments. If omitted, the current context node is treated as the argument. /child::chapter[position()=1] returns the first chapter element that is under the document root. /chapter[1] is the abbreviated form of the expression above.


Predicate String Functions (1 of 2)


string( ), string(object)
  Return type: string
  Description: Converts the argument or the context node into a string
starts-with(string, string)
  Return type: boolean
  Description: Returns true if the first string argument starts with the second string argument
contains(string, string)
  Return type: boolean
  Description: Returns true if the first string argument contains the second string argument
substring-after(string, string)
  Return type: string
  Description: Returns the substring of the first argument string following the first occurrence of the second argument
substring-before(string, string)
  Return type: string
  Description: Returns the substring of the first argument string preceding the first occurrence of the second argument
substring(string, number), substring(string, number, number)
  Return type: string
  Description: Returns a substring of the first argument string, starting at the index (first number) for the optional count (second number)

Figure 8-22. Predicate String Functions (1 of 2)

Notes:
string(object) - Only the first node of the argument node-set is converted to a string. Numbers (integer or floating point) are converted to their string representation. Booleans are converted to the strings "true" and "false". All other node types are converted depending on the type of node. For example, the string value of an element is all the characters of the element and its descendants concatenated together.
substring-after(string, string) - For example, substring-after("XML Development","lop") will return the string "ment". If there is more than one occurrence of the substring, all characters after the first occurrence will be included in the returned string.
The substring-before function works in a similar manner, except that it returns the substring before the tested string.
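The behavior of substring-after and substring-before can be mimicked in a few lines of Python. The functions below are hypothetical stand-ins written for illustration; they follow the XPath 1.0 rule that a missing search string yields an empty string, but they are not part of any XPath library.

```python
def substring_after(text, marker):
    """Mimic XPath substring-after(): the text after the first occurrence of marker."""
    index = text.find(marker)
    # A marker that does not occur yields the empty string.
    return "" if index < 0 else text[index + len(marker):]


def substring_before(text, marker):
    """Mimic XPath substring-before(): the text before the first occurrence of marker."""
    index = text.find(marker)
    return "" if index < 0 else text[:index]


print(substring_after("XML Development", "lop"))
print(substring_before("XML Development", "lop"))
print(substring_after("a-b-c", "-"))   # only the first "-" is used as the split point
```

The first call reproduces the example from the notes; the third shows that everything after the first occurrence is returned, including later occurrences of the marker.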


Predicate String Functions (2 of 2)


string-length( ), string-length(string)
  Return type: number
  Description: Returns the string length
concat(string, string, ...)
  Return type: string
  Description: Returns a concatenation of its arguments. Must have at least two arguments
normalize-space( ), normalize-space(string)
  Return type: string
  Description: Removes leading and trailing whitespace and replaces adjacent whitespace characters with a single whitespace
translate(string, string, string)
  Return type: string
  Description: Returns the first argument string with each character that appears in the second argument string replaced by the corresponding character in the third argument string. Example: translate("abc", "cb", "ex") returns "axe"

Figure 8-23. Predicate String Functions (2 of 2)

Notes:
Almost any object type can be passed into string functions. The processor will attempt to convert non-string objects to their string representation.
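The character-by-character mapping performed by translate, and the whitespace handling of normalize-space, may be easier to see in code. The Python functions below are hypothetical stand-ins that follow the XPath 1.0 rules as understood here: a character in the second argument with no counterpart in the third is deleted, and the first occurrence of a repeated character decides its mapping. They are a sketch, not an XPath implementation.

```python
import re


def translate(text, from_chars, to_chars):
    """Mimic XPath translate(): map each character in from_chars to its peer in to_chars."""
    mapping = {}
    for index, char in enumerate(from_chars):
        if char not in mapping:  # the first occurrence of a repeated character wins
            # None marks a character with no counterpart, which is removed.
            mapping[char] = to_chars[index] if index < len(to_chars) else None
    result = []
    for char in text:
        replacement = mapping.get(char, char)  # unmapped characters pass through
        if replacement is not None:
            result.append(replacement)
    return "".join(result)


def normalize_space(text):
    """Mimic XPath normalize-space(): trim, then collapse runs of whitespace."""
    # XML whitespace is limited to space, tab, carriage return, and line feed.
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


print(translate("abc", "cb", "ex"))
print(translate("--aaa--", "abc-", "ABC"))
print(normalize_space("  XML   is \n great  "))
```

The first call is the example from the slide. The second shows deletion: "-" is the fourth character of the second argument, and the third argument has only three characters.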


Predicate Number and Boolean Functions


Number Functions

number( ), number(object)
  Return type: number
  Description: Returns numeric representation of an object. Boolean true returns 1; boolean false returns 0
sum(node-set)
  Return type: number
  Description: Returns sum of values of nodes of the node set
floor(number)
  Return type: number
  Description: Returns largest integer that is not greater than argument (rounds down)
ceiling(number)
  Return type: number
  Description: Returns smallest integer that is not less than argument (rounds up)
round(number)
  Return type: number
  Description: Returns closest integer to argument

Boolean Functions

not(boolean)
  Return type: boolean
  Description: Returns true if argument is false and false otherwise
true( )
  Return type: boolean
  Description: Returns true
false( )
  Return type: boolean
  Description: Returns false

Figure 8-24. Predicate Number and Boolean Functions

Notes:
Conversion to boolean (the rules applied wherever a boolean is expected):
- A number is true if it is non-zero and not NaN (Not-a-Number).
- A node-set is true if it is non-empty.
- A string is true if its length is non-zero.
XPath numbers include the special values NaN (Not a Number), positive and negative infinity, and positive and negative zero.
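The conversion rules and the rounding functions can be illustrated with a short Python sketch. The names xpath_boolean and xpath_round are hypothetical, and a Python list stands in for a node-set. The rounding helper assumes the XPath 1.0 rule that a value exactly halfway between two integers rounds toward positive infinity, which differs from Python's built-in round; it ignores the negative-zero edge case.

```python
import math


def xpath_boolean(value):
    """Mimic the XPath conversion of a number, string, or node-set to a boolean."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # True only for non-zero numbers that are not NaN.
        return value != 0 and not math.isnan(value)
    if isinstance(value, str):
        return len(value) > 0          # any non-empty string, even "false"
    if isinstance(value, list):        # a list stands in for a node-set here
        return len(value) > 0
    raise TypeError("unsupported value type")


def xpath_round(number):
    """Mimic XPath round(): closest integer, with ties going toward positive infinity."""
    if math.isnan(number) or math.isinf(number):
        return number                  # NaN and the infinities are returned unchanged
    return float(math.floor(number + 0.5))


print(xpath_boolean("false"), xpath_boolean(float("nan")), xpath_boolean([]))
print(math.floor(-2.5), math.ceil(-2.5), xpath_round(-2.5))
print(xpath_round(2.5), round(2.5))    # Python's built-in round() resolves ties differently
```

One consequence worth noticing is that the string "false" converts to true, because only the length of a string matters.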


Reference Information
Reference: http://www.w3.org/TR/xpath
Description: W3C XPath specification

Reference: http://www.xml.com
  /pub/a/2000/12/20/xpathaxes.html
  /pub/a/2001/01/03/xpathaxes.html
  /pub/a/2000/10/04/transforming/trxml5.html
Description: Good articles on XPath Axis and node from O'Reilly's www.xml.com site.

Reference: http://www.zvon.org
  /xxl/XPathTutorial/General/examples.html
  http://www.zvon.org:9001/saxon/cgi-bin/XLab/XML/xlabIndex.html?stylesheetFile=XSLT/xlabIndex.xslt
Description: Interactive XPath tutorial

Reference: http://www.cranesoftwrights.com/training/#ptux
Description: XSLT and XPath training materials, Ken Holman, Crane Softwrights Ltd.

Reference: XSLT Programmer's Reference 2nd Edition, Michael Kay, WROX Press
Description: XSLT books cover XPath. Very good XSLT reference book

Figure 8-25. Reference Information

Notes:


Checkpoint Questions (1 of 3)
1. Which of the following are part of the XPath step syntax?
A. Predicate
B. AxisName
C. Ancestor
D. Ceiling
E. NodeTest
2. The axis shorthand notation of // indicates what?
A. Ancestor
B. Parent
C. Ancestor-or-self
D. Descendant-or-self

Figure 8-26. Checkpoint Questions (1 of 3)

Notes:


Checkpoint Questions (2 of 3)
3. Which XPath statement will return the number of questions on a test?
A. count(/test/question)
B. /test/question/count()
C. /test[count(question)]
D. None of the above
4. The predicate function starts-with("XML is Great", "XML") will return:
A. XML
B. True
C. Is Great
D. False
E. XML is Great

Figure 8-27. Checkpoint Questions (2 of 3)

Notes:


Checkpoint Questions (3 of 3)
5. The following XPath statement will result in --

/news/story[@year='2001']/ self::node()[contains(text, 'IBM')]/


A. All 2001 news stories that contain IBM inside the text element.
B. All news stories with a year element = 2001 and a text element of IBM.
C. Any news story with either IBM or 2001 in its text.
D. All 2001 news stories that contain the letters IBM in any order.
E. Error, as this is an invalid XPath statement.

Figure 8-28. Checkpoint Questions (3 of 3)

Notes:


Unit Summary
Having completed this unit, you have learned to:
- Describe the reasons for using XPath
- Define the components and constructs that make up the XML Path Language
- Write simple XPath expressions
- Identify abbreviated XPath expressions
- Describe how to partition the XPath document
- Define the current status of XPath in industry

Figure 8-29. Unit Summary

Notes:


Unit 9. eXtensible Stylesheet Language: Transformations (XSLT)


What This Unit is About
This unit describes Extensible Stylesheet Language and its concepts.

What You Should Be Able to Do


After completing this unit, you should be able to:
- Describe the XSL model and its concepts
- Describe and apply XSL Transformations
- Use and apply XSL templates in XSLT
- Create simple XSL stylesheets
- Define XSL Format Objects
- Describe the programming models and best practices for XSLT
- Describe XSLT tools

How You Will Check Your Progress


Accountability:
- Checkpoint
- Machine exercises


Unit Objectives
After completing this unit, you should be able to:
- Describe the XSL model and its concepts
- Describe and apply XSL Transformations
- Use and apply XSL templates in XSLT
- Create simple XSL stylesheets
- Describe some best practices for applying XSLT
- Describe XSLT tools

Figure 9-1. Unit Objectives

Notes:


Why Do We Need XSL Transformations?

"A typical enterprise will devote 35-40% of its programming budget to develop and maintain 'extract and update' programs whose purpose is solely to transfer information between different database's of legacy systems." --Gartner Group

Figure 9-2. Why Do We Need XSL Transformations?

Notes:
One of XSLT's best applications is to translate information from one XML vocabulary to another. As such it is a powerful tool for performing the 'extract and update' operations referred to in this quote.


Why Do We Need XSL?


[Diagram labels: Browser, PDA, WML, DB, JSP/servlets, Applications or XSL (T/FO), B2B application, Web Service, Printing, Publishing.]

Typical uses:
- XML to HTML
- XML to XML (vocabulary translation)
- XML to non-XML (for example, plain text, CSV)

Figure 9-3. Why Do We Need XSL?

Notes:
Without XSL, the Web server needs separate servlets/JSPs to format data for each kind of client in order to satisfy different clients. With XSL:
- There is no need to develop and manage separate applications (for example, JSPs/servlets) for each target device. One JSP/servlet is enough, with one XSL stylesheet for each client.
- There is normally no need to rewrite or update the JSP/servlet each time the presentation is changed, which makes it easier to update the style of your Web site.
Different companies do not always use the same DTDs/Schemas for their XML documents, so when exchanging XML documents there is a need to transform from one tree to another.


XSL: Three Parts


XSL consists of three parts:
XSLT - Transformation language
XPath - A language for addressing parts of an XML document
Formatting Objects (optional) - An XML vocabulary for specifying formatting semantics

Figure 9-4. XSL: Three Parts

Notes:
Extensible Stylesheet Language (XSL) comprises two parts: the XSL Transformations (XSLT) specification, which is a Recommendation of the W3C as of Nov. 16, 1999, and Format Objects (XSL-FO), which is part of the actual XSL specification. The XSL specification is in Candidate Recommendation, the latest, as of this writing, being Aug. 2001. For more information on XSL (and format objects), see http://www.w3.org/TR/xsl. For more information on XSLT, see http://www.w3.org/TR/xslt.
XSL formatting objects and properties support a wide range of print, display, or aural presentations. It is not the aim of this unit to cover FO in depth. First, the Format Object specification is still under development; and secondly, since the data is processed by XSLT, some formatting can be done in this stage (in the case of HTML, for instance), or by the XML application itself.
XSL: Extensible Stylesheet Language
Consists of two modules:

- XSL Transformation
- XSL Format Objects
Is compatible with Namespaces and XPath.
XSL Transformations (XSLT)
Operates on an abstract model that views an XML document as a tree. It is not required that a tree be created.
Provides a means to access the document tree in order to:
- Access nodes by name or content
- Search for specific content or nodes
- Manipulate content or nodes
Serves as a transformation filter before formatting is applied.
XSL Format Objects (XSL-FO)
Format objects received from XSLT into a result tree.


XSLT Language Characteristics (1 of 2)


Uses XML syntax.
No side effects:
- When functions have side effects, they must be executed in a specific order.
- Parts of a stylesheet can be processed in any order and independently.
- Means you cannot update the value of a variable.
Pattern-matching, rule-based, declarative:
- A stylesheet is a sequence of template rules, each of which says how a particular node should be processed.
- No ordering is needed for the template rules.
- These rules make XSLT declarative: you declare what you want done to the XML nodes.
- XSLT relies on XPath for a large part of its pattern-matching functionality.

Figure 9-5. XSLT Language Characteristics (1 of 2)

Notes:
XML syntax was chosen for many reasons. Among the most important were:
- Reuse of the XML parser minimizes footprint.
- Familiarity and ease of understanding.
- Reuse of the lexical apparatus of XML for handling whitespace, Unicode, namespaces, and so forth.
- Providing visual development tools.
A function is said to have side effects if it makes changes to its environment; an example is updating a global variable. The functions in XSLT have no side effects and can be processed in any order. In reality, how you code your stylesheet will impact what parts can be processed independently. Processing parts of the stylesheet in any order does not impact the order of the output. The order of the output depends on the order in the XML file, which is what we want. We will talk about programming without variables later.

Declarative languages are different from procedural languages. In procedural languages you specify the order for chunks of processing; in declarative languages you specify what processing you want done and the processor determines the order. SQL is another example of a declarative language, and functional languages such as LISP share much of the same style.


XSLT Language Characteristics (2 of 2)


Closure:
- Means the output has the same data structure as the input. For XSLT, this structure is the tree representation of XML.
- The input is XML and the output is a tree, which is built as the input nodes are processed.
- Means you can combine operations in a pipe-like structure, where the output of one operation is the input for another.
Recursive:
- XSLT supports recursion with built-in constructs.

Figure 9-6. XSLT Language Characteristics (2 of 2)

Notes:
Closure is also a characteristic of SQL. There are many similarities between XSLT and SQL.


XSLT Features
- Multiple input sources.
- Ability to select document fragments using XPath expressions.
- Named and/or pattern-based templates.
- Parameterized templates.
- Intermediate transformation state may be managed using variables.
- Stylesheets may be combined using include or import.
- Built-in support for output sorting and numbering.
- Both XML and non-XML output is supported.
- XSL processor extensions supported without side effects on core function.

Figure 9-7. XSLT Features

Notes:
Main features of XSLT


XSL Transformations (XSLT) Overview


[Diagram: the XML document (source tree) and the XSL stylesheet are read by the XSL stylesheet processor. The transformation step produces the result tree; an optional formatting step follows; the output (result tree) is passed to an XML application.]


Figure 9-8. XSL Transformations (XSLT) Overview

Notes:
An XSL transformation takes the abstract tree model of the source XML document, known as the source tree, and processes it to produce a result tree. The XSL stylesheet defines the rules for the transformation, based on the XML elements and attributes in the source tree. The stylesheet may also contain formatting information, called formatting objects (FOs), which it applies to the transformed content. A single stylesheet can apply to multiple XML documents, provided their elements and structure are consistent with those the stylesheet expects.
Note that XSL does not require result trees to use XSL-FO, so it can be used for general XML transformations. For example, XSL can be used to transform XML into well-formed HTML, that is, XML that uses the element types and attributes defined by HTML.
Note also that the same source XML document can be processed with multiple XSL stylesheets. For example, the XML source could be rendered as HTML, as an altered form of XML, as voice markup, and for print, all from the same source document. These transformations could run as separate parallel processes (each invocation running an XSL processor in a separate memory space) or sequentially (each invocation running after the previous one completes). The advantage of parallel processing is that an XSLT error in one stylesheet does not prevent the others from running, whereas in sequential processing a failure also terminates any downstream process. The disadvantage of parallel processing is its system memory usage.


The XSLT Process


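[Flowchart: a match pattern from the XSL stylesheet (transformation) is tested against the source tree; the processor selects the correct template and creates a result node in the result tree. It then asks "apply further templates?": if yes, matching starts again; if no, processing ends.]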

Figure 9-9. The XSLT Process

Notes:
XSLT uses the ideas of pattern matching and templates. A stylesheet includes templates, each with a pattern that associates it with one or more elements or attributes in the XML document. The templates contain the rules for both transformation and, optionally, formatting that are applied to the matching nodes. A template can also contain further pattern matching and instructions to apply further templates.
The Process
The XSL processor scans the source tree (the tree model of the source XML document). If a matching node (element or attribute) is found, the processor locates the appropriate template and applies the rules it contains. This results in the creation of a result tree node. If a template rule indicates that more templates should be applied, the process begins again by finding new matching nodes. The whole procedure ends when there are no more nodes to process.
Matching
Patterns match against nodes. A pattern such as books/book/title matches title elements whose parent is book and whose grandparent is books. When the template is applied, the matched title node becomes the context (current) node, not books; the leading steps of the pattern only restrict which title nodes qualify.
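The effect of a multi-step pattern on the current node can be sketched as follows. This is an illustrative fragment, not part of the course files; it assumes a source document whose root element is books and whose book elements carry an ID attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- The pattern matches title elements whose parent is book and whose
       grandparent is books. -->
  <xsl:template match="books/book/title">
    <!-- "." is the matched title node, not books. -->
    <h1><xsl:value-of select="."/></h1>
    <!-- ".." steps up to the enclosing book; the ID attribute is assumed
         to exist on it. -->
    <p>Book ID: <xsl:value-of select="../@ID"/></p>
  </xsl:template>
</xsl:stylesheet>
```

Inside the template, every relative path is evaluated from the matched title node, which is why the book has to be reached with the parent step.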


Anatomy of a Stylesheet
  <?xml version="1.0" encoding="UTF-8"?>
  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  version="1.0">
    <!-- top-level elements -->
    <xsl:template match="title">
      <h1><xsl:apply-templates/></h1>
      ...
    </xsl:template>
  </xsl:stylesheet>

- XML declaration: identifies an XML document.
- xsl:stylesheet: must enclose the entire stylesheet; must include namespace and version.
- Top-level elements: document-level elements, for example, import, include, output, strip-space, key, param, variable, ...
- xsl:template: one or more templates.
- Template body: literal result text and/or XSL processing directives, mixed freely.


Figure 9-10. Anatomy of a Stylesheet

Notes:
This visual gives an overview of the main elements that make up an XSL stylesheet and the order in which they appear.


Elements to Generate Output


<xsl:value-of select="validXPathExprOrFunction"/>
<xsl:processing-instruction name="piName"/>
  Inserts a processing instruction into the result tree
<xsl:apply-templates/>
<xsl:comment>
  Inserts a comment into the result tree
<xsl:text>
  Inserts text into the result tree verbatim
  Used when outputting special characters, particularly whitespace

Figure 9-11. Elements to Generate Output

Notes:
<xsl:comment>commentTextHere</xsl:comment>
Inserts a comment into the result tree. Example:
  <xsl:template match="books/book//ordno[@instock='no']">
    <xsl:comment>Reorder now</xsl:comment>
  </xsl:template>
The following comment is inserted in the result tree:
  <!--Reorder now-->
Note that this is not the same as testing for the presence of a comment within the source nodes. That is done with the comment() node test in the match pattern.
<xsl:text>Inserted Text</xsl:text>
Inserts text into the result tree verbatim.
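The output elements can be combined in one template. The sketch below is illustrative only: the file name books.css and the authors wrapper element are made up, and the source is assumed to contain author elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- Writes <?xml-stylesheet type="text/css" href="books.css"?> to the
         result. books.css is a hypothetical file name. -->
    <xsl:processing-instruction name="xml-stylesheet">type="text/css" href="books.css"</xsl:processing-instruction>
    <!-- Writes a comment whose text is "Generated list". -->
    <xsl:comment>Generated list</xsl:comment>
    <authors>
      <xsl:for-each select="//author">
        <xsl:value-of select="."/>
        <!-- xsl:text keeps the single space; a whitespace-only text node
             written directly in the stylesheet would be stripped. -->
        <xsl:text> </xsl:text>
      </xsl:for-each>
    </authors>
  </xsl:template>
</xsl:stylesheet>
```

The xsl:text element is the usual way to force a space or line break into the result, since the processor discards whitespace-only text between stylesheet elements.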

<xsl:stylesheet> Element
xsl:stylesheet is the root element of an XSL stylesheet. Requires the XSL namespace.
Current recommendation:
  <?xml version="1.0" encoding="UTF-8"?>
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
Alternatively (not supported in all XSL processors):
  <?xml version="1.0" encoding="UTF-8"?>
  <xsl:transform version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

Figure 9-12. <xsl:stylesheet> Element

Notes:
Note that the XML declaration occurs first, as the XSL stylesheet is itself an XML document. The namespace portion of the specification has changed recently, and may yet change again. It is best to verify the namespace at http://www.w3.org/TR/xsl or at http://www.w3.org/TR/xslt. Backward compatibility with earlier Working Drafts, as far as can be determined, is being maintained by the W3C. Microsoft Internet Explorer 5 and 5.5 use the http://www.w3.org/TR/WD-xsl namespace; the Xalan XSL processor uses http://www.w3.org/1999/XSL/Transform. Netscape 6 does not support XSL; it supports Cascading Style Sheets (CSS). The xsl:transform element is a synonym for xsl:stylesheet, so the root element can also be written <xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">. Technically the two are equivalent, but not all XSL processors (especially older ones) recognize the transform form.


XSL Optional Elements


Stylesheets may:
- Import/include other stylesheets
- Set global variables, constants, or matching patterns
- Determine how to treat white space
- And so forth
These elements appear before the first xsl:template.
Allows inclusion of an external XSL stylesheet (supplements the calling stylesheet):
  <xsl:import href="pathToSource"/>
  <xsl:include href="pathToSource"/>
Determine how to treat white space:
  <xsl:strip-space elements="..."/> or <xsl:preserve-space elements="..."/>
Control output format:
  <xsl:output method="outputMethod" [optionalAttrs] />

Figure 9-13. XSL Optional Elements

Notes:
For detailed usage of these and other elements in XSL documents, see http://www.w3.org/TR/xslt. Some of the other elements are discussed later in this unit, during the detailed discussion of transformations. The order in which child elements appear is not important, with one exception: the import element must appear first if it is used. The others conventionally precede the first xsl:template element.
Import versus Include
Import. The import element must appear before any other child elements in the XSL document. If rules conflict, imported rules are of lesser importance than those in the document doing the importing.
Include. The include statement is replaced by the actual contents of the included file, with any import statements in that file moved up before it. Included rules are of equal importance to those in the calling document.
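A stylesheet that uses these top-level elements might be laid out as in the following sketch. The file names common.xsl and tables.xsl are hypothetical, and common.xsl is assumed to contain its own template for title.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- xsl:import must come before every other top-level element.
       Rules from common.xsl (hypothetical) have lower import precedence
       than the rules written in this file. -->
  <xsl:import href="common.xsl"/>

  <!-- Rules from tables.xsl (hypothetical) are treated as if they had
       been written in this file. -->
  <xsl:include href="tables.xsl"/>

  <!-- Drop whitespace-only text nodes from all source elements. -->
  <xsl:strip-space elements="*"/>

  <!-- Serialize the result tree as HTML. -->
  <xsl:output method="html" indent="yes"/>

  <!-- Preferred over any title template in common.xsl. -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>
</xsl:stylesheet>
```

Import suits the case where a shared stylesheet supplies defaults that individual stylesheets override; include suits splitting one stylesheet across several files.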


<xsl:template> Element
A stylesheet has one or more templates. When <xsl:template> is missing and literal result elements are found, a match="/" template is assumed (match the root).
  <xsl:template match="match expression">
    <!-- literal result text, XSLT elements -->
  </xsl:template>
Specifies:
- A match expression that defines when this rule will be used (the test against the nodes in the XML tree). This is a pattern, written in a subset of XPath.
- The template body: literal result text is written to the output tree, and XSLT elements are executed.

Figure 9-14. <xsl:template> Element

Notes:
First, an explanation of the xsl:template element. It is, as its name implies, a template: a container for a set of rules to apply actions against the source tree to yield a result tree. It takes the form:
  <xsl:template match="nodeToMatch" [name="templateName"]>
    <!-- template actions insert here -->
  </xsl:template>
If the name attribute is used, the match attribute can be omitted (see discussion below). The name attribute is optional, but if it is not used, the match attribute becomes required.


<xsl:apply-templates> Element
Means select all of the children of the current node in the XML source tree. For each one, find the matching template rule in the stylesheet and process that rule.
Rules that can be matched are:
- None: you are not required to have a template rule for each child node.
- The template match rules you define.
- Default rules built in to the XSL processor.
Applies rules recursively.

  <xsl:template match="list">
    <h1><xsl:apply-templates/></h1>
    ...
  </xsl:template>

Figure 9-15. <xsl:apply-templates> Element

Notes:
In addition to invoking further template processing with the xsl:apply-templates element, templates can be invoked by name, using the following:
  <xsl:call-template name="templateName">
    [<xsl:with-param ...>]
  </xsl:call-template>
This can be shortened to <xsl:call-template name="templateName"/>. The difference between xsl:apply-templates and xsl:call-template is that with call-template the current node and node list remain the same: the named template's rules act on the current node and node list. apply-templates invokes other templates, which may or may not change the current node and node list.
Slide Example
The apply-templates action occurs after the initial match on the list element. The processor then goes on to apply whatever other templates match the children of list, and their output is placed inside the h1 element.

Pattern Matching (XPath) Examples


Matching all children of book.
<xsl:template match="list/book/*">

Using an OR statement.
<xsl:template match="book/author | book/pub">

OR matches have the lowest priority.
Testing (using predicates) for a condition.
<xsl:template match="book[@bktype='paperback']">

Any XPath predicate test may be used.
Any test can use the not() function.
<xsl:template match="book[not(@bktype='paperback')]">

Figure 9-16. Pattern Matching (XPath) Examples

Notes:
You can skip nodes in the path by using the double slash. Using the example /list/book/title, if you wanted to find any title element below list, you could use the path statement list//title. This would find any title, no matter how many levels below list it was. In these examples, we use the <xsl:template match="nodeToSearch"> element.
Template Matching Rules
Template rules that have greater importance are chosen over those with lesser importance. This applies where a stylesheet has been imported into another; in this case, the imported templates have lesser importance than the ones native to the importing stylesheet. Templates can also be assigned a priority using the priority attribute.
Other Matches
comment() - To match comment children of the current node.
processing-instruction() - To match processing instructions of the current node (written pi() in early Working Drafts).
id() - To match the element whose attribute of type ID has the given value.
text() - To match any text node.
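The node tests and the priority attribute can be seen together in a short sketch. It is illustrative only and assumes the book, title, and bktype names used on the slide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Copy each source comment into the result. -->
  <xsl:template match="comment()">
    <xsl:comment><xsl:value-of select="."/></xsl:comment>
  </xsl:template>

  <!-- Drop processing instructions: an empty template produces nothing. -->
  <xsl:template match="processing-instruction()"/>

  <!-- Any book. The default priority of a plain element name is 0. -->
  <xsl:template match="book">
    <p><xsl:value-of select="title"/></p>
  </xsl:template>

  <!-- Paperbacks only. The explicit priority makes this rule preferred
       over the general book rule when both match. -->
  <xsl:template match="book[@bktype='paperback']" priority="2">
    <p><xsl:value-of select="title"/> (paperback)</p>
  </xsl:template>
</xsl:stylesheet>
```

Import precedence is considered first; priority only decides between rules of the same import precedence.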


Default of <xsl:apply-templates />


Can be used to output the value of the node being matched when used with leaf nodes.
This example creates HTML table cells containing book authors' names and book prices:
  <xsl:template match="author|price">
    <td><xsl:value-of select="."/></td>
  </xsl:template>
Can be rewritten with the same result, using apply-templates:
  <xsl:template match="author|price">
    <td><xsl:apply-templates/></td>
  </xsl:template>

This assumes that there are no elements nested in the content of <author> and <price>.
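The two forms give the same result because of the processor's built-in rules. Written out as ordinary templates they look roughly like this (a sketch following the XSLT 1.0 Recommendation; the per-mode variants of the first rule are left out):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Elements and the root node: keep going down into the children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy the string value to the result. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: produce nothing. -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

For a leaf element such as author, apply-templates selects its single text child, and the built-in text rule writes that text, which is what value-of select="." would have written.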

Figure 9-17. Default of <xsl:apply-templates />

Notes:


<xsl:value-of> Element
  <xsl:value-of select="patternToMatch"/>
Used to extract a specific value from the source tree. Inserts into the result tree, verbatim, the string value of the string, element, or attribute selected by patternToMatch.
Example source:
  <book ID="999">
    <author>Dan Big</author>
    <title>Large Stories</title>
    <price>$7.00</price>
  </book>
Template:
  <xsl:template match="list/book">
    <td><xsl:value-of select="title"/></td>
  </xsl:template>
Result:
  <td>Large Stories</td>

Figure 9-18. <xsl:value-of> Element

Notes:
In this example, the contents of the title child of the book element are extracted into a td element in the result tree. The value-of element produces an output text node; the td element markup is supplied explicitly. <xsl:value-of select="."/> is a common alternative to <xsl:apply-templates/> when there are no child elements.


Control Elements
<xsl:apply-templates/>
<xsl:call-template name="templateName"/>
  Calls a template named "templateName".
<xsl:if>
  Allows a conditional test or tests.
<xsl:choose>
  Allows a choice of one or more tests and permits a default condition:
  <xsl:when> the tested condition
  <xsl:otherwise> the default condition
<xsl:for-each>
  Used to iterate over the result of an XPath expression.

Figure 9-19. Control Elements

Notes:
These elements are used to control the flow of the processing.


XML Input As a Tree


Books.xml:
  <?xml version="1.0" encoding="UTF-8"?>
  <!DOCTYPE list SYSTEM "books.dtd">
  <list>
    <book ID="888">
      <author>John Smith</author>
      <title>New Cars</title>
      <price>$8.00</price>
    </book>
    <book ID="999">
      <author>Dan Big</author>
      <title>Large Stories</title>
      <price>$7.00</price>
    </book>
  </list>

The tree is created by parsing the XML document:
  ROOT                          (the tree has an implicit root node)
    <list>                      (<list> is the outermost element)
      <book ID="888">           (<book> is a subelement (child) of <list>)
        <author>  "John Smith"
        <title>   "New Cars"
        <price>   "$8.00"
      <book ID="999">
<author>, <title>, and <price> are subelements (children) of <book>; their own children are text nodes.
Note: The children of the second book are not shown.

Figure 9-20. XML Input As a Tree

Notes:
The following visuals give a complete listing of the files used in the example that transforms Books.xml to HTML using Books.xsl as the stylesheet. HTML written in a stylesheet must be XHTML compliant, so that the stylesheet is well-formed XML and a valid tree structure is produced. If you have HTML that is not well-formed (for example, <br> with no closing tag), the XSLT processor will report an error.


Desired HTML Output


  <html>
    <head><title>Book List</title></head>
    <body>
      <h1>Book List</h1>
      <table border="1" cols="3" width="100%">
        <tbody>
          <tr>
            <td>888</td>
            <td>John Smith</td>
            <td>$8.00</td>
          </tr>
          <tr>
            <td>999</td>
            <td>Dan Big</td>
            <td>$7.00</td>
          </tr>
        </tbody>
      </table>
    </body>
  </html>

The HTML produced must be well-formed.
The table cells hold data taken from the XML document (nodes).
This is a tree structure: <body> is a child of <html>; <h1> and <table> are children of <body>; and so forth.
Figure 9-21. Desired HTML Output

Notes:


XML to HTML (1 of 5)
Books.xml:
  <?xml version="1.0" encoding="UTF-8"?>
  <list>
    <book ID="888">
      <author>John Smith</author>
      <title>New Cars</title>
      <price>$8.00</price>
    </book>
    ...

The processor looks for <xsl:template match="/">, which matches the root node of the source tree (the parent of our root element list). Found! It copies the non-XSLT elements in that template to the output tree, so we get the first part of our HTML.

Books.xsl:
  <?xml version="1.0" ?>
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
      <html>
        <head><title>Book List</title></head>
        <body>
          <table border="1" cols="3" width="100%">
            <tbody>
              <xsl:apply-templates />
            </tbody>
          </table>
        </body>
      </html>
    ... (remaining templates omitted for clarity)

HTML Output:
  <html>
    <head><title>Book List</title></head>
    <body>
      <table border="1" cols="3" width="100%">
        <tbody>
Figure 9-22. XML to HTML (1 of 5)

Notes:
The first pattern to match is the one for the root (in our case, the root node whose only element child is <list>). Here it would not matter whether the template used match="/" or match="list": either way, the plain HTML code in the template is transferred to the output tree. HTML written in a stylesheet must be XHTML compliant so that a valid XML tree structure is produced. If you have HTML that is not well-formed (for example, <br> with no closing tag), the XSLT processor will report an error.


XML to HTML (2 of 5)
Books.xml:
  ...
  <book ID="888">
    <author>John Smith</author>
    <title>New Cars</title>
    <price>$8.00</price>
  </book>
  ...

While processing match="/", we come to <xsl:apply-templates/>. The processor looks for templates for the children of list (that is, book), finds <xsl:template match="book">, and processes that template. <xsl:value-of select="@ID"/> writes the value of the attribute ID to the output tree.

Books.xsl:
  <xsl:template match="/">
    ...
    <xsl:apply-templates />
    ...
  </xsl:template>
  <xsl:template match="book">
    <tr>
      <td><xsl:value-of select="@ID" /></td>
      <xsl:apply-templates select="author|price"/>
    </tr>
  </xsl:template>

HTML Output:
  <html>
    <head><title>Book List</title></head>
    <body>
      <table border="1" cols="3" width="100%">
        <tbody>
          <tr>
            <td>888</td>
Figure 9-23. XML to HTML (2 of 5)

Notes:
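<xsl:apply-templates/> tells the processor to select the children of the current node and find the matching template for each. No template matches list, so the built-in rule passes processing on to its children; the next template that matches is match="book".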


XML to HTML (3 of 5)
Books.xml:
  ...
  <book ID="888">
    <author>John Smith</author>
    <title>New Cars</title>
    <price>$8.00</price>
  </book>
  ...

While processing match="book", we come to <xsl:apply-templates select="author|price"/>. The processor looks for templates for the author and price children of book, finds <xsl:template match="author|price">, and processes that template. <xsl:value-of select="."/> writes the value of the element node to the output tree.

Books.xsl:
  <xsl:template match="book">
    <tr>
      <td><xsl:value-of select="@ID" /></td>
      <xsl:apply-templates select="author|price"/>
    </tr>
  </xsl:template>
  <xsl:template match="author|price">
    <td><xsl:value-of select="." /></td>
  </xsl:template>

HTML Output:
  <html>
    <head><title>Book List</title></head>
    <body>
      <table border="1" cols="3" width="100%">
        <tbody>
          <tr>
            <td>888</td>
            <td>John Smith</td>
            <td>$8.00</td>
          </tr>

Figure 9-24. XML to HTML (3 of 5)

Notes:
<xsl:template match="author | price">
The | is "or" from XPath. The processor calls this template once for each author node that is a child of book and once for each price node that is a child of book. Later we will look at other options for generating the same result (in XSLT we have many processing options).


XML to HTML (4 of 5)
Books.xml:
  ...
  <book ID="999">
    <author>Dan Big</author>
    <title>Large Stories</title>
    <price>$7.00</price>
  </book>
  </list>

The processor now looks for and finds another book node to process. Output for that book node is added to the output tree.

Books.xsl:
  <xsl:template match="book">
    <tr>
      <td><xsl:value-of select="@ID" /></td>
      <xsl:apply-templates select="author|price"/>
    </tr>
  </xsl:template>
  <xsl:template match="author|price">
    <td><xsl:value-of select="." /></td>
  </xsl:template>
  (Other templates omitted for clarity)

HTML Output:
  <html>
    ..
    <tr>
      <td>888</td>
      <td>John Smith</td>
      <td>$8.00</td>
    </tr>
    <tr>
      <td>999</td>
      <td>Dan Big</td>
      <td>$7.00</td>
    </tr>

Figure 9-25. XML to HTML (4 of 5)

Notes:
After processing the first <book> node, processing of the next book element takes place.


XML to HTML (5 of 5)
Books.xml:
  <?xml version="1.0" encoding="UTF-8"?>
  <list>
    <book ID="888">
      <author>John Smith</author>
      <title>New Cars</title>
      <price>$8.00</price>
    </book>
    ...

The processor now finishes the processing of the root node and adds the end tags for tbody, table, body, and html.

Books.xsl:
  <xsl:template match="/">
    <html>
      <head><title>Book List</title></head>
      <body>
        <table border="1" cols="3" width="100%">
          <tbody>
            <xsl:apply-templates />
          </tbody>
        </table>
      </body>
    </html>
  </xsl:template>
  (Other templates omitted for clarity)

HTML Output:
  <html>
    ..
    <tr>
      <td>888</td>
      <td>John Smith</td>
      <td>$8.00</td>
    </tr>
    <tr>
      <td>999</td>
      <td>Dan Big</td>
      <td>$7.00</td>
    </tr>
    </tbody>
    </table>
    </body>
  </html>

Figure 9-26. XML to HTML (5 of 5)

Notes:
After the last book node has been processed, control returns to the template for the root node, which writes the remaining closing tags.
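Putting the five steps together, a complete Books.xsl could look like the sketch below. The three templates are the ones shown on the slides; the xsl:output and xsl:strip-space elements are additions that the slides do not show, included on the assumption that HTML serialization is wanted and that line breaks between source elements should not reach the table.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumption, not on the slides: serialize as HTML and drop
       whitespace-only text nodes, so the built-in rules do not copy the
       line breaks between book elements into the result. -->
  <xsl:output method="html"/>
  <xsl:strip-space elements="*"/>

  <!-- Step 1 and step 5: the page skeleton around the table rows. -->
  <xsl:template match="/">
    <html>
      <head><title>Book List</title></head>
      <body>
        <table border="1" cols="3" width="100%">
          <tbody>
            <xsl:apply-templates/>
          </tbody>
        </table>
      </body>
    </html>
  </xsl:template>

  <!-- Steps 2 and 4: one table row per book. -->
  <xsl:template match="book">
    <tr>
      <td><xsl:value-of select="@ID"/></td>
      <xsl:apply-templates select="author|price"/>
    </tr>
  </xsl:template>

  <!-- Step 3: one table cell per author or price. -->
  <xsl:template match="author|price">
    <td><xsl:value-of select="."/></td>
  </xsl:template>
</xsl:stylesheet>
```

There is no template for list; the built-in rule for elements carries processing from the root template down to the book elements.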


Calling <xsl:apply-templates/>
For greater control over which nodes are processed, explicitly call the templates for specific nodes. You are not looking for a specific template, but for the template for the specific node. In our prior example, we decided that only the price and author of the book should be output, not the title.
  <xsl:template match="book">
    <tr>
      <td><xsl:value-of select="@ID" /></td>
      <xsl:apply-templates select="author|price" />
    </tr>
  </xsl:template>
  <xsl:template match="author|price">
    <td><xsl:value-of select="." /></td>
  </xsl:template>

Figure 9-27. Calling <xsl:apply-templates/>

Notes:
Use XPath statements to 'select' the nodes whose templates you are looking for.


Named Templates
Templates can be named and invoked by name. All templates must have a name or a match attribute. When calling a template, the current node remains unchanged.
Defining a named template (the name must be a valid XML name):
  <xsl:template name="bookTitle">
    <h1><xsl:value-of select="."/></h1>
  </xsl:template>
Calling a named template:
  <xsl:template match="title">
    <xsl:call-template name="bookTitle"/>
    <!-- any other template actions -->
  </xsl:template>

Figure 9-28. Named Templates

Notes:
The difference between xsl:apply-templates and xsl:call-template is that with call-template the current node and node list remain the same, and the named template's rules act on that current node and node list. apply-templates selects nodes and invokes the templates that match them, so the current node and node list may change.
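Named templates are also how recursion and parameter passing are usually written, since a variable cannot be updated in a loop. The following is a minimal sketch, not taken from the course files: the template name repeat and its parameters text and count are made up for illustration, and the source is assumed to contain book elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: writes $text $count times by calling itself
       with a smaller count until the count reaches zero. -->
  <xsl:template name="repeat">
    <xsl:param name="text"/>
    <xsl:param name="count" select="0"/>
    <xsl:if test="$count &gt; 0">
      <xsl:value-of select="$text"/>
      <xsl:call-template name="repeat">
        <xsl:with-param name="text" select="$text"/>
        <xsl:with-param name="count" select="$count - 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Entry point: write one asterisk for every book element. -->
  <xsl:template match="/">
    <xsl:call-template name="repeat">
      <xsl:with-param name="text" select="'*'"/>
      <xsl:with-param name="count" select="count(//book)"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

Each call receives its own value of count through xsl:with-param, so nothing is ever reassigned; for a source with two book elements the output is two asterisks.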


<xsl:for-each> Element
  <xsl:for-each select="nodeSetExpression">
Used to iterate over the result of the select expression. Each selected node in turn becomes the current node.

Books.xml:
  <list>
    <book ID="666">
      <author>Jim Blue</author>
      <title>Blue Flowers</title>
    </book>
    <book ID="888">
      <author>John Smith</author>
      <title>New Cars</title>
    </book>
    <book ID="999">
      <author>Dan Big</author>
      <title>Large Stories</title>
    </book>
  </list>

Books.xsl:
  <xsl:template match="/">
    <xsl:for-each select="//book">
      <p><xsl:value-of select="title"/></p>
    </xsl:for-each>
  </xsl:template>

Books.html:
  <p>Blue Flowers</p>
  <p>New Cars</p>
  <p>Large Stories</p>

Figure 9-29. <xsl:for-each> Element

Notes:
Using for-each instead of apply-templates is often called a pull model of processing, because you are explicitly choosing when to process the nodes. It is best used when the data is regular and predictable. The for-each mechanism can be nested.
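Nesting, together with the built-in sorting support, might be sketched as follows. The example is illustrative and assumes the book, title, and author elements of the slide's Books.xml; the ul and li wrapper elements are arbitrary choices.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <ul>
      <!-- Outer loop: each book becomes the current node in turn. -->
      <xsl:for-each select="//book">
        <!-- xsl:sort must come first inside xsl:for-each; here the books
             are processed in title order instead of document order. -->
        <xsl:sort select="title"/>
        <li>
          <xsl:value-of select="title"/>
          <!-- Inner loop: paths are now relative to the current book. -->
          <xsl:for-each select="author">
            <xsl:text> / </xsl:text>
            <xsl:value-of select="."/>
          </xsl:for-each>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```

With the three books from the slide, the items come out in the order Blue Flowers, Large Stories, New Cars.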


Time for a Lab

XSL Lab #1: Simple XSL Transforms

Figure 9-30. Time for a Lab

Notes:


<xsl:if> Element
  <xsl:if test="patternToMatch">
Used to process content conditionally, depending on the test expression. Can also be used with the logical not() function.

Books.xml:
  <list>
    <book ID="666">
      <author>Jim Blue</author>
      <author>Mike Yellow</author>
      <author>Dan Farm</author>
      <title>Blue Flowers</title>
    </book>
  </list>

Books.xsl:
  <xsl:template match="list/book">
    <xsl:value-of select="title"/> by <xsl:for-each select="author">
      <xsl:value-of select="." />
      <xsl:if test="position()!=last()">, </xsl:if>
      <xsl:if test="position()=last()-1">and </xsl:if>
    </xsl:for-each>
  </xsl:template>

Output:
  Blue Flowers by Jim Blue, Mike Yellow, and Dan Farm

Figure 9-31. <xsl:if> Element

Notes:
The xsl:if conditional can be used to test for a certain situation within a template. It can be used in conjunction with other actions, and more than one xsl:if action can appear within a template. A test can also combine conditions with the XPath operators and and or (for example, a currency of British Pounds or Yen), although the example above does not do so.
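The following sketch shows a logical OR and a negated test. The price element and its currency attribute are hypothetical names chosen for illustration; they do not appear in the course files.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Hypothetical input element: <price currency="GBP">12.50</price> -->
  <xsl:template match="price">
    <!-- Logical OR: true when the currency is either GBP or JPY -->
    <xsl:if test="@currency='GBP' or @currency='JPY'">
      <xsl:value-of select="."/>
      <xsl:text> (foreign currency)</xsl:text>
    </xsl:if>
    <!-- not() negates a condition; xsl:if has no else branch -->
    <xsl:if test="not(@currency)">
      <xsl:value-of select="."/>
      <xsl:text> (currency not stated)</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Since xsl:if has no else branch, mutually exclusive cases need either a second xsl:if with the opposite test or an xsl:choose.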


<xsl:choose> Element
<xsl:choose>
  <xsl:when test="testCondition">
    <!-- ... other actions ... -->
  </xsl:when>
  <xsl:otherwise>
    <!-- ... alternative actions ... -->
  </xsl:otherwise>
</xsl:choose>

Multiple when tests can be implemented. The otherwise element is optional and must be the last child element of <xsl:choose> when present (used as a default if the other tests fail).

Figure 9-32. <xsl:choose> Element


<xsl:choose> Example

Books.xml:
<list>
  <book ID="666">
    <chapter>First Chapter</chapter>
    <chapter>Second Chapter</chapter>
    <appendix>XSLT reference</appendix>
  </book>
</list>

Books.xsl:
<xsl:stylesheet xmlns:xsl='...
  <xsl:template match="//book">
    <xsl:for-each select="*">
      <p>
        <xsl:choose>
          <xsl:when test='name()="chapter"'>Chapter: </xsl:when>
          <xsl:when test='name()="appendix"'>Appendix: </xsl:when>
          <xsl:otherwise>Index: </xsl:otherwise>
        </xsl:choose>
        <xsl:value-of select="." />
      </p>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

Books.html:
<p>Chapter: First Chapter</p>
<p>Chapter: Second Chapter</p>
<p>Appendix: XSLT reference</p>

Figure 9-33. <xsl:choose> Example


Elements to Generate Output (XML to XML)


Any and all XSLT elements can be used to transform from XML to HTML or from XML to XML. So far we have focused on the elements commonly used when generating HTML. Now we will focus on some elements that are particularly useful when transforming from XML to XML:
  <xsl:element>
  <xsl:attribute>
  <xsl:copy>: copies the current node from the source tree to the result tree
  <xsl:processing-instruction>: adds a processing instruction node

Figure 9-34. Elements to Generate Output (XML to XML)

Notes:
A very common use of XSLT is to translate and transform from one XML vocabulary to another. XSLT provides some built-in elements to help with these types of transformations. Remember: we are always building an output tree. Now we will look at ways to add nodes directly onto that output tree.

<xsl:copy [use-attribute-sets]> ... </xsl:copy>
Copies the current node from the source tree to the result tree. The attributes and children of the node are not copied automatically.

Example:
<xsl:template match="list/book">
  <xsl:copy>
    <xsl:apply-templates select="title"/>
  </xsl:copy>
</xsl:template>

Outputting a processing instruction:
<xsl:processing-instruction name="WordPro">file="doc.lwp"</xsl:processing-instruction>
produces
<?WordPro file="doc.lwp"?>
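A widely used idiom built on xsl:copy is the identity transform, sketched below. It is not part of the course example. It copies every node and attribute unchanged, so a stylesheet only needs extra templates for the nodes it wants to change. The choice to drop author elements here is purely illustrative.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Identity rule: copy the current node, then process its attributes and children -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Illustrative override: an empty template drops every author element -->
  <xsl:template match="author"/>
</xsl:stylesheet>
```

The more specific author pattern takes priority over the generic node() pattern, so everything except author elements passes through unchanged.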


<xsl:element> Element
Creates an element in the result tree. Content (inside xsl:element) can be:
  <xsl:attribute> (create attribute)
  <xsl:element> (create child element)
  Text

<xsl:element name="element-name">
  <!-- content: attributes, child elements, text -->
</xsl:element>

Alternative: literal result elements.

<element-name>
  <!-- content: attributes, child elements, text -->
</element-name>

Figure 9-35. <xsl:element> Element

Notes:
Attributes are always created inside an element. Elements may be nested inside other elements (child elements).
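One case where xsl:element is needed rather than a literal result element is when the element name is only known at run time. The sketch below assumes a hypothetical input element named field whose name attribute holds the desired element name.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Hypothetical input: <field name="title">Blue Flowers</field> -->
  <xsl:template match="field">
    <!-- The braces are evaluated at run time, so the new element is named after @name -->
    <xsl:element name="{@name}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

For the input shown in the comment, this produces <title>Blue Flowers</title>. The value of the name attribute must be a legal XML element name.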


<xsl:attribute>
Creates an attribute in the result tree. Content is the text value for the attribute. All attributes must precede the first child element.

<xsl:attribute name="attribute-name">
  <!-- content: text value -->
</xsl:attribute>

Example: create an attribute named "id"

<xsl:attribute name="id">
  <xsl:value-of select="@no"/>   <!-- XPath: attribute "no" of current node -->
</xsl:attribute>

The value of the id attribute is the value of attribute "no" of the current node.

Figure 9-36. <xsl:attribute>


XML to XML Example (1 of 2)


input.xml:
<company>
  <div no="7a">
    <dept no="42">
      <emp no="123456" name="Whoptimone, Ida"/>
      <emp no="651432" name="Tirebiter, George"/>
    </dept>
    <dept no="51">
      <emp no="832953" name="Danger, Nick"/>
    </dept>
  </div>
  <div no="2b">
    <dept no="57">
      <emp no="283412" name="Boss, Yuda"/>
    </dept>
  </div>
</company>

output.xml:
<company>
  <employee id="123456">
    <name>Whoptimone, Ida</name>
    <division>7a</division>
    <department>42</department>
  </employee>
  <employee id="651432">
    <name>Tirebiter, George</name>
    <division>7a</division>
    <department>42</department>
  </employee>
  <employee id="832953">
    <name>Danger, Nick</name>
    <division>7a</division>
    <department>51</department>
  </employee>
  <employee id="283412">
    <name>Boss, Yuda</name>
    <division>2b</division>
    <department>57</department>
  </employee>
</company>

Figure 9-37. XML to XML Example (1 of 2)

Notes:
In this example, we want to transform input.xml into output.xml. What changed:
  Restructured: it was a company list by div, dept, and emp; it is now a company list by employee.
  Data from most attributes is now in sub-elements.
  Element and attribute names.


XML to XML Example (2 of 2)


input.xml:
<company>
  <div no="7a">
    <dept no="42">
      <emp no="123456" name="Whoptimone, Ida"/>
      <emp no="651432" name="Tirebiter, George"/>
    </dept>
    <dept no="51">
      <emp no="832953" name="Danger, Nick"/>
    </dept>
  </div>
  ...

output.xml:
<company>
  <employee id="123456">
    <name>Whoptimone, Ida</name>
    <division>7a</division>
    <department>42</department>
  </employee>
  <employee id="651432">
    <name>Tirebiter, George</name>
    <division>7a</division>
    <department>42</department>
  </employee>
  ...

transform.xsl:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <xsl:element name="company">
      <xsl:for-each select="//emp">
        <xsl:element name="employee">
          <xsl:attribute name="id">
            <xsl:value-of select="@no"/>
          </xsl:attribute>
          <name><xsl:value-of select="@name"/></name>
          <xsl:element name="division">
            <xsl:value-of select="../../@no"/>
          </xsl:element>
          ...

Figure 9-38. XML to XML Example (2 of 2)

Notes:
In this example, we want to transform the input.xml into the output.xml.
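The slide elides the rest of transform.xsl. One way it might be completed is sketched below. The department element and the xsl:output line are assumptions inferred from output.xml, not taken from the original stylesheet.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumption: indented XML output is wanted -->
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <xsl:element name="company">
      <xsl:for-each select="//emp">
        <xsl:element name="employee">
          <xsl:attribute name="id">
            <xsl:value-of select="@no"/>
          </xsl:attribute>
          <name><xsl:value-of select="@name"/></name>
          <!-- Two levels up from emp is the enclosing div -->
          <division><xsl:value-of select="../../@no"/></division>
          <!-- One level up from emp is the enclosing dept (assumed; elided on the slide) -->
          <department><xsl:value-of select="../@no"/></department>
        </xsl:element>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

The sketch mixes xsl:element and literal result elements, as the slide does, to show that the two forms are interchangeable when the element name is fixed.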


Numbers, Sorting, and Functions


<xsl:number>
  Used to allocate a sequential number to the current node
  Used to format a number
<xsl:sort>
  Means to sort a collection of nodes in ascending or descending order
Functions
  Built into XPath and/or XSLT
  Used in the XPath expressions

Figure 9-39. Numbers, Sorting, and Functions

Notes:
This number formatting is different from the format-number() function and <xsl:decimal-format>.


Working with Numbering in XSLT


Emit number sequences with:

<xsl:number level="lvl" value="nbrExp" count="node" format="nbrFmt"/>


Emits numbers only; item content is emitted separately.
The level attribute determines which levels of the source tree are counted (single, multiple, or any).
The value attribute supplies an expression whose result is the number to format.
The count attribute determines which elements are counted.
The format attribute determines how the numbers are formatted (see table, next slide).
Unless value is specified, the number is computed from the position of the current node in the source tree; the formatted number is inserted into the result tree.
If numbering is applied after a sort, use value="position()"; otherwise the numbers may appear out of sequence.

Figure 9-40. Working with Numbering in XSLT

Notes:
The xsl:number element provides a function similar to a programming language's number function; it allows number formatting, establishment of boundaries, and other parameters. Numbering formats are covered further in the next slide. Several optional attributes were not shown in the element sample:

letter-value: disambiguates between numbering sequences that use letters. In many languages there are two commonly used numbering sequences that use letters (in English, a, b, c, and so forth, and i, ii, iii, and so forth). One numbering sequence assigns numeric values to letters in alphabetic sequence, and the other assigns numeric values to each letter in some other manner traditional in that language. In English, these correspond to the numbering sequences specified by the format tokens a and i. In some languages, the first member of each sequence is the same, so the format token alone would be ambiguous. A letter-value of "alphabetic" specifies the alphabetic sequence; a value of "traditional" specifies the other sequence. If the letter-value attribute is not specified, how the ambiguity is resolved depends on the XSLT processor.

grouping-separator: specifies a character used between groups of digits.

grouping-size: specifies the number of digits in each group.

lang: specifies the language whose alphabet is used; it takes the same range of values as xml:lang (for example, en for English, en-US for American English, and en-GB for British English).

from: limits how far back the count reaches; its value is a pattern (typically an element name) that identifies where counting starts.

The level attribute specifies which levels of the source tree are counted. It takes one of these three values:

single: numbers the nearest ancestor-or-self node that matches the count attribute among its matching preceding siblings. This is the default.
multiple: produces a composite number (such as 2.A.a) with one component for each ancestor-or-self node that matches the count attribute, each numbered among its matching preceding siblings.
any: numbers the node among all matching nodes that precede it in document order, at any level of the document.

In no case are nodes deeper than the current node counted.

There are other numbering possibilities available with the <xsl:number/> element; see http://www.w3.org/TR/xslt, section 7.7, for more information.
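The interaction between sorting and numbering can be sketched as follows, assuming the three-book Books.xml used with the for-each example earlier in this unit. The text output method and the line-feed separator are illustrative choices.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="list/book">
      <xsl:sort select="title"/>
      <!-- position() is the rank within the sorted list, not the source order -->
      <xsl:number value="position()" format="1. "/>
      <xsl:value-of select="title"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Without the value attribute, xsl:number would count each book among its siblings in the source document, so Large Stories (the third book in the source) would be numbered 3 even though it sorts second.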


<xsl:number> Element format Attribute Values

Attribute Value   Description
1                 Use standard numbers (1, 2, 3, 4, etc.)
A                 Use standard capital letters (A, B, C, etc.)
a                 Use standard lowercase letters (a, b, c, etc.)
i                 Use lowercase Roman numerals (i, ii, iii, iv, etc.)
I                 Use capital Roman numerals (I, II, III, IV, etc.)
&#x30A2;          Use katakana numbering
&#x30A4;          Use katakana numbering in iroha order
&#x0E51;          Use Thai digits for numbering
&#x05D0;          Use traditional Hebrew; letter-value value is "traditional"
&#x10D0;          Use Georgian; letter-value value is "traditional"
&#x03B1;          Use Classical Greek; letter-value value is "traditional"
&#x0430;          Use Old Slavic; letter-value value is "traditional"

Example: <xsl:number format="A"/>

Figure 9-41. <xsl:number> Element format Attribute Values

Notes:
The character references shown are hexadecimal Unicode (ISO/IEC 10646) code points. Other language-specific schemes may be supported as well; in addition, Unicode allows user-defined (private-use) assignments. How these are supported in language-dependent numbering schemes is not clear; in such cases, it would probably be best to make it an XML-application issue and not an XSLT-processor issue.


<xsl:number> Example

Books.xml:
<list>
  <book ID="666">
    <chapter><title>Mission Statement</title></chapter>
    <chapter>
      <title>Organization</title>
      <section><title>SCM</title></section>
      <section><title>CRM</title></section>
    </chapter>
    <chapter>
      <title>Departments</title>
      <section><title>Executive</title></section>
      <section><title>Financial</title>
        <clause><title>Accounts Payable</title></clause>
        <clause><title>Accounts Receivable</title></clause>
      </section>
    </chapter>
  </book>
</list>

Books.xsl:
<xsl:stylesheet version='1.0' xmlns:xsl='http:...
  <xsl:output method="text" />
  <xsl:template match="/">
    <xsl:for-each select="list/book//title">
      <xsl:number level="multiple" format="1.A.a. " count="chapter | section | clause"/>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

Result:
1. Mission Statement
2. Organization
2.A. SCM
2.B. CRM
3. Departments
3.A. Executive
3.B. Financial
3.B.a. Accounts Payable
3.B.b. Accounts Receivable

Figure 9-42. <xsl:number> Example

Notes:
In this example, a single template numbers each title according to the chapter, section, and clause that contain it. Chapters are numbered 1, 2, 3; sections A, B, C; and clauses a, b, c.


<xsl:sort> Element
<xsl:sort select="strExp"
          lang="nmtoken"
          data-type={"text"|"number"|qname}
          order={"ascending"|"descending"}
          case-order={"upper-first"|"lower-first"} />

A means to sort a collection of nodes in ascending or descending order.
Provides alphabetic or numeric sorting.
Used within <xsl:apply-templates> or <xsl:for-each>.
Sort before numbering!

Figure 9-43. <xsl:sort> Element

Notes:
Sorting is specified by adding xsl:sort elements as children of an xsl:apply-templates or xsl:for-each element. The first xsl:sort child specifies the primary sort key, the second xsl:sort child specifies the secondary sort key, and so on. When an xsl:apply-templates or xsl:for-each element has one or more xsl:sort children, then instead of processing the selected nodes in document order, it sorts the nodes according to the specified sort keys and then processes them in sorted order. When used in xsl:for-each, xsl:sort elements must occur first.

When a template is instantiated by xsl:apply-templates or xsl:for-each, the current node list consists of the complete list of nodes being processed, in their sorted order.

The select attribute value is an expression: for each node to be processed, the expression is evaluated with that node as the current node and with the complete list of nodes being processed, in unsorted order, as the current node list. The resulting object is converted to a string (as if by a call to the string function); this string is used as the sort key for that node. The default value of the select attribute is "." (self), which causes the string-value of the current node to be used as the sort key.
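Primary and secondary sort keys can be sketched against the input.xml from the XML to XML example earlier in this unit. The text output and its layout are illustrative choices.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="//emp">
      <!-- Primary key: number of the enclosing div (two levels up) -->
      <xsl:sort select="../../@no"/>
      <!-- Secondary key: employee name, used only to break ties -->
      <xsl:sort select="@name"/>
      <xsl:value-of select="concat(../../@no, ' ', @name)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With that input, the employee in division 2b is listed first, followed by the three employees in division 7a in name order (Danger, Tirebiter, Whoptimone).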


<xsl:sort> Attributes
order (establishes the sorting order)
  Value: ascending | descending
  Default: ascending
lang (language of the sort keys, per xml:lang)
  Type: nmtoken
  Default: determined from the system environment
data-type (assigns a data type to the sort keys)
  Value: text | number | QName
  Default: text
case-order (sets precedence for upper or lower case)
  Value: upper-first | lower-first
  Default: language dependent (see notes)

Figure 9-44. <xsl:sort> Attributes

Notes:
data-type: text sorts the keys lexicographically according to the rules of the language; number converts each key to a number and sorts by numeric value; a qualified name (see Namespaces) may be used to identify another, processor-specific data type. As XML Schema is adopted, other data types will be added (per W3C Note).

case-order: upper-first gives an ordering of A a B b, and so forth (in English), while lower-first gives a A b B, and so forth. The default value is language dependent.

A W3C note in section 10, Sorting, states: "It is possible for two conforming XSLT processors not to sort exactly the same. Some XSLT processors may not support some languages. Furthermore, there may be variations possible in the sorting of any particular language that are not specified by the attributes on xsl:sort, for example, whether Hiragana or Katakana is sorted first in Japanese. Future versions of XSLT may provide additional attributes to provide control over these variations. Implementations may also use implementation-specific namespaced attributes on xsl:sort for this. It is recommended that implementors consult Unicode TR10 for information on internationalized sorting (see http://www.unicode.org/unicode/reports/tr10/index.html for details)."
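The effect of data-type can be sketched as follows. The items and item elements and the price attribute are hypothetical, introduced only for this illustration.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Hypothetical input:
       <items><item price="9"/><item price="10"/><item price="100"/></items> -->
  <xsl:template match="/">
    <xsl:for-each select="items/item">
      <!-- number: keys are converted to numbers before they are compared -->
      <xsl:sort select="@price" data-type="number"/>
      <xsl:value-of select="@price"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With data-type="number" the prices come out as 9, 10, 100. With the default data-type of text, the keys are compared as strings, and a processor would typically place 10 and 100 before 9.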


Sort Example
words.xml:
<List>
  <word id="Czech"/>
  <word id="czech"/>
  <word id="cook"/>
  <word id="Took"/>
  <word id="took"/>
  <word id="TooK"/>
</List>

sort.xsl:
<xsl:template match="/">
  <table>
    <tbody>
      <xsl:for-each select="//word">
        <xsl:sort select="@id" case-order="lower-first"/>
        <tr><td><xsl:value-of select="@id"/></td></tr>
      </xsl:for-each>
    </tbody>
  </table>
</xsl:template>

HTML Output:
<table>
  <tbody>
    <tr><td>cook</td></tr>
    <tr><td>czech</td></tr>
    <tr><td>Czech</td></tr>
    <tr><td>took</td></tr>
    <tr><td>Took</td></tr>
    <tr><td>TooK</td></tr>
  </tbody>
</table>

Figure 9-45. Sort Example

Notes:
Sort example showing the effect of case ordering: with lower-first ordering, a lowercase letter sorts before its uppercase equivalent (czech before Czech). The words are held in the id attribute, so both the sort key and the displayed value select @id; the word elements themselves are empty.


XPath/XSLT Functions
Category                    XPath                           XSLT
type conversion             boolean(), string(), ...        format-number()
arithmetic                  round(), ceiling(), ...         (none)
string manipulation         concat(), substring(), ...      (none)
aggregation                 count(), sum()                  (none)
get node information        local-name(), name(), lang()    generate-id(), unparsed-entity-uri()
Boolean                     not(), false(), true()          (none)
get context information     last(), position()              current()
find nodes                  id()                            document(), key()
get processor information   (none)                          element-available(), function-available(), system-property()

Figure 9-46. XPath/XSLT Functions

Notes:
Not all XPath functions are listed. Refer to XPath LO. List of functions that can be used when transforming:
  document: finds an external document by resolving a URI reference
  key: finds the nodes with a given value for a named key; used in conjunction with <xsl:key>
  format-number: converts numbers into strings for formatted display
  current: returns the single current node
  unparsed-entity-uri: gives access to declarations of unparsed entities in the DTD of the source document
  generate-id: generates a unique id that identifies the node (the result might differ from one processor to another)
  system-property: returns information about the processing environment
  element-available: checks whether a particular XSLT instruction or extension element is available
  function-available: checks whether a function is available; might be used to test whether a certain extension function is available
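A few of these functions are sketched together below, assuming the three-book Books.xml from the for-each example as input. The labels in the output text are illustrative.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- system-property: report which processor is running the stylesheet -->
    <xsl:text>Processor: </xsl:text>
    <xsl:value-of select="system-property('xsl:vendor')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- count: number of book elements anywhere in the source document -->
    <xsl:text>Books: </xsl:text>
    <xsl:value-of select="count(//book)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- format-number: turn a number into a display string using a pattern -->
    <xsl:text>Formatted: </xsl:text>
    <xsl:value-of select="format-number(1234.5, '#,##0.00')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- function-available: guard a call that a processor might not support -->
    <xsl:if test="function-available('document')">
      <xsl:text>document() is available&#10;</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The format-number call yields 1,234.50. The value reported for xsl:vendor depends on the processor in use.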


Other Elements
Attribute value templates
Variable
Parameter
Not covered:
  Key
  <xsl:apply-imports>: applies imported stylesheet templates against the current node and its children

Figure 9-47. Other Elements

Notes:
<xsl:apply-imports> will generate an error condition if an <xsl:import> element with an href attribute has not been declared at the beginning of the stylesheet.


Attribute Value Templates


Uses the curly brace characters "{" and "}"
Used inside attribute values of result-tree elements
The expression inside the braces is evaluated while processing
The braces and expression are replaced immediately by the resulting string

Figure 9-48. Attribute Value Templates

Notes:
Basically a simplified use of <xsl:attribute>.


Attribute Value Templates Example


Input:
<?xml version = "1.0" ?>
<applet>
  <code class="javaApplet"/>
  <codebase>/src/code</codebase>
</applet>

Stylesheet:
<xsl:stylesheet version='1.0' ...
  <xsl:template match="/">
    <xsl:apply-templates select="applet" />
  </xsl:template>
  <xsl:template match="applet">
    <applet code="{code/@class}" codebase="{codebase}/java" />
  </xsl:template>
</xsl:stylesheet>

Result:
<applet code="javaApplet" codebase="/src/code/java" />

Figure 9-49. Attribute Value Templates Example

Notes:
In the XSL template example above, if the input XML is:
<applet>
  <code class="javaApplet"/>
  <codebase>/src/code</codebase>
</applet>
the result of the template processing would be:
<applet code="javaApplet" codebase="/src/code/java"/>
Notice that the class attribute's value was placed directly into the output, as was the value of the codebase element followed by the literal text /java, without further action statements being required in the template.
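For comparison, the same applet template can be written without attribute value templates. This is a sketch of the longer <xsl:attribute> form, which should produce the same attributes for the same input.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="applet">
    <applet>
      <!-- Equivalent of code="{code/@class}" -->
      <xsl:attribute name="code">
        <xsl:value-of select="code/@class"/>
      </xsl:attribute>
      <!-- Equivalent of codebase="{codebase}/java" -->
      <xsl:attribute name="codebase">
        <xsl:value-of select="concat(codebase, '/java')"/>
      </xsl:attribute>
    </applet>
  </xsl:template>
</xsl:stylesheet>
```

The attribute value template form is shorter. The xsl:attribute form is still needed when the attribute name itself is computed or when the attribute is created conditionally.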


XSLT Processors
Xalan - www.apache.org
  Was supplied to Apache by Lotus (LotusXSL)
MSXSL - Microsoft
  Internet Explorer 5.x, 6
  Command line
XT (by James Clark, now fading away)
SAXON (from Michael Kay, author of XSLT Programmer's Reference)
For a more extensive list: http://www.w3.org/Style/XSL/

Figure 9-50. XSLT Processors

Notes:
List of main XSL processors.


Xalan
Named after a rare Persian musical instrument.
http://www.apache.org
Java version requires JDK/JRE 1.1.8, 1.2.2, or 1.3:
  Comes with Xerces as XML parser
  Can use other XML parsers
Now included in JDK/JRE 1.4
C++ version available.
XSLTC compiler generates a translet that generally provides better performance.

Figure 9-51. Xalan

Notes:
For more information on Xalan, see http://www.apache.org/. Xalan includes XSLTC, a stylesheet compiler that was donated by Sun Microsystems. Normally XSLT stylesheets are interpreted each time they are used. XSLTC compiles the stylesheet into a set of Java classes. This can speed up stylesheet processing fairly dramatically. XSLTC is not quite XSLT 1.0 compliant, but steady progress towards compliance is being made.


XSL Resources from IBM


WebSphere Studio 4+ (all editions)
ibm.com/alphaworks
  XML processors, tools, editors, diff tools
  XSLT processor, trace
  XSL Trace
  Visual Transformation
  IBM XSL Editor: http://www.alphaworks.ibm.com/tech/xsleditor
ibm.com/developerWorks/xml
  Articles, tutorials, source code
  Several tutorials on XSL: www6.software.ibm.com/reg/xml/transformxml-i
ibm.com/developerWorks/speakers/colan
  XSL by Example presentation, companion files
  Other presentations on XML and Web Services

Figure 9-52. XSL Resources from IBM


XSL References
Reference: http://www.w3.org/Style/XSL/
Description: W3C Specifications (XSL, XSLT, XPath)

Reference: http://www.cranesoftwrights.com/training/#ptux
Description: Practical Transformation Using XSLT and XPath training materials, Ken Holman, Crane Softwrights Ltd

Reference: www.mulberrytech.com/xsl/xsl-list/
Description: Mailing list for general XSL questions

Reference: www.xslt.com
Description: Various XSL-specific references

Reference: http://www.ibiblio.org/xml/books/bible2/chapters/ch17.html
Description: Chapter 17 of XML Bible, Elliotte Rusty Harold, IDG Books

Books:
  XSLT Programmer's Reference, Michael Kay, Wrox Press
  XSLT in a Nutshell, Doug Tidwell, O'Reilly
  XSL Companion, Neil Bradley, Addison-Wesley

Figure 9-53. XSL References


Checkpoint Questions
1. How can XML documents be transformed?
   A. XPath
   B. XSLT
   C. Notepad
   D. Xatran
2. Is an XSL Stylesheet an XML document?
   A. Yes
   B. No
   C. Depends on the header
   D. Only if it is applied to an XML document
3. What template would you use for extracting a specific value from the source tree?
   A. <xsl:choose...
   B. <xsl:copy ...
   C. <xsl:value-of select=...
   D. <xsl:text>

Figure 9-54. Checkpoint Questions


Unit Summary
In this unit we learned:
  The XSL/XSLT model and concepts
  Transformation
  XSL templates and pattern matching
  XSL elements and their attributes
  How to create simple XSL style sheets
  Some XSL Best Practices
  XSL Tools

Figure 9-55. Unit Summary


Appendix A. Introduction to Databases and XML


What This Unit is About
In this unit, you are introduced to the different approaches you can take with XML to work with databases.

What You Should Be Able to Do


After completing this unit, you should be able to:
  List the characteristics of an XML document that help determine the right type of database
  Define and describe content management databases
  Compare relational database structures to XML document structures
  List the limitations of relational data tables with structured data
  Define and describe what object-oriented databases provide
  Describe the status of XML-based queries

How You Will Check Your Progress


Accountability:
  Checkpoint
  Machine exercises


Unit Objectives
After completing this unit, you should be able to:
  List the characteristics of an XML document that help determine the right type of database
  Define and describe content management databases
  Compare relational database structures to XML document structures
  List the limitations of relational data tables with structured data
  Define and describe what object-oriented databases provide
  Describe the status of XML-based queries

Figure A-1. Unit Objectives

Notes:
Currently, W3C work on the database implementation of XML has lagged. Most relational database management systems (RDBMS) use some form of filtering or mapping to deal with XML data.


Considerations
Start with what you are using the database for:
  What type of application are you supporting?
  Is XML being used as a transport between the database and the application?
  Are you using legacy data?
  Are you more interested in the data or in the document structure?
  Are you storing Web pages or Web pages' content?
  Is your data used by other, perhaps non-XML, applications?
  Are you updating the DB from XML?

Consider whether your XML document is more data-centric or document-centric.

Figure A-2. Considerations

Notes:
These are questions to think about as you start evaluating which database to use and how to use it to support your business goals.


Data-centric versus Document-centric


Data-centric
  Highly regular structure
  Fine-grained data
  Order of elements not significant
  Little or no mixed content
  XML being used as data transport
  Often designed for machine consumption
  Legacy data

Document-centric
  Less regular structure
  Larger-grained data
  Extensive prose
  Mixed content
  Order significant (especially for siblings)
  Used for human consumption

Not every document is purely data-centric or purely document-centric; a document may be a combination of both.

Figure A-3. Data-centric versus Document-centric

Notes:
Examples of data-centric applications are many e-commerce applications and almost all B2B applications. Examples of document-centric applications are publishing applications. An example of a mixed application would be a store selling books, where the information about the shopping cart is very data-oriented, but the information about the book or its reviews is very document-oriented.
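One practical way to tell the two styles apart is to look for mixed content, that is, elements that hold both child elements and text. The sketch below is only an illustration in Python using the standard xml.etree.ElementTree module; the function name and the two sample documents are assumptions made for this example and do not come from the course material.

```python
import xml.etree.ElementTree as ET

def has_mixed_content(element):
    """Return True if any element holds both child elements and non-blank text."""
    for node in element.iter():
        children = list(node)
        if not children:
            continue  # leaf elements cannot be mixed
        # Text can sit before the first child (node.text) or after any child (tail).
        texts = [node.text] + [child.tail for child in children]
        if any(t and t.strip() for t in texts):
            return True
    return False

# Hypothetical samples: a data-centric order and a document-centric review.
order = ET.fromstring("<order><sku>B-102</sku><qty>2</qty></order>")
review = ET.fromstring("<review>A <em>superb</em> introduction to XML.</review>")

print(has_mixed_content(order))   # data-centric: no mixed content
print(has_mixed_content(review))  # document-centric: mixed content
```

Mixed content is only one of the indicators listed on the visual, so a real assessment would also consider regularity of structure and whether element order matters.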


Types of Databases
Remember: An XML document is a hierarchical, ordered, and untyped document.
Relational Database (RDB) structures are not hierarchical.
  Much of the world's current data exists in RDBs.
Object-Oriented Databases (OODB) are slow to catch on, but show promise of storing XML data objects.
  Existing OODBs may have complex relationships.
Native XML database or content-management systems are designed specifically to store XML.
  Oriented towards the document-oriented XML systems.
Existing database systems must use some type of attachment or filter to deal with XML data.
  Many RDB vendors are building this capability into their products.

Figure A-4. Types of Databases

Notes:
Databases and XML

Relational Databases - Deal with structures in rows and columns. In a simple database model it would be easy to map XML structures to match; a problem occurs when a field in the database is related to another row/column structure. Database vendors have taken many approaches, so the way XML (structured) data is incorporated is unique to each RDBMS vendor.

Object-Oriented Databases - Deal with object relationships rather than the typical row/column approach of RDBs. While the concept of such databases has piqued the interest of database engineers, it has been slow to catch on in actual usage. OODBs are usually chosen when there are complex relationships in the data which would be difficult to support in an RDB. These complex relationships are also likely to be difficult to map to XML's hierarchical structure.

Native XML databases/content management systems - Make storing and retrieving XML very easy, but will not easily support non-XML-oriented applications.


Typical Java-based Database Interchange Solution


[Diagram: a Java application between XML applications and a database back end, with the following steps]

Java Application
  Connect to the database using JDBC.
  Retrieve an SQL result set (text string) containing the XML document.
  Use the XML parser to create an XML DOM from the SQL result set.
  Use the XSLT processor to extract the required element from the XML source tree.

XML Apps
  XML editing application to update the element contents of the result DOM.
  Insert the updated element object into the source DOM.
  Use XSLT to write the DOM to SQL.
  Parse XML docs with SAX.

Database back end
  Use JDBC to update the database record.

Figure A-5. Typical Java-based Database Interchange Solution

Notes:
This example describes how a Java-based application may access an XML document stored in a database table, and then update an element's contents. The main disadvantage of this approach is that XML documents must be extracted then manipulated outside the database by the application (note that this can hinder performance greatly), and then be written back. Note that the above example does not include document validation. In effect, middleware solutions such as XML Extender enable databases to become an XML repository, where many of the above problems are overcome.
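The round trip described in these notes can be sketched in a few lines. The example below is a rough illustration only: it is written in Python rather than Java, uses the standard sqlite3 module as a stand-in for a JDBC connection, and uses xml.etree.ElementTree in place of a DOM parser and XSLT processor. The table name xml_table, its columns, and the function update_element are assumptions made for this sketch.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical table and document; sqlite3 stands in for a JDBC connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xml_table (key TEXT PRIMARY KEY, xml_doc TEXT)")
conn.execute(
    "INSERT INTO xml_table VALUES (?, ?)",
    ("X333",
     "<department><department-name>XML developers</department-name></department>"),
)

def update_element(conn, key, path, new_text):
    """Fetch the stored document, change one element, and write it back."""
    # 1. Retrieve the XML document as a text string.
    (xml_text,) = conn.execute(
        "SELECT xml_doc FROM xml_table WHERE key = ?", (key,)
    ).fetchone()
    # 2. Parse it into an in-memory tree (the DOM step on the visual).
    root = ET.fromstring(xml_text)
    # 3. Locate and update the required element.
    element = root.find(path)
    if element is None:
        raise KeyError(f"no element matches {path!r}")
    element.text = new_text
    # 4. Serialize the whole tree and update the database record.
    conn.execute(
        "UPDATE xml_table SET xml_doc = ? WHERE key = ?",
        (ET.tostring(root, encoding="unicode"), key),
    )
    return root

update_element(conn, "X333", "department-name", "XML architects")
stored = conn.execute(
    "SELECT xml_doc FROM xml_table WHERE key = 'X333'"
).fetchone()[0]
print(stored)
```

Note that the whole document crosses the database boundary twice to change one element, which is the performance concern raised above.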


Challenges Mapping to RDB or OODB


Element and Attribute Mapping
Mapping complex relationships from RDB and OODB into the hierarchical XML structure.
Extracting tabular material from multiple tables into the XML document.
Validation
Character encoding
Conversion from text to data types
Null data
Binary data
Storage of processing instructions and comments
Storing markup

Figure A-6. Challenges Mapping to RDB or OODB

Notes:
Many of these challenges do not apply when you store each XML document in a single column in an RDB. The challenges that still apply are:
  Character encoding
  Validation
  Binary data
  Null data: in an RDB, a null is not the same as 0 (zero) or an empty string. It means the value simply does not exist.
  Storing markup, for example:
    <description> <b>Confusing example:</b> &lt;foo/&gt; </description>
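The "storing markup" example is confusing because the element contains both real markup (<b>) and escaped text that merely looks like markup (&lt;foo/&gt;). The following sketch, which assumes Python's standard xml.etree.ElementTree module and a whitespace-trimmed copy of the example, shows what is lost if the content is flattened to plain text before being stored in a column.

```python
import xml.etree.ElementTree as ET

doc = "<description><b>Confusing example:</b> &lt;foo/&gt;</description>"
root = ET.fromstring(doc)

# Flattening to text drops <b> and turns the escaped text into a literal "<foo/>".
flattened = "".join(root.itertext())

# Serializing the tree keeps the markup and re-escapes the literal text.
round_tripped = ET.tostring(root, encoding="unicode")

# Only real child elements appear as nodes; the escaped <foo/> is not one.
child_tags = [child.tag for child in root]

print(flattened)
print(round_tripped)
print(child_tags)
```

Once the flattened string is stored, nothing records that "<foo/>" was text rather than an element, so the original document cannot be rebuilt from it.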


XML to RDB - Where Should the Data Go?


<department>
  <dept-nbr>X333</dept-nbr>
  <department-name> XML developers </department-name>
  <employee>
    <last>Smith</last>
    <first>John</first>
    <ID>250243</ID>
    <dept-nbr>X333</dept-nbr>
  </employee>
  <employee>
    <last>Adams</last>
    <first>Tom</first>
    <ID>432453</ID>
    <dept-nbr>X333</dept-nbr>
  </employee>
</department>

Department
dept-nbr   department-name
X333       XML developers
Z568       Human Resources
...

Employee
ID       last    first   dept-nbr
250243   Smith   John    X333
432453   Adams   Tom     X333
...

OR

XML_Table
Key    XML_Doc
X333   <department> <dept-nbr>X333</dept-nbr> <department-name>XML developers</department-name> <employee> <last>Smith</last> ...
Z568   <department> <dept-nbr>Z568 ....
Figure A-7. XML to RDB - Where Should the Data Go?

Notes:
This example shows the two major options for storing an XML document in a relational database. In the first, the document is decomposed: two tables are created to store the information, with each parent element becoming a table name and its child elements mapped to columns. Alternatively, all the information for both the department and the employees could be stored in a single table, but this results in many columns with null values. The second option is to store the XML document as a CLOB, without decomposing it into relational tables, and to provide an ID-based lookup.
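Both options can be tried side by side. The sketch below is an illustration under stated assumptions: it uses Python's standard sqlite3 and xml.etree.ElementTree modules, replaces the hyphens in element names with underscores to form column names, and uses a TEXT column in place of a CLOB. None of these choices comes from the course material.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = """<department>
  <dept-nbr>X333</dept-nbr>
  <department-name>XML developers</department-name>
  <employee><last>Smith</last><first>John</first><ID>250243</ID><dept-nbr>X333</dept-nbr></employee>
  <employee><last>Adams</last><first>Tom</first><ID>432453</ID><dept-nbr>X333</dept-nbr></employee>
</department>"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_nbr TEXT PRIMARY KEY, department_name TEXT);
    CREATE TABLE employee (id TEXT PRIMARY KEY, last TEXT, first TEXT, dept_nbr TEXT);
    CREATE TABLE xml_table (key TEXT PRIMARY KEY, xml_doc TEXT);
""")

root = ET.fromstring(DOC)
dept_nbr = root.findtext("dept-nbr")

# Option 1: decompose. Parent elements become tables, child elements become columns.
conn.execute("INSERT INTO department VALUES (?, ?)",
             (dept_nbr, root.findtext("department-name").strip()))
for emp in root.findall("employee"):
    conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)",
                 (emp.findtext("ID"), emp.findtext("last"),
                  emp.findtext("first"), emp.findtext("dept-nbr")))

# Option 2: keep the document whole in one text (CLOB-like) column, keyed by ID.
conn.execute("INSERT INTO xml_table VALUES (?, ?)", (dept_nbr, DOC))

employees = conn.execute("SELECT id, last FROM employee ORDER BY id").fetchall()
print(employees)
```

With option 1 the employees can be queried with ordinary SQL; with option 2 the document comes back exactly as stored, but any element-level search has to parse it first.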


XML to RDB - And If We Have Attributes?


<department deptnbr="X333">
  <department-name> XML developers</department-name>
  <employee ID="250243" deptnbr="X333">
    <last>Smith</last>
    <first>John</first>
    <phone>533-4333</phone>
    <e-mail>[email protected]</e-mail>
  </employee>
  <employee ID="432453" deptnbr="X333">
    <last>Adams</last>
    <first>Tom</first>
    <phone>544-4444</phone>
    <e-mail>[email protected]</e-mail>
  </employee>
</department>

Department
dept-nbr   department-name
X333       XML developers
Z568       Human Resources
...

Employee
ID       last    first   dept-nbr
250243   Smith   John    X333
432453   Adams   Tom     X333
...

OR

XML_Table
Key    XML_Doc
X333   <department deptnbr="X333"> <department-name> XML developers</department-name> <employee ID="250243" deptnbr="X333">...
Z568   <department deptnbr="Z568"> <department-name>...


Figure A-8. XML to RDB - And If We Have Attributes?

Notes:
This example shows the same content as the previous example, but now uses attributes to store some of the information.


Storing XML in an RDB without Mapping


XML documents are typically stored as large text strings.
Data types (actual names can be RDBMS-specific) include:
  VARCHAR (Textual character data)
  CLOB (Single-Byte Character Large Object)
  DBCLOB (Double-Byte Character Large Object)
No distinction between an XML document and traditional SQL data.
No facility for accessing XML elements and attributes.
No validation of XML documents on insert or update.

Figure A-9. Storing XML in an RDB without Mapping

Notes:
RDB with no XML support. When XML documents are stored in a relational database this way, validating them is always a challenge. Retrieval and updates can be resource-consuming because the database lacks XML-aware functions.
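The lack of checking on insert is easy to demonstrate. In the sketch below, which assumes Python's standard sqlite3 module and a hypothetical table named xml_table, the database accepts a document that is not even well formed, and the application has to do the checking itself. The check shown is for well-formedness only; the standard library does not validate against a DTD or schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xml_table (key TEXT PRIMARY KEY, xml_doc TEXT)")

# The database accepts both rows: to it, each value is just a string.
conn.execute("INSERT INTO xml_table VALUES ('good', '<a><b>1</b></a>')")
conn.execute("INSERT INTO xml_table VALUES ('bad', '<a><b>1</a>')")

def is_well_formed(xml_text):
    """Application-side check that the database itself does not perform."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

results = {key: is_well_formed(doc)
           for key, doc in conn.execute("SELECT key, xml_doc FROM xml_table")}
print(results)
```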


Object-Oriented Databases
Object-Oriented Database (OODB) features:
  Persistence of objects.
  Extend semantics of O-O programming languages.
  Unification of data model and database structure.
  Requires less code.
  Ease of code base maintenance.
Relational Database (RDB) comparison:
  Data structures must be flattened to fit joined tables.
  Structures maintained in memory.
  No built-in object management.
OODB real-world applications:
  Risk analysis systems, telecom systems, WWW document structures, design and manufacturing systems, hospital patient record systems with complex data interrelationships.

Figure A-10. Object-Oriented Databases

Notes:
For more information and links to OODB resources, see http://www.objenv.com/cetus/oo_data_bases.html.


XML Native DB/Content-management Systems (1 of 2)


Can be good for document-oriented XML, much less useful for data-oriented XML.
Good for less structured data which could result in many null columns in an RDB.
Preserves physical structure of document.
No need for schema or DTD.
Limited to XML interfaces.
  Do not use for data serving a variety of applications.
Provide very fast retrieval speed for entire documents.
Search for specific views of data likely to be slower than RDB.
Can only return data as XML.

Figure A-11. XML Native DB/Content-management Systems (1 of 2)

Notes:
There is no need to compose or decompose the XML document into database columns. Because the data is not decomposed into columns, a search for a specific view of the data has to go through each document.


XML Native DB/Content-management Systems (2 of 2)


Two main categories:

Text-based storage
  Store the entire document.
  Provide limited DB function against the document.
  Provide an exact roundtrip of the document.

Model-based storage
  Store a DOM representation of the XML document in an existing or custom data store.
  May use an RDB underneath.
  Roundtrip at the level of the underlying model (can maintain order).

Figure A-12. XML Native DB/Content-management Systems (2 of 2)

Notes:
Structure of the document remains unchanged.


What about Character Encoding?


[Diagram: XML documents in different languages and encodings (SBCS, DBCS, UTF-8, Unicode, ASCII, EBCDIC, CCSID) flowing into a database. Convert? Loss of information?]

Figure A-13. What about Character Encoding?

Notes:
One of the main issues when working with XML documents and databases is character encoding. The document may be in one encoding and the database in a different one. Many databases do not support Unicode and require special setup for non-ASCII characters. There is no general way to solve this problem. You must be aware of it and address it on a case-by-case basis.
  DBCS - Double-byte character set (for example, Chinese).
  SBCS - Single-byte character set.
  Unicode - A character coding system designed to support the interchange, processing, and display of the written texts of the diverse languages of the modern world. Unicode characters were originally encoded as 16-bit unsigned integers; encodings such as UTF-8 use a variable number of bytes per character.
  UCS - Universal Multiple-Octet Coded Character Set.
  UTF-8 - UCS Transformation Format, 8-bit.
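The "loss of information" question on the visual can be made concrete. The sketch below uses only Python's built-in string methods; the sample text is an assumption chosen to mix a Latin character with two Chinese characters. It compares a lossless UTF-8 round trip with a forced conversion to ASCII.

```python
# "Müller 北京" written with escapes so the source file itself stays ASCII.
text = "M\u00fcller \u5317\u4eac"

# UTF-8 can represent every character, using one to four bytes each.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# Forcing the text into ASCII replaces every unsupported character with "?".
lossy = text.encode("ascii", errors="replace").decode("ascii")

# A strict conversion fails instead of silently losing data.
try:
    text.encode("ascii")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

print(len(text), len(utf8_bytes), lossy, strict_ok)
```

Whether a conversion fails loudly or substitutes characters quietly depends on the database and driver settings, which is why the issue has to be checked case by case.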


RDB to XML Middleware: DXX


There is help out there
A variety of middleware products exist to help you map between your RDB and XML.
Middleware Example (DB2 with XML Extender):
  XML Extender helps integrate the abilities of DB2 with the flexibility of XML.
  XML data can be combined with traditional relational data.
  XML Extender provides the ability to search XML documents based on XML element or attribute values, in addition to structural text searching.
  Add-on to DB2 (no charge).

Figure A-14. RDB to XML Middleware: DXX

Notes:
DB2 and XML Extender (DXX): DB2 XML Extender provides a range of functionality for managing XML documents using traditional and nontraditional data. The functionality it provides includes facilities for storage, fast searching, validation, and composition/decomposition of XML documents.


Other Uses of XML with Databases


Describe databases:
  Tables
  Columns
  Foreign Keys
  And so forth
Database exchange format.
Format for loading a database.
  Example, different vendors
Represent result sets from database queries and updates for programs and for humans.
Storing XML documents in databases as objects.

Figure A-15. Other Uses of XML with Databases

Notes:
XML documents can be used to describe database schemas, including columns, foreign keys, constraints, and so forth. XML documents can also be used to exchange data between databases from different vendors, because XML is a vendor-independent way to store data. Exchanging data between systems or vendors can be very useful. This could be accomplished, for example, by extracting the schema information and then the data, into one or more XML files. The same procedure can be used to export data and then load it into a database.
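A vendor-independent export of this kind might look like the following sketch. It assumes Python's standard sqlite3 and xml.etree.ElementTree modules; the element names table and row, the function export_table, and the decision to omit an element for a NULL value are all assumptions made for this illustration, not a standard exchange format.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_nbr TEXT PRIMARY KEY, last TEXT, first TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [("250243", "Smith", "John"), ("432453", "Adams", None)])

def export_table(conn, table):
    """Dump one table as XML: one <row> per record, one child element per column."""
    # The table name is interpolated, so only pass trusted identifiers.
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY 1")
    columns = [description[0] for description in cursor.description]
    root = ET.Element("table", name=table)
    for record in cursor:
        row = ET.SubElement(root, "row")
        for column, value in zip(columns, record):
            if value is None:
                continue  # represent SQL NULL by omitting the element
            ET.SubElement(row, column).text = str(value)
    return root

exported = export_table(conn, "employee")
print(ET.tostring(exported, encoding="unicode"))
```

Loading into another database is the reverse walk: read each row element and issue an INSERT with its child values.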


Meta Data Aspect


XML DTDs/Schema are the meta data for an XML document.
The DB schema is the meta data for the database.
Given the meta data for one system, you can create the meta data for the other system.
Decide on mapping standards:
  Column = child element
  Type mapping in DB to restrictions in Schema
  Non-nullable columns to required elements
Probably not optimal schema for the other system, but good starting point or good enough for your use.
Likely to ease mapping between the DB and XML documents.
Design time, not run time, activity.

Figure A-16. Meta Data Aspect

Notes:
Working through the meta data of one system may make it easier to work with the other system.


Database Schemas Example


Database Schema Example

<element name="EMPLOYEE" type="ilscs01:EMPLOYEE">
  <key name="EMPLOYEEPRIMKEY">
    <selector xpath="ilscs01:EMPLOYEE"/>
    <field xpath="EMP_NBR"/>
  </key>
</element>
<complexType name="EMPLOYEE">
  <sequence>
    <element name="EMP_NBR">
      <simpleType>
        <restriction base="string">
          <length value="10"/>
        </restriction>
      </simpleType>
    </element>
    <element name="DEPT_NBR">
      <simpleType>
        <restriction base="string">
          <length value="6"/>
        </restriction>
      </simpleType>
    </element>
    ...
  </sequence>
</complexType>
...

Database Create Table

CREATE TABLE employee (
  emp_nbr  Char(10) NOT NULL PRIMARY KEY,
  dept_nbr Char(6),
  type     Varchar(40),
  last     Varchar(40),
  first    Varchar(40));

Figure A-17. Database Schemas Example

Notes:
This example shows how a database table can be represented by an XML document. In this example we show how database columns and keys can be described using XML. Note that this could be accomplished in different ways.
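As a design-time aid, a schema fragment like the one on the visual can be generated from the table definition. The sketch below is one possible approach, assuming Python's standard sqlite3, re, and xml.etree.ElementTree modules. The function table_to_complex_type and its mapping rules (Char(n) to a length facet, Varchar(n) to maxLength, nullable columns to minOccurs="0") are assumptions for this illustration; the course material does not prescribe them.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employee (
    emp_nbr Char(10) NOT NULL PRIMARY KEY,
    dept_nbr Char(6),
    last Varchar(40))""")

def table_to_complex_type(conn, table):
    """Build a <complexType> whose child elements mirror the table's columns."""
    complex_type = ET.Element("complexType", name=table.upper())
    sequence = ET.SubElement(complex_type, "sequence")
    # PRAGMA table_info yields (cid, name, declared type, notnull, default, pk).
    query = f"PRAGMA table_info({table})"
    for _, name, decl_type, notnull, _, _ in conn.execute(query):
        element = ET.SubElement(sequence, "element", name=name.upper())
        if not notnull:
            element.set("minOccurs", "0")  # nullable column -> optional element
        simple_type = ET.SubElement(element, "simpleType")
        restriction = ET.SubElement(simple_type, "restriction", base="string")
        match = re.fullmatch(r"(var)?char\((\d+)\)", decl_type, re.IGNORECASE)
        if match:
            # Char(n) is fixed length; Varchar(n) only caps the length.
            facet = "maxLength" if match.group(1) else "length"
            ET.SubElement(restriction, facet, value=match.group(2))
    return complex_type

schema_fragment = table_to_complex_type(conn, "employee")
print(ET.tostring(schema_fragment, encoding="unicode"))
```

The output is a starting point rather than a finished schema: keys, namespaces, and non-string types would still need to be added by hand.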


XML Query Language (XQuery) (1 of 2)


Goals
  "The goal of the XML Query WG is to produce a data model for XML documents, a set of query operators on that data model, and a query language based on these query operators."
XQuery consists of six Working Drafts:
  XQuery Requirements
  XQuery Use Cases
  XQuery 1.0 and XPath 2.0 Data Model
  XQuery 1.0 Formal Semantics
  XQuery 1.0: An XML Query Language
  XML Syntax for XQuery 1.0 (XQueryX)

Figure A-18. XML Query Language (XQuery) (1 of 2)

Notes:
The XQuery 1.0 and XPath 2.0 Data Model is currently a Working Draft of the W3C, June 7, 2001; the Algebra and Language specifics have yet to be addressed by the W3C working group. For current information on the Data Model and updates on the other specifications, see http://www.w3.org/XML/Query. The following areas are addressed in the XQuery Requirements specification, a Working Draft (June 2001), of the W3C. For more information on the XQuery Requirements, see http://www.w3.org/TR/xmlquery-req.


XML Query Language (XQuery) (2 of 2)


XQuery Goals/Usage Scenarios
  Human-readable documents
  Data-oriented documents
  Mixed model documents
  Administrative data
  Stream filtering
  DOM queries
  Native XML repositories
  Catalog search
  Multiple syntactic environments

Figure A-19. XML Query Language (XQuery) (2 of 2)

Notes:
Human-readable documents
Perform queries on structured documents and collections of documents, such as technical manuals, to retrieve individual documents, to generate tables of contents, to search for information in structures found within a document, or to generate new documents as the result of a query.

Data-oriented documents
Perform queries on the XML representation of database data, object data, or other traditional data sources to extract data from these sources, to transform data into new XML representations, or to integrate data from multiple heterogeneous data sources. The XML representation of data sources may be either physical or virtual; that is, data may be physically encoded in XML, or an XML representation of the data may be produced.

Mixed model documents
Perform both document-oriented and data-oriented queries on documents with embedded data, such as catalogs, patient health records, employment records, or business analysis documents.

Administrative data
Perform queries on configuration files, user profiles, or administrative logs represented in XML.

Stream filtering
Perform queries on streams of XML data to process the data in a manner analogous to UNIX filters. This might be used to process logs of e-mail messages, network packets, stock market data, newswire feeds, EDI, or weather data to filter and route messages represented in XML, to extract data from XML streams, or to transform data in XML streams.

DOM queries
Perform queries on DOM structures to return sets of nodes that meet the specified criteria.

Native XML repositories
Perform queries on collections of documents managed by native XML repositories or web servers.

Catalog search
Perform queries to search catalogs that describe document servers, document types, XML schemas, or documents. Such catalogs may be combined to support search among multiple servers. A document-retrieval system could use queries to allow the user to select server catalogs, represented in XML, by the information provided by the servers, by access cost, or by authorization. Once a server is selected, a retrieval system could query the kinds of documents found on the server and allow the user to query those documents.

Multiple syntactic environments
Queries may be used in many environments. For example, a query might be embedded in a URL, an XML page, or a JSP or ASP page; represented by a string in a program written in a general-purpose programming language; provided as an argument on the command-line or standard input; or supported by a protocol, such as DASL or Z39.50.
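The "DOM queries" scenario, returning the set of nodes that meet a criterion, can be approximated today with path expressions. The sketch below does not use XQuery: it assumes Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath, and a sample document modelled on the department example from earlier in this appendix.

```python
import xml.etree.ElementTree as ET

DOC = """<department deptnbr="X333">
  <employee ID="250243"><last>Smith</last><first>John</first></employee>
  <employee ID="432453"><last>Adams</last><first>Tom</first></employee>
</department>"""

root = ET.fromstring(DOC)

# Return the set of nodes meeting a criterion: employees whose <last> is Adams.
matches = root.findall("employee[last='Adams']")
matched_ids = [employee.get("ID") for employee in matches]

# Derive a new, sorted result from the document, as a query language would.
last_names = sorted(employee.findtext("last") for employee in root.iter("employee"))

print(matched_ids, last_names)
```

A full query language adds what this subset lacks, such as joins across documents, sorting, and construction of new XML in the query itself.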


More Information
Reference: http://www-106.ibm.com/developerworks/xml/library/x-matters8/index.html
Description: Putting XML in context with hierarchical, relational, and object-oriented models by David Mertz

Reference: http://www.rpbourret.com/xml/XMLAndDatabases.htm#intro
Description: XML and Databases by Ronald Bourret

Reference: http://www.rpbourret.com/xml/XMLDatabaseProds.htm#xmlservers
Description: XML Database products by Ronald Bourret

Reference: http://www-106.ibm.com/developerworks/library/x-struct/
Description: XML Structures for Existing Databases by Kevin Williams and others

Reference: http://www.xml.com/pub/a/2001/05/09/dtdtodbs.html
Description: Mapping DTDs to Databases by Ronald Bourret

Figure A-20. More Information

Checkpoint Questions (1 of 2)
1. How can an XML document be stored in an RDB? (select all that apply):
   A. In a Table column (CLOB)
   B. SGML
   C. Decomposed into different columns/tables
   D. Into a DTD file
   E. Compressed into an integer column
2. While RDBs are row-based, XML documents are:
   A. Record based
   B. Hierarchical
   C. Obsolete
   D. Rectangular

Figure A-21. Checkpoint Questions (1 of 2)

Checkpoint Questions (2 of 2)
3. I should use an RDB to store my XML if: (select all that apply)
   A. I have lots of proprietary file formats
   B. I need to retrieve a large number of documents based on a specific element
   C. I need to exchange data with a business partner
   D. I need to represent my data in Esperanto

Figure A-22. Checkpoint Questions (2 of 2)

Unit Summary
In this unit we learned:
  How to compare relational database structures to XML document structures.
  The limitations of relational data tables with structured data.
  What Object-Oriented databases provide.
  The status of XML-based queries.

Figure A-23. Unit Summary

Appendix B. Additional Information for XML Schema


"Table 2. Simple Types Built In to XML Schema"*


Simple Type        Examples (delimited by commas)   Notes
string             Confirm this is electric
normalizedString   Confirm this is electric         see (3)
token              Confirm this is electric         see (4)
byte               -1, 126                          see (2)
unsignedByte       0, 126                           see (2)
base64Binary       GpM7
hexBinary          0FB7
integer            -126789, -1, 0, 1, 126789        see (2)
...

Notes:
(1) To retain compatibility between XML Schema and XML 1.0 DTDs, the simple types ID, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, NMTOKEN, NMTOKENS should only be used in attributes.
(2) A value of this type can be represented by more than one lexical format, e.g. 100 and 1.0E2 are both valid float formats representing "one hundred". However, rules have been established for this type that define a canonical lexical format, see XML Schema Part 2.
(3) Newline, tab and carriage-return characters in a normalizedString type are converted to space characters before schema processing.
(4) As normalizedString, and adjacent space characters are collapsed to a single space character, and leading and trailing spaces are removed.
(5) The "g" prefix signals time periods in the Gregorian calendar.

*Excerpt from table of the same name in the 20010502 Primer.


Figure B-1. "Table 2. Simple Types Built In to XML Schema"*

Notes:
Always refer to the current release of the Specification and associated Primer(s), if any, for normative use of schema components.


Pattern Facet
<schema xmlns='http://www.w3.org/2001/XMLSchema'>
  <simpleType name='DNAType'>
    <restriction base='string'>
      <pattern value='(A|G|T|C)+'/>
    </restriction>
  </simpleType>
  <element name='DNA' type='DNAType'/>
</schema>

Valid XML:
  <DNA>AGCTATATACGGTAACGTA</DNA>
Invalid XML:
  <DNA>AGTBCTGAEC</DNA>
  <DNA>10100011011</DNA>

Figure B-2. Pattern Facet

Notes:
The pattern facet allows creation of a restriction of the string simple type by the specification of a regular expression. A regular expression specifies a set of strings using a pattern. Only strings that match the regular expression are valid instances of that data type. The value attribute of the <pattern> facet tag holds the regular expression. The regular expression syntax used by XML Schema is based on Perl regular expressions, but is not identical; the syntax has been extended to cope with Unicode characters and expressions on Unicode strings. In this example we're defining the new type 'DNAType' as a restriction of the string type. We're using the pattern facet as the constraint here. The value attribute of the <pattern> element is the regular expression (A|G|T|C)+. The meaning of this regular expression is "One or more occurrences (denoted by the + at the end) of A or (denoted by the vertical bar '|') G or T or C". There is a detailed explanation of the Schema regular expression language in Part 2 of the XML Schema specification at http://www.w3.org/TR/xmlschema-2.
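The effect of the pattern can be checked outside a schema processor. The sketch below assumes Python's standard re module, whose syntax agrees with XML Schema for this simple expression but not in general. Because a schema pattern must match the entire value, fullmatch is used rather than search. The function name is_valid_dna is an assumption for this example.

```python
import re

# XML Schema patterns must match the whole value, hence fullmatch below.
DNA_PATTERN = re.compile(r"(A|G|T|C)+")

def is_valid_dna(value):
    """Return True if the whole value consists of one or more of A, G, T, C."""
    return DNA_PATTERN.fullmatch(value) is not None

samples = ["AGCTATATACGGTAACGTA", "AGTBCTGAEC", "10100011011"]
validity = [is_valid_dna(sample) for sample in samples]
print(validity)
```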


List Simple Type


Allows definition of a simple type as a sequence of values from a single simple type.
  Because whitespace separates the items, an individual item value cannot itself contain whitespace.
The list is an aggregate of simple type values.
The elements of a list are separated by whitespace.
Built in list types (needed for XML 1.0):
  NMTOKENS
  IDREFS
  ENTITIES

Figure B-3. List Simple Type

Notes:
All the simple types we've seen so far are 'scalar' types. That is, they are not decomposable into any smaller units. The next set of simple types we will look at are the aggregate data types, which can be broken down into smaller units. The first of these types is the list type. The list type allows us to define a simple type that contains a list of values. These values must be drawn from a single simple type. The values in the list are separated at whitespace boundaries (that is, wherever there is whitespace), so an individual value cannot itself contain whitespace.

Built in list types
XML Schema contains a number of built in list types. All of these list types correspond to lists that were found in XML 1.0 DTDs. The full list of the built in list types is NMTOKENS, IDREFS, ENTITIES.


List Simple Type Example


We will use the DNAType from a previous example to declare a new element called 'DNASamples', which is a list of type 'DNAType'.

Declaration
<schema xmlns='http://www.w3.org/2001/XMLSchema'>
  <element name='DNASamples'>
    <simpleType>
      <list itemType='DNAType'/>
    </simpleType>
  </element>
</schema>

Valid XML:
  <DNASamples>AGCTA TATAC GGTAA CGTA</DNASamples>
Invalid XML:
  <DNASamples>AGCTA:TATC:GATTA</DNASamples>
  <DNASamples>AGCTA,TATC,GATTA</DNASamples>

Figure B-4. List Simple Type Example

Notes:
Here is a schema that shows how to declare a list type. We're going to use the 'DNAType' simpleType from the earlier visual. This schema declares a new element called 'DNASamples' which is a list of type 'DNAType'. The elements of a list are separated by whitespace. Note that this simpleType definition is not a restriction of any other base type. The invalid samples are invalid because they use something other than whitespace to separate list elements.
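The way a list value is checked, split at whitespace and then test each item against the item type, can be sketched as follows. This assumes Python's standard re module and a hypothetical helper named is_valid_dna_list; it mimics the rule and is not a schema validator.

```python
import re

# Item type: the DNAType pattern from the earlier visual.
DNA_PATTERN = re.compile(r"(A|G|T|C)+")

def is_valid_dna_list(value):
    """Split on whitespace, then check every item against the item type."""
    items = value.split()
    return all(DNA_PATTERN.fullmatch(item) is not None for item in items)

print(is_valid_dna_list("AGCTA TATAC GGTAA CGTA"))  # whitespace-separated items
print(is_valid_dna_list("AGCTA:TATC:GATTA"))        # one item containing ':'
print(is_valid_dna_list("AGCTA,TATC,GATTA"))        # one item containing ','
```

The two invalid samples fail because, with no whitespace to split on, each is treated as a single item that contains a character outside the pattern.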


Union Simple Type (1 of 2)


Allows you to create a new simple type whose instances must match the rules for one of the member simple types which is specified in the union.
Example: Union type creates an interval with a "hole"

Schema
<schema xmlns='http://www.w3.org/2001/XMLSchema'>
  <simpleType name='biggerThanFive'>
    <restriction base='integer'>
      <minExclusive value='5'/>
    </restriction>
  </simpleType>
  <simpleType name='lessThanZero'>
    <restriction base='integer'>
      <maxExclusive value='0'/>
    </restriction>
  </simpleType>
  <element name='interval'>
    <simpleType>
      <union memberTypes='biggerThanFive lessThanZero'/>
    </simpleType>
  </element>
</schema>

Figure B-5. Union Simple Type (1 of 2)

Notes:
The union simple type doesn't correspond to any type in XML 1.0 DTDs. The union type allows you to create a new simple type whose instances must match the rules for one of the member simple types specified in the union. The union type in this example creates an interval with a "hole". We first define a new restriction of the integer type, which contains all the integers greater than five. Notice the use of the minExclusive facet to exclude the integer 5 from the value space of the new type. We then define another restriction of integer, which consists of all the integers less than zero. Note again the use of maxExclusive to exclude 0 from the value space of the new type. We can now create a union type which includes both 'biggerThanFive' and 'lessThanZero', leaving a hole in the integers consisting of 0, 1, 2, 3, 4, and 5. Union type definitions are not confined to restrictions of the same base type. We could also have included a restriction of string in the union type, if we so desired.


Union Simple Type (2 of 2)


Valid XML:
<interval>-1</interval>
<interval>8</interval>

Invalid XML:
<interval>0</interval>
<interval>4</interval>

Figure B-6. Union Simple Type (2 of 2)

Notes:
This visual shows valid and invalid values for the resulting union type.
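The membership rule of the 'interval' union can be sketched in Python as "a value is accepted if at least one member type accepts it". This is an illustrative sketch only, and the function names are hypothetical. Python's int() is slightly more lenient than the schema integer type (for example, it accepts underscores between digits), so the sketch is not a substitute for a validator.

```python
def is_bigger_than_five(value):
    """Mirror of the 'biggerThanFive' member type (minExclusive value='5')."""
    return value > 5


def is_less_than_zero(value):
    """Mirror of the 'lessThanZero' member type (maxExclusive value='0')."""
    return value < 0


def is_valid_interval(text):
    """Accept the text if it is an integer that matches any member type."""
    try:
        value = int(text.strip())
    except ValueError:
        # Not an integer at all, so no member type can accept it.
        return False
    return is_bigger_than_five(value) or is_less_than_zero(value)
```

With this sketch, "-1" and "8" are accepted, while "0", "4", and "5" fall into the hole and are rejected.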


Attribute Groups
Definition and use of an attribute group
<schema xmlns='http://www.w3.org/2001/XMLSchema'>
  <attributeGroup name='range'>
    <attribute name='minimum' type='integer' use='required'/>
    <attribute name='maximum' type='integer' use='required'/>
  </attributeGroup>
  <element name='gauge'>
    <complexType>
      <sequence>
        <element name='label' type='string'/>
      </sequence>
      <attributeGroup ref='range'/>
    </complexType>
  </element>
</schema>

Valid XML:
<gauge minimum='0' maximum='90'>
  <label>Speed</label>
</gauge>

Invalid XML:
<gauge minimum='0'>
  <label>Pressure</label>
</gauge>

Figure B-7. Attribute Groups

Notes:
The attribute group component is relatively straightforward. The start and end tags for attributeGroup bracket the set of attribute declarations that make up the group. The start tag provides a name attribute for naming the attribute group. To actually use the attribute group, a complex type definition includes an attributeGroup element with a ref attribute that names the attribute group to be used.
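The effect of referencing the 'range' group with both attributes marked use='required' can be sketched in Python as a check for missing attributes. This is only an illustration of the rule using the standard library parser, not schema validation, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The two attributes that the 'range' attribute group declares as required.
REQUIRED_RANGE_ATTRIBUTES = ("minimum", "maximum")


def missing_range_attributes(xml_text):
    """Return the names of required 'range' attributes absent from the root."""
    element = ET.fromstring(xml_text)
    return [name for name in REQUIRED_RANGE_ATTRIBUTES
            if name not in element.attrib]
```

A gauge element that carries both attributes yields an empty list, while one that carries only minimum yields ['maximum'], which is why such an instance is invalid.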


Annotation
Can be applied to elements, attributes, groups, attribute groups, simple types, complex types, wildcards.

<schema xmlns='http://www.w3.org/2001/XMLSchema'
        xmlns:ibmls='http://www.ibm.com/LS/types'>
  <element name='widget' type='ibmls:widgetType'>
    <annotation>
      <documentation>
        This element is directly serializable into a Java class.
      </documentation>
      <appinfo>
        <java-serializer>
          com.ibm.ils.WD03.WidgetSerializer
        </java-serializer>
      </appinfo>
    </annotation>
  </element>
</schema>

Figure B-8. Annotation

Notes:
XML Schema provides the <annotation> element for adding information about the schema components in an XML Schema document. The <annotation> element can have two kinds of child elements. The <documentation> element allows the author of a schema document to add human-readable documentation to the component. The <appinfo> element allows annotations that are directed at computer programs that may process the schema. This may be a schema validator, or it may be another program. In the example on this visual, we are annotating an element declaration and providing some human-readable documentation. We also provide some information to a program that can take the schema information and use it to serialize and deserialize the element as a Java class. Here our <appinfo> provides the name of a Java class that can do the serialization. This <appinfo> information can be processed by an application that takes the schema file and looks for the <appinfo> element associated with a particular component. The <annotation> element can be applied to elements, attributes, groups, attribute groups, simple types, complex types, and wildcards, and is typically the first element to appear in the definition of one of these components.
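As a rough sketch of how an application might look up the <appinfo> for a particular component, the following Python code reads a schema like the one on the visual and returns the text inside <appinfo> for a named element declaration. The function name appinfo_text and the embedded SCHEMA_TEXT are illustrative assumptions; a real tool would more likely use a schema-aware API.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Sample schema text modeled on the visual (assumed here for illustration).
SCHEMA_TEXT = """<schema xmlns='http://www.w3.org/2001/XMLSchema'
        xmlns:ibmls='http://www.ibm.com/LS/types'>
  <element name='widget' type='ibmls:widgetType'>
    <annotation>
      <documentation>
        This element is directly serializable into a Java class.
      </documentation>
      <appinfo>
        <java-serializer>
          com.ibm.ils.WD03.WidgetSerializer
        </java-serializer>
      </appinfo>
    </annotation>
  </element>
</schema>"""


def appinfo_text(schema_text, element_name):
    """Return the trimmed text inside <appinfo> for a global element, or None."""
    root = ET.fromstring(schema_text)
    ns = {"xs": XSD_NS}
    # Global element declarations are direct children of <schema>.
    for declaration in root.findall("xs:element", ns):
        if declaration.get("name") == element_name:
            appinfo = declaration.find("xs:annotation/xs:appinfo", ns)
            if appinfo is None:
                return None
            # Join all nested text and trim the surrounding whitespace.
            return "".join(appinfo.itertext()).strip()
    return None
```

Because the schema uses the XML Schema namespace as its default namespace, the child of <appinfo> is matched through its text content rather than by a namespace-qualified name.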

Terms (1 of 6)
abstract - must be implemented.
anySimpleType - A conceptual datatype; the simple version of the ur-type definition from XML Schema Part 1: Structures. anySimpleType can be considered as the base type of all primitive types. The value space of anySimpleType can be considered to be the union of the value spaces of all primitive datatypes.
anyType - see anySimpleType above and ur-type definition.
assessment - Used to refer to the overall process of local validation, schema-validity assessment and infoset augmentation.
attributeFormDefault
atomic datatypes - Datatypes having values that are regarded as being indivisible.
base type - Every datatype that is derived by restriction is defined in terms of an existing datatype, referred to as its base type. Base types can be either primitive or derived.
base type definition - A type definition used as the basis for an extension or restriction is known as the base type definition of that definition.
blockDefault
complexContent - contains only elements
complexType element - These may allow elements; they may carry attributes.
constraining facet - An optional property that can be applied to a datatype to constrain its value space. AKA non-fundamental facet.

Figure B-9. Terms (1 of 6)

Notes:
See the specifications for additional information. This material is provided as an aid only. AKA is also-known-as.


Terms (2 of 6)
datatype - A 3-tuple, consisting of a) a set of distinct values, called its value space, b) a set of lexical representations, called its lexical space, and c) a set of facets that characterize properties of the value space, individual values or lexical items.
declarations - Enable elements and attributes to appear in document instances; both simple and complex.
definitions - Create new types; both simple and complex.
derived datatypes - Those that are defined in terms of other datatypes.
derived by list - A list datatype can be derived from another datatype (its itemType) by creating a value space that consists of a finite-length sequence of values of its itemType.
derived by restriction - A datatype is said to be derived by restriction from another datatype when values for zero or more constraining facets are specified that serve to constrain its value space and/or its lexical space to a subset of those of its base type.
derived by union - One datatype can be derived from one or more datatypes by unioning their value spaces and, consequently, their lexical spaces.
elementFormDefault
element substitution groups
extension - A complex type definition which allows element or attribute content in addition to that allowed by another specified type definition is said to be an extension.

Figure B-10. Terms (2 of 6)

Notes:


Terms (3 of 6)
facet - A single defining aspect of a value space. Generally speaking, each facet characterizes a value space along independent axes or dimensions. Facets are of two types: fundamental facets that define the datatype and non-fundamental or constraining facets that constrain the permitted values of a datatype.
finalDefault
fundamental facet - An abstract property which serves to semantically characterize the values in a value space.
global element - An element that is a subelement of the schema/document/main/root element only; that is, one of the elements whose scope is immediately below that of schema itself.
infoset - See the infoset specification at: http://www.w3.org/TR/2001/WD-xml-infoset-20010316.
itemType - The atomic datatype that participates in the definition of a list datatype is known as the itemType of that list datatype.
lexical space - The set of valid literals for a datatype.
list datatypes - Datatypes having values each of which consists of a finite-length (possibly empty) sequence of values of an atomic datatype. A list datatype can be derived from another datatype (its itemType) by creating a value space that consists of a finite-length sequence of values of its itemType.
NCName - Represents XML "non-colonized" names (names containing no ':').

Figure B-11. Terms (3 of 6)

Notes:
See the specifications for additional information. This material is provided as an aid only.


Terms (4 of 6)
NMTOKEN - NMTOKEN represents the NMTOKEN attribute type from XML 1.0 (Second Edition). The value space of NMTOKEN is the set of tokens that match the Nmtoken production in XML 1.0 (Second Edition). The lexical space of NMTOKEN is the set of strings that match the Nmtoken production in XML 1.0 (Second Edition). The base type of NMTOKEN is token.
non-fundamental facet - see constraining facet.
normalized value (of an element or attribute information item) - an initial value whose white space, if any, has been normalized according to the value of the whiteSpace facet of the simple type definition used in its validation:
  preserve - No normalization is done; the value is the normalized value.
  replace - All occurrences of #x9 (tab), #xA (line feed) and #xD (carriage return) are replaced with #x20 (space).
  collapse - Subsequent to the replacements specified above under replace, contiguous sequences of #x20s are collapsed to a single #x20, and initial and/or final #x20s are deleted.
particle - element declaration, wildcard or model group. There are three varieties of model group:
  Sequence (the element information items match the particles in sequential order)
  Conjunction (the element information items match the particles, in any order)
  Disjunction (the element information items match one of the particles)
primitive datatypes - Those that are not defined in terms of other datatypes; they exist ab initio.
QName - Represents XML qualified names. The value space of QName is the set of tuples {namespace name, local part}, where namespace name is an anyURI and local part is an NCName. The lexical space of QName is the set of strings that match the QName production of [Namespaces in XML].
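The three whiteSpace facet values described under "normalized value" can be sketched as a small Python function. This is a minimal illustration of the rules as stated in the glossary; the function name normalize_whitespace is hypothetical.

```python
import re


def normalize_whitespace(value, mode):
    """Apply the whiteSpace facet rules: 'preserve', 'replace' or 'collapse'."""
    if mode == "preserve":
        # No normalization is done.
        return value
    if mode not in ("replace", "collapse"):
        raise ValueError(f"unknown whiteSpace mode: {mode!r}")
    # replace: tab (#x9), line feed (#xA) and carriage return (#xD)
    # each become a single space (#x20).
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if mode == "replace":
        return replaced
    # collapse: after the replacement step, runs of spaces become one space
    # and leading and trailing spaces are removed.
    return re.sub(r" +", " ", replaced).strip(" ")
```

For example, collapsing the string "  a<tab>b<newline><newline> c " gives "a b c", whereas replace keeps the same length and only turns the tab and newlines into spaces.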

Figure B-12. Terms (4 of 6)

Notes:


Terms (5 of 6)
restriction - A type definition whose declarations or facets are in a one-to-one relation with those of another specified type definition, with each in turn restricting the possibilities of the one it corresponds to, is said to be a restriction. The specific restrictions might include narrowed ranges or reduced alternatives. Members of a type, A, whose definition is a restriction of the definition of another type, B, are always members of type B as well.
schema component - This is the generic term for the building blocks that comprise the abstract data model of the schema.
schemata -
simple types - do not allow elements; may not carry attributes; e.g., a built-in type.
simpleType element - The XML representation for a Simple Type Definition schema component is a <simpleType> element information item.
TOKEN - Represents tokenized strings. The value space of token is the set of strings that do not contain the line feed (#xA) nor tab (#x9) characters, that have no leading or trailing spaces (#x20) and that have no internal sequences of two or more spaces. The lexical space of token is the set of strings that do not contain the line feed (#xA) nor tab (#x9) characters, that have no leading or trailing spaces (#x20) and that have no internal sequences of two or more spaces. The base type of token is normalizedString.
Type Definition Hierarchy - Except for a distinguished ur-type definition, every type definition is, by construction, either a restriction or an extension of some other type definition. The graph of these relationships forms a tree known as the Type Definition Hierarchy.
union datatypes - Datatypes whose value spaces and lexical spaces are the union of the value spaces and lexical spaces of one or more other datatypes.

Figure B-13. Terms (5 of 6)

Notes:
See the specifications for additional information. This material is provided as an aid only.


Terms (6 of 6)
ur-type definition - A distinguished ur-type definition is present in each XML Schema, serving as the root of the type definition hierarchy for that schema. The ur-type definition, whose name is anyType, has the unique characteristic that it can function as a complex or a simple type definition, according to context. Specifically, restrictions of the ur-type definition can themselves be either simple or complex type definitions.
validation - The word valid and its derivatives are used to refer to determining local schema-validity, that is, whether an element or attribute information item satisfies the constraints embodied in the relevant components of an XML Schema.
value space - The set of values for a given datatype. Each value in the value space of a datatype is denoted by one or more literals in its lexical space.
XML Schema - A set of schema components.

Figure B-14. Terms (6 of 6)

Notes:
See the specifications for additional information. This material is provided as an aid only.


Appendix C. What's New in WebSphere Studio V5.1.1


Unit Objectives
Discuss installation and migration
Describe new features
  JDK 1.4.1
  J2EE improvements
  Web Services changes
List improvements for WebSphere Studio v5.1.1

Figure C-1. Unit Objectives

Notes:
In this presentation we will look at the main themes of WebSphere Studio v5.1.1. Alongside these themes, there are also a number of new features spread throughout the tool. We will describe these new features briefly.


Install
If WebSphere Studio v5.1 is installed, v5.1.1 will automatically be installed over the existing installation.
The install will remove the v5.1 installation before installing v5.1.1.
Corresponding third-party plug-ins for v5.1.1 will need to be installed.
Upgrade to Remote Agent Controller v5.1.1 recommended.

Figure C-2. Install

Notes:
If WebSphere Studio Application Developer Version 5.1 is detected by the installation program, Version 5.1.1 will automatically be installed over Version 5.1. The install will remove the 5.1 installation before installing 5.1.1. This is transparent to the user during install.
5.1.1 has the same registry keys as 5.1
5.1.1 replaces 5.1 in Add/Remove Programs
5.1 no longer exists on your system after installing 5.1.1


Install - Software Requirements


Windows(R) XP Professional with Service Pack 1
Windows 2000 Professional with Service Pack 2 or higher
Windows NT(R) Workstation or Server Version 4.0 with Service Pack 6a or higher
Red Hat, Version 7.2
Red Hat, Version 8.0
SuSE, Version 7.2
SuSE, Version 8.1

Figure C-3. Install - Software Requirements

Notes:
One of the operating systems listed on this page must be installed before you install WebSphere Studio Application Developer v5.1.1. You will need a Web browser to view the readme files, the Installation Guide, and the Migration Guide. For information about supported database servers, Web application servers, and other software products, see the readme file located in the root of both the installation CD and the product installation directory.


Migration
Migration Guide is included
With Studio Application Developer Version 5.1.1, you can migrate code from 5.1 automatically
Migrate code manually
VisualAge for Java
WebSphere Studio "Classic"
WebSphere Studio Application Developer Version 4.0.x
WebSphere Studio Application Developer Version 5 Beta, Early Availability, or General Availability
WebSphere Studio Application Developer Version 5.0.1
WebSphere Studio Application Developer Version 5.1

Figure C-4. Migration

Notes:
If you install Application Developer in the default location the migration guide can be located here: C:\Program Files\IBM\WebSphere Studio\Application Developer\v5.1.1\migrate.html. WebSphere Studio Application Developer Version 5.1 can coexist with WebSphere Studio Application Developer Version 5.0.x or earlier. For coexistence, you can install into a different directory. WebSphere Studio Application Developer can coexist with other WebSphere Studio products. For instructions on safely migrating your existing projects from a previous version of WebSphere Studio Application Developer to Version 5.1.1, refer to the Migration Guide. As a precaution, it is recommended that you make a backup copy of your old workspaces prior to migrating to WebSphere Studio Application Developer Version 5.1.1.


What's New in WebSphere Studio

WebSphere v5.1 Test Environment, others optional
JavaServer Faces support
Struts tags in the Page Designer palette
Application Templates Wizard
J2EE Tools improvements
New Page Designer functions
New Web Site Designer functions
New JDBC error log
Support for JDK 1.4.1
Performance improvements

Figure C-5. What's New in WebSphere Studio

Notes:
New to v5.1.1 is the corresponding WebSphere v5.1 Test Environment. While the version numbers may not match, WebSphere Studio v5.1.1 is the primary development tool for WebSphere Application Server v5.1 and includes specific wizards, editors, and tools to build applications which can utilize the latest support in WebSphere Application Server v5.1. The optional Universal Test Environment provides added flexibility for users at install time who wish to deploy to particular servers. One of the largest enhancements in WebSphere Studio v5.1.1 is the support for JavaServer Faces (JSF). JSF is currently in beta, and is expected to be final in 1Q2004. An added enhancement to the Struts tools is that the Struts tags are now part of the Page Designer palette. For the Application Templates Wizard, new code generators targeting new platforms are now available. They generate the following applications: a standard Struts 1.1 application, an Edit mode of a portal application to be tested and deployed on WPS, and an XHTML-MP based application to target pervasive devices.

A filter can be specified in the database connection dialog, allowing users who have a database on an iSeries machine to connect to it. The Page Designer has two new functions: support for WBMP file rendering, and a Page Template File Creation Wizard. The Web Site Designer has two new functions: you can now use the Object Palette to insert Web Site Designer objects, and you can now specify a Servlet URL. For the new JDBC error log, when a catalog import is launched from the DB Servers pane in the Data Perspective, an error dialog may be displayed upon completion. Though Application Developer 5.1.1 runs on JDK 1.3.1, it has support for JDK 1.4.1.

WebSphere Studio Application Developer 5.1.1 has inherited updates from WebSphere Studio Workbench, which consist mostly of bug fixes. WebSphere Studio Workbench 2.1.2 is a maintenance release to fix serious defects present in releases 2.1.0 and 2.1.1. These changes only affect some plug-ins and features. Modified plug-ins have version id 2.1.2; plug-ins unchanged since the 2.1 release still have version id 2.1.0; plug-ins unchanged since the 2.1.1 release still have version id 2.1.1. Note, however, that all features now have version id 2.1.2 (even if none of their plug-ins changed). Maintenance release 2.1.2 includes all fixes made in 2.1.1. For a list of bug fixes go to: http://www.eclipse.org/eclipse/development/readme_eclipse_2_1_2.html#DefectsFixed

The last main theme of WebSphere Studio v5.1 does not involve new wizards or tools, but performance improvements. Performance has been improved greatly in many use cases and scenarios. Installation and uninstallation have been improved, as have startup and shutdown times. Many of the wizards, tools, and editors have been improved and will be noticeably faster to the end user.


Optional Runtime Environments

Figure C-6. Optional Runtime Environments

Notes:
The 5.1.1 release supports the following WebSphere Application Server Universal Test Environments (WTEs):
WAS v4 WTE - v4.0.7 (JDK 1.3.1)
WAS v5.0 WTE - v5.0.2 (JDK 1.3.1)
WAS v5.1 WTE - v5.1 (JDK 1.4.1)
WAS Express WTE - 5.0.2
WAS Express WTE - 5.1

In 5.1.1 all five WTEs (covering versions 4.0.7, 5.0.2, and 5.1) are optional, therefore the JDKs and relevant WAS jars are available at compile time. The user decides which WTEs to install on their machine, and which versions, if any, to test only via remote deploy. The reason for making all of the WTEs optional is that we expect that soon after the WebSphere Studio Application Developer 5.1.1 release, most customers will be building/deploying WAS v5 apps, with only a few building JDK 1.4.1/WAS 5.1 apps. However, through the lifetime of WebSphere 5.1 and the WebSphere Studio Application Developer release (at least three years), the mix will change.

The implication of the WAS 5.0.2 and WAS 5.1 WTEs being optional is that we need to provide the JDK and WAS jars needed at compile time in the Application Developer 5.1.1 install. Each runtime will have a stub directory which will contain the runtime jars needed to compile against, as well as the server configuration used for remote support. If a test environment is available, we'll use that for the server target (build path) and configuration instead, so these directories are really just used as a backup. The new version of the WAS WTE (v5.1) includes J9 support for hot code replace and full-speed debug.


Java Server Faces (1 of 2)


JSF and WDO tooling is in beta state in 5.1.1
The components are labeled beta in the UI on the palette
GUI framework for developing J2EE applications
Basis for Rapid Application Development
Enable more developers to build dynamic pages
Requires reduced set of Java skills
Many built-in functions

Figure C-7. Java Server Faces (1 of 2)

Notes:
Java Server Faces is an emerging standard (JSR 127) that provides a GUI framework for developing J2EE applications. This new technology is the basis for our RAD experience in Application Developer. Application Developer contains a set of JSF components to improve usability and to enable a low end developer such as an HTML coder, JavaScripter, or Lotus Notes developer to build dynamic JSPs with minimal coding and a reduced level of Java language skills. JSF provides many built-in functions such as input validation, switching to an alternative markup renderer, maintaining client session state, error handling, and event handling. JSF and WDO tooling is in beta state in 5.1.1 due to the fact that the JSR 127 is not yet finalized. For that reason we will not support customer apps built with 5.1.1 JSF or WDO components in later versions. The components are labeled beta in the UI on the palette.


Java Server Faces (2 of 2)


What are JavaServer Faces?
JavaServer Faces technology is a framework for building user interfaces for web applications.
JavaServer Faces technology includes:
  A set of APIs for: representing UI components and managing their state, handling events and input validation, defining page navigation, and supporting internationalization and accessibility.
  A JavaServer Pages (JSP) custom tag library for expressing a JavaServer Faces interface within a JSP page.

Figure C-8. Java Server Faces (2 of 2)

Notes:
With the simple, well-defined programming model that JavaServer Faces technology provides, developers of varying skill levels can quickly and easily build Web applications by: assembling reusable UI components in a page, connecting these components to an application data source, and wiring client-generated events to server-side event handlers. With the power of JavaServer Faces technology, these web applications handle all of the complexity of managing the user interface on the server, allowing the application developer to focus on their application code.


J2EE Tools Improvements


EJB Client JAR support
Server Targeting for J2EE projects
EJB snippet support
EJB Reference wizard support
The EJB Bottom Up mapping scenario can run even when enterprise beans already exist

Figure C-9. J2EE Tools Improvements

Notes:
EJB Client JAR support.
Server Targeting for J2EE projects allows a project to get the JDK JARs and all public JARs from the server on its build path.
EJB snippet support has been added to aid in the generation of EJB client access code. Snippets are useful because they provide code with variable replacements. The variables each have descriptions and default values. There will initially be two EJB snippets: "Call an EJB create method" and "Call an EJB find method".
EJB Reference wizard support has been improved to allow for the creation of cross-EAR EJB references. This enhancement also relates to the EJB Client JAR creation mechanism.
The EJB Bottom Up mapping scenario has been improved to allow the bottom-up tooling to run when enterprise beans already exist.


Minor Improvements/Enhancements
New Web Site Designer functions
  You can now use the Object Palette to insert Web Site Designer objects
  You can now specify a Servlet URL
Struts tags in the Page Designer palette
  For Struts tools, the Struts tags are now part of the Page Designer palette.
New JDBC error log
  When a catalog import is launched from the DB Servers pane in the Data Perspective, the following error dialog may be displayed upon completion

Figure C-10. Minor Improvements/Enhancements

Notes:
The Web Site Designer has two new functions:
  You can now use the Object Palette to insert Web Site Designer objects
  You can now specify a Servlet URL
Struts tags in the Page Designer palette: for Struts tools, the Struts tags are now part of the Page Designer palette.
New JDBC error log: when a catalog import is launched from the DB Servers pane in the Data Perspective, the following error dialog may be displayed upon completion:
  Problems encountered while importing from catalog. Reason: One or more problems were reported while accessing the database.


Web Services
New Seoul Discovery Dialog
WSDK v5.1 - Command line tools have a refresh from WSAD 5.1
WSDL2Client command enables you to build a client program from WSDL (all you need is the location of the WSDL file)
The WSDL can be local or on the Web
WS-I: in 5.1 you could set the preference for the whole workspace; in 5.1.1 you can set the preference at the project level
Default is warning
Update in Deployment Descriptor Model See Editor

Copyright IBM Corporation 2004

Figure C-11. Web Services

XM3014.1

Notes:
The main enhancement for the Web Service tooling is the Seoul Discovery Dialog, which can be used on a Java Server Page.
The WSDL2Client tool generates fully deployable Web service clients from one or more WSDL documents and optionally deploys them to the application server. To use this tool you need a WSDL file. The fully qualified path of the file cannot contain a space, or the compile script will not run properly.
The WS-I Basic Profile is an outline of requirements with which WSDL and Web service protocol (SOAP/HTTP) traffic must comply in order to claim WS-I conformance. In 5.1 you could set the preference for the whole workspace. In 5.1.1 you can set your preference at the project level.


JDK/SDK 1.4.1 New Features


New functions and changes included in Java SDK 1.4.1
- XML Processing
- Security
- JDBC 3.0 API
- JNDI enhancements
- Logging APIs (WebSphere App. Server will start using this in the future)
Applications can make use of new Java SDK 1.4.1 APIs
For the complete API and other new functions, refer to the Java 2 SDK 1.4.1 specification: http://java.sun.com/j2se/1.4.1/docs/relnotes/features.html

Figure C-12. JDK/SDK 1.4.1 New Features

Notes:
Here's a look at what is new in the language that underlies WebSphere 5.1. The Logging APIs introduced in SDK 1.4.1 are not integrated in WebSphere 5.1, but they are available for customer code to use and are expected to be supported in a future release of WebSphere. Full details can be found in the Java 2 SDK 1.4.1 specification at the web site listed on the slide: http://java.sun.com/j2se/1.4.1/docs/relnotes/features.html.


JDK 1.4.1 Enablement WebSphere API Changes


Why?
- Some JDK 1.4.1 APIs added methods that conflict with existing WebSphere APIs: same method name and input signature, but a different return type
Exception handling: changes to the Throwable class
- Throwable added the method StackTraceElement[] getStackTrace()
- This conflicted with the java.lang.String getStackTrace() method of the com.ibm.websphere.servlet.error.ServletErrorReport class (a Throwable subclass)
Solution
- The method name changed from getStackTrace() to getStackTraceAsString() in the ServletErrorReport class

Figure C-13. JDK 1.4.1 Enablement - WebSphere API Changes

Notes:
Some JDK 1.4.1 APIs changed and added methods that conflict with some of the existing WebSphere APIs with the same method name and input signature, but with a different return type. Therefore some of the WebSphere APIs have been modified.
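The renamed method can be pictured with a small stand-in class. The sketch below is not the WebSphere ServletErrorReport source; the class name ErrorReportSketch and the method body are assumptions made purely for illustration. It shows why a String-returning getStackTrace() can no longer be declared on a Throwable subclass, and one plausible way a getStackTraceAsString() method might render the trace.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical stand-in for an error-report class; not the WebSphere source.
public class ErrorReportSketch extends Exception {
    private static final long serialVersionUID = 1L;

    public ErrorReportSketch(String message) {
        super(message);
    }

    // From JDK 1.4 on, Throwable declares: StackTraceElement[] getStackTrace().
    // Declaring "public String getStackTrace()" here would not compile, because
    // an override cannot change the return type to an unrelated type.
    public String getStackTraceAsString() {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        printStackTrace(out);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        ErrorReportSketch report = new ErrorReportSketch("demo failure");
        StackTraceElement[] frames = report.getStackTrace(); // inherited JDK 1.4 method
        System.out.println(frames.length + " frame(s) captured");
        System.out.println(report.getStackTraceAsString());
    }
}
```

Callers that previously used the String-returning method would switch to the new name; the inherited getStackTrace() remains available for structured access to the frames.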


JDK 1.4.1 XML Parser Compatibility


JDK 1.4.1 includes the JAXP API and a parser/transformer implementation
- The JDK 1.4 shipped with WebSphere will have Xerces 4.2.2 - JAXP 1.2, SAX 2.01, DOM 2
WebSphere v5.1 will remove the Xerces/Xalan parser that was shipped in WebSphere v5.0.x, since the JDK now includes the JAXP implementation
If you bundle your own XML parser within your application, it will continue to work
- Use the PARENT_LAST class loader setting in the Application Server so that it uses the user-supplied XML parser
If you were using the WebSphere-supplied parser in v5.0.2 and before, it will default to the JDK XML parser
If you are using the Apache Xerces APIs directly, some of these APIs have changed
- For example, type casting has changed
- If you are using JAXP APIs, this will not affect you
Best practice: use JAXP APIs for XML processing instead of the direct Apache Xerces APIs

Figure C-14. JDK 1.4.1 - XML Parser Compatibility

Notes:
The Java API for XML Processing (JAXP) has been added to the Java 2 Platform. It provides basic support for processing XML documents through a standardized set of Java Platform APIs. JDK 1.4.1 includes the JAXP API and parser/transformer implementations.
Versions of JAXP, SAX, and DOM shipped with the IBM JDK: Xerces 4.2.2, JAXP 1.2, SAX 2.01, DOM 2.
Versions of JAXP, SAX, and DOM shipped with non-IBM JDKs: Crimson, JAXP 1.1, SAX 2.0, DOM 2.
Note: If you move from WebSphere v4.x to WebSphere v5.1 and you use the Apache Xerces APIs directly, your code may need to change, even if it worked unchanged when going from WebSphere v4.x to WebSphere v5.0.x.
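The best practice above, coding against JAXP rather than against a specific parser, can be sketched as follows. This is a minimal illustration, not WebSphere sample code; the class name JaxpParseSketch and the sample document are assumptions. Only javax.xml.parsers, org.w3c.dom, and org.xml.sax types appear, so the code should behave the same whichever parser implementation the JDK or the application supplies.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class JaxpParseSketch {
    // Parses the given XML text through the JAXP factory API only; no
    // parser-specific (for example org.apache.xerces) class is referenced.
    static String rootElementName(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            return document.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootElementName("<book>XML Pocket Reference</book>"));
    }
}
```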


JDK 1.4.1 Security Changes


The following security APIs/functions are now in JDK 1.4.1
- Java Cryptography Extension (JCE)
- Java Secure Socket Extension (JSSE)
- Java Authentication and Authorization Service (JAAS)
- Certification Path APIs (Certpath)
- Java Generic Security Services API (JGSS)
- PKCS and S/MIME security features

Figure C-15. JDK 1.4.1 Security Changes

Notes:
There are two new security features. The Java GSS-API can be used for securely exchanging messages between communicating applications using the Kerberos V5 mechanism. The Java Certification Path API includes new classes and methods in the java.security.cert package that allow you to build and validate certification paths (also known as certificate chains).
Due to import control restrictions, the JCE jurisdiction policy files shipped with the Java 2 SDK, v 1.4 allow strong but limited cryptography to be used. An unlimited version of these files, indicating no restrictions on cryptographic strengths, is available.
The JSSE implementation provided in this release includes strong cipher suites. However, due to U.S. export control restrictions, this release does not allow alternate pluggable SSL/TLS implementations to be used. For more information, please see the JSSE Reference Guide.


With the integration of JAAS into the Java 2 SDK, the java.security.Policy API handles Principal-based queries, and the default policy implementation supports Principal-based grant entries. Access control can now be based not just on what code is running, but also on who is running it.
Support for dynamic policies has also been added. In Java 2 SDK releases prior to version 1.4, classes were statically bound with permissions by querying security policy during class loading, and the lifetime of this binding was scoped by the lifetime of the class loader. In version 1.4 this binding is deferred until needed by a security check, and its lifetime is scoped by the lifetime of the security policy.
Finally, the graphical Policy Tool utility has been enhanced to enable specifying a Principal field indicating which user is to be granted the specified access control permissions.


Security Tools - Kerberos


Three new tools to support Kerberos tickets:
- kinit helps users obtain Kerberos tickets
- klist helps users list Kerberos tickets
- ktab helps users manage Kerberos tickets
klist is a command-line tool to list entries in the credential cache and key tab.
ktab is a command-line tool to help the user manage entries in the key table.

Figure C-16. Security Tools - Kerberos

Notes:
There are three new tools to support Kerberos tickets. These tools help users obtain, list, and manage Kerberos tickets.
- kinit is a tool for obtaining Kerberos v5 tickets.
- klist is a command-line tool to list entries in the credential cache and key tab. Equivalent functionality is available on the Solaris operating environment via the klist tool.
- ktab is a command-line tool to help the user manage entries in the key table.


JDK 1.4.1 JDBC 3.0 Support


JDBC 3.0
- Not required for J2EE 1.3
- Part of J2EE 1.4
- WebSphere v5.1 ignores the new JDBC 3.0 APIs if you use the WebSphere Connection Manager
ORB / EJB Container
- POA (Portable Object Adapter) will not be supported in v5.1
- We will continue to ship the IBM ORB in non-IBM JDKs, as before

Figure C-17. JDK 1.4.1 - JDBC 3.0 Support

Notes:
The JDBC 3.0 API, comprised of packages java.sql and javax.sql, provides universal data access from the Java programming language. Using the JDBC 3.0 API, you can access virtually any data source, from relational databases to spreadsheets and flat files. JDBC technology also provides a common base on which tools and alternative interfaces can be built. New features include the ability to set savepoints in a transaction, to keep result sets open after a transaction is committed, to reuse prepared statements, to get metadata about the parameters to a prepared statement, to retrieve keys that are automatically generated, and to have multiple result sets open at one time. There are two new JDBC data types, BOOLEAN and DATALINK, with the DATALINK type making it possible to manage data outside of a data source. This release also establishes the relationship between the JDBC Service Provider Interface and the Connector architecture.


Java Naming and Directory Interface (JNDI)


An Internet Domain Naming System (DNS) service provider has been added that enables applications to read data stored in the DNS.
The JNDI Lightweight Directory Access Protocol (LDAP) service provider has security enhancements.
- Enables applications to establish secure sessions over existing LDAP connections.
- Enables applications to use different authentication protocols.
The JNDI CORBA Object Services (COS) naming service provider now supports Interoperable Naming Service (INS).

Figure C-18. Java Naming and Directory Interface (JNDI)

Notes:
Enhancements since version 1.4.0
DNS service provider
- Support for controlling timeouts when submitting UDP queries.
- Support for automatic discovery of DNS service.
LDAP service provider
- Support for connection pooling.
- Support for automatic discovery of LDAP service via DNS.
- Support for use of multiple URLs for configuration.


JVM Enhancements
- Signal-chaining facility
- Error-reporting mechanism
- New command-line option for performing additional Java Native Interface (JNI) checks
- New facility for logging garbage-collection events

Figure C-19. JVM Enhancements

Notes:
The Java virtual machines in this release include several enhancements.
Signal-chaining facility. Signal-chaining enables the Java Platform to better interoperate with native code that installs its own signal handlers.
- Support for pre-installed signal handlers when the HotSpot VM is created.
- Support for signal handler installation after the HotSpot VM is created, inside JNI code or from another native thread.
64-bit support on Solaris-SPARC platform edition.
Error-reporting mechanism. The information provided by the new error-reporting mechanism will allow developers to debug their applications more easily and efficiently. If an error message indicates a problem in the JVM code itself, it will allow a developer to submit a more accurate and helpful bug report.
New command-line option for performing additional Java Native Interface (JNI) checks.

New facility for logging garbage-collection events.
Rapid memory allocation and garbage collection. The virtual machine provides rapid memory allocation for objects and has a fast, efficient, state-of-the-art garbage collector.
The Classic virtual machine is no longer shipped as part of the Java 2 SDK.


AWT
- New focus architecture
- New full-screen exclusive mode API supports high performance graphics by suspending the windowing system
- Headless support is now enabled by methods that indicate whether a display, keyboard, and mouse can be supported in a graphics environment
- Mouse wheel support
- Now 64-bit compliant

Figure C-20. AWT

Notes:
Changes to the AWT package center on improving the robustness, behavior, and performance of programs that present a graphical user interface.
A new focus architecture replaces the previous implementation and addresses many focus-related bugs caused by platform inconsistencies and by incompatibilities between AWT and Swing components.
The new full-screen exclusive mode API supports high performance graphics by suspending the windowing system so that drawing can be done directly to the screen; this benefits applications like games and other rendering-intensive applications.
Headless support is now enabled by new graphics environment methods that indicate whether a display, keyboard, and mouse can be supported in a graphics environment.
The ability to disable native frame decorations is now available for applications that need to take full control of how a frame will look; when enabled, this prevents the rendering of a native title bar, system menu, border, or other native operating system dependent screen components.
The oft-requested mouse wheel, with a scroll wheel in place of the middle mouse button, is enabled with new built-in Java support for scrolling via the mouse wheel. Also, a new mouse wheel listener class allows customization of mouse wheel behavior.
The AWT package has been modified to be fully 64-bit compliant and now runs on Solaris machines with 64-bit and 32-bit addresses.
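As a small illustration of the headless support described above, the sketch below queries the graphics environment before any window is created. The class name HeadlessCheckSketch and the message strings are assumptions made for this example; GraphicsEnvironment.isHeadless() is the standard java.awt method added in 1.4.

```java
import java.awt.GraphicsEnvironment;

public class HeadlessCheckSketch {
    // Reports whether this JVM can support a display, keyboard, and mouse.
    static String describeEnvironment() {
        return GraphicsEnvironment.isHeadless()
                ? "headless: no display, keyboard, or mouse available"
                : "display, keyboard, and mouse can be supported";
    }

    public static void main(String[] args) {
        // A server-side program might run this check before creating any Frame.
        System.out.println(describeEnvironment());
    }
}
```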


Swing
New Spinner
- Allows a component user to select a number or a value by cycling through a sequence of values using a tiny pair of up/down arrow buttons
New formatted text field
- Component allows formatting of dates, numbers, and strings
New drag and drop architecture
- Provides seamless drag and drop support between components
Progress bar
- Uses constant animation to show that a time-consuming operation is occurring
Scrollable tabs
- Now supported in tabbed pane component
Popup and popup factory
- Classes exposed and made public
New Focus architecture
- Fully integrated into Swing

Figure C-21. Swing

Notes:
Many new features have been added to Swing.
The new spinner component is a single line input field that allows the user to select a number or a value by cycling through a sequence of values using a tiny pair of up/down arrow buttons.
The new formatted text field component allows formatting of dates, numbers, and strings, such as a text field that accepts only decimal money values.
The Windows look and feel implementation is updated to track features available in the 2000/98 versions.
A new drag and drop architecture provides seamless drag and drop support between components as well as an easy way to implement drag and drop in your customized Swing components: writing a couple of methods that describe the particulars of your data model is all that is required.
Swing's progress bar component has been enhanced to support an indeterminate state; rather than showing the degree of completeness, the indeterminate progress bar uses constant animation to show that a time-consuming operation is occurring.
Due to great customer demand, the tabbed pane component has been enhanced to support scrollable tabs. With this feature enabled, if all the tabs will not fit within a single tab run, the tabbed pane component displays a single, scrollable run of tabs instead of wrapping the tabs onto multiple runs.
The popup and popup factory classes, which were previously package private, have been exposed and made public so that programmers may customize or create their own pop-ups.
The new focus architecture is fully integrated into Swing.
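The "cycling through a sequence of values" behavior of the spinner comes from its model. The sketch below uses only the model class, so it needs no display; the class name SpinnerModelSketch and the range 0 to 10 with a step of 1 are assumptions chosen for illustration.

```java
import javax.swing.SpinnerNumberModel;

public class SpinnerModelSketch {
    // Returns the value the up arrow would move to from 'start', or null when
    // 'start' is already at the top of the (assumed) range 0..10.
    static Object valueAfterOneStepUp(int start) {
        SpinnerNumberModel model = new SpinnerNumberModel(start, 0, 10, 1);
        return model.getNextValue();
    }

    public static void main(String[] args) {
        System.out.println(valueAfterOneStepUp(5));
        System.out.println(valueAfterOneStepUp(10));
    }
}
```

In a real user interface the same model would be passed to a JSpinner, which draws the input field and the pair of arrow buttons.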


Java Logging APIs


The Java Logging APIs are introduced in package java.util.logging
The core package includes support for delivering
- Plain text logs
- XML-formatted logs

Figure C-22. Java Logging APIs

Notes:
The Java Logging APIs facilitate software servicing and maintenance at customer sites by producing log reports suitable for analysis by end users, system administrators, field service engineers, and software development teams. The Logging APIs capture information such as security failures, configuration errors, performance bottlenecks, and bugs in the application or platform.
Logger: The main entity on which applications make logging calls. A Logger object is used to log messages for a specific system or application component.
LogRecord: Used to pass logging requests between the logging framework and individual log handlers.
Handler: Exports LogRecord objects to a variety of destinations including memory, output streams, consoles, files, and sockets. A variety of Handler subclasses exist for this purpose. Additional Handlers may be developed by third parties and delivered on top of the core platform.


Level: Defines a set of standard logging levels that can be used to control logging output. Programs can be configured to output logging for some levels while ignoring output for others.
Filter: Provides fine-grained control over what gets logged, beyond the control provided by log levels. The logging APIs support a general-purpose filter mechanism that allows application code to attach arbitrary filters to control logging output.
Formatter: Provides support for formatting LogRecord objects. This package includes two formatters, SimpleFormatter and XMLFormatter, for formatting log records in plain text or XML respectively. As with Handlers, additional Formatters may be developed by third parties.
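The pieces described in these notes (Logger, Handler, Level, Filter, Formatter) can be wired together in a few lines. The following is a minimal sketch rather than a recommended configuration: the logger name com.example.orders, the messages, and the filter rule are hypothetical, and the lambda syntax assumes a current JDK rather than 1.4.1. Swapping SimpleFormatter for XMLFormatter would produce XML-formatted records instead of plain text.

```java
import java.io.ByteArrayOutputStream;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

public class LoggingSketch {
    // Sends three records through a Logger -> Handler -> Formatter chain and
    // returns the text that the handler actually wrote.
    static String logToString() {
        Logger logger = Logger.getLogger("com.example.orders"); // hypothetical name
        logger.setUseParentHandlers(false); // keep output out of the console
        logger.setLevel(Level.INFO);        // Level: ignore anything below INFO

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        StreamHandler handler = new StreamHandler(sink, new SimpleFormatter());
        // Filter: drop records whose message mentions "secret" (illustrative rule).
        handler.setFilter(record -> !record.getMessage().contains("secret"));
        logger.addHandler(handler);

        logger.info("order accepted");           // published
        logger.fine("cache hit");                // below INFO, discarded by the logger
        logger.warning("secret token rejected"); // rejected by the filter

        handler.flush();
        logger.removeHandler(handler);
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.print(logToString());
    }
}
```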


Java Debugging
- Full Speed Debugging Support
- HotSwap allows a class to be updated while under the control of a debugger
- Instance Filters
- Support For Debugging Other Languages
- VMDeathRequests

Figure C-23. Java Debugging

Notes:
Full Speed Debugging Support: In previous versions, when debugging was enabled, the program executed using only the interpreter. Now the full performance advantage of the HotSpot VM is available to programs running with debugging enabled. The improved performance allows long-running programs to be debugged more easily. It also allows testing to proceed at full speed, with a debugger launched only when an exception occurs.
HotSwap has been added to allow a class to be updated while under the control of a debugger.
EventRequests can now specify an instance filter, which restricts the events generated by the request to those in which the currently executing instance is the object specified.
There is now support for debugging other languages.


The Java Platform Debugger Architecture has been extended so that non-Java programming language source, which is translated to Java programming language source, can be debugged in the future. With VMDeathRequests a request can now be made to control target VM termination notification, allowing clean shutdown synchronization. Using class VMDeathRequest, a request can be made for notification when the target VM terminates. When an enabled VMDeathRequest is satisfied, an EventSet containing a VMDeathEvent will be placed on the EventQueue.


RMI
RMI
- Server-side stack traces now retained in remote exceptions
- Service Provider Interface for RMIClassLoader
- Dynamic server host name
Serialization
- Support for deserialization of objects that are known to be unshared in the data-serialization stream
- Support for a class-defined readObjectNoData method
- Important bug fixes

Figure C-24. RMI

Notes:
Server-side stack traces now retained in remote exceptions. The RMI runtime implementation now preserves the server-side stack trace information of an exception that is thrown from a remote call, in addition to filling in the client-side stack trace as it did in previous releases. Therefore, when such an exception becomes accessible to client code, its stack trace contains all of its original server-side trace data followed by the client-side trace.
Service Provider Interface for RMIClassLoader. Certain static methods of java.rmi.server.RMIClassLoader now delegate their behavior to an instance of a new service provider interface, java.rmi.server.RMIClassLoaderSpi. This service provider object can be configured to augment RMI's dynamic class loading behavior for a given application. By default, the service provider implements the standard behavior of all of the static methods in RMIClassLoader.


Javadoc 1.4.1
New Features
- Added package and class names as keywords in <META> tags. This should improve search results for search engines that look at <META> tags
Bug Fixes
- About two dozen bug fixes, in these areas: Running Javadoc, Error/Warning Messages, Command Line Options, Tags, Miscellaneous

Figure C-25. Javadoc 1.4.1

Notes:
- Fixed bug where it mistakenly documented .class files found on classpath (if they belonged to packages passed in on the command line).
- Fixed -use option, which was severely broken.
- Fixed -link option to handle absolute paths.
- Fixed -encoding option for reading source files.
- Fixed {@docRoot}, which had been inserting an extra slash (/).
- Fixed {@inheritDoc}, which did not work.
- Added interface constants to the Constant Field Values list.


Summary of New Features and Enhancements


XML Processing; New I/O APIs; Security; Java 2D technology; Image I/O Framework; Java Print Service; AWT; Swing; Drag and Drop; Logging API; Java Web Start Product; Long-term Persistence of JavaBeans Components; JDBC 3.0 API; Assertion Facility; Preference API; Chained Exception Facility; Endorsed Standards Override Mechanism; Java Virtual Machines; Performance; Networking Support, including IPv6; RMI; Serialization; Java Naming and Directory Interface (JNDI); CORBA, Java IDL, and RMI-IIOP; Java Platform Debugger Architecture; Internationalization; Java Plug-in Product; Collections Framework; Accessibility; Regular Expressions; Math; Reflection; Java Native Interface; Tools and Utilities

Figure C-26. Summary of New Features and Enhancements

Notes:
There are some enhancements to all the areas listed on this chart. For additional details on each of these features, you can refer to the Java 2 SDK, Standard Edition, version 1.4.1 information online at http://java.sun.com/j2se/1.4.1/docs/relnotes/features.html


Unit Summary
- New Rapid Application Development tools
- Support of Java Server Faces
- Build and manage Web site development rather than page development
- Numerous additional improvements and enhancements present in WSAD v5.1.1 and JDK 1.4.1

Figure C-27. Unit Summary

Notes:
WebSphere Studio v5.1.1 offers a number of new features, with the focus around new Rapid Application Development with Java Server Faces. There are also new Web site tool enhancements and noticeable performance improvements. You will also find many smaller enhancements which have been added throughout the product.


Appendix D. Additional Information and Examples


Welcome to:

Unit 5. Document Type Definitions


Figure D-1. Unit 5. Document Type Definitions

XM3014.1

Notes:


Declaring Attributes in a DTD - Example


Declaration:
<!ELEMENT book (#PCDATA)>
<!ATTLIST book
  id        ID                     #IMPLIED
  isbn      CDATA                  #REQUIRED
  booktype  (Hardcover|Paperback)  "Paperback"
  storeloc  CDATA                  "5th Avenue"
  year      CDATA                  #FIXED "2000"
  comment   CDATA                  #IMPLIED
>

Valid XML:
<book isbn="1-56592-709-5" storeloc="Times Square">XML Pocket Reference</book>

Figure D-2. Declaring Attributes in a DTD - Example

Notes:
In the example:
- The id attribute is a unique ID but is not required.
- The isbn attribute is required and has no default value.
- The booktype attribute is omitted from the document, so the parser supplies "Paperback" from the default value in the DTD.
- The storeloc attribute is not required; if it were not present, it would default to "5th Avenue". Here the default is overridden by the value "Times Square".
- The year attribute gets its fixed value from the DTD.
- No comment attribute was included; since this is an implied attribute, its use is optional. If it were used, the start tag would become <book isbn="1-56592-709-5" storeloc="Times Square" comment="A handy XML pocket reference.">
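The defaulting behavior described in these notes can be observed with any JAXP parser. The sketch below is one way to check it, under stated assumptions: the class name BookDefaultsSketch is hypothetical, and the DTD from the slide is embedded as an internal subset so that the example is self-contained. After parsing, booktype and year should be reported with the values supplied by the DTD, storeloc with the value from the document, and comment should be absent.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class BookDefaultsSketch {
    // The slide's declaration, placed in an internal DTD subset.
    static final String DOCUMENT =
          "<!DOCTYPE book [\n"
        + "<!ELEMENT book (#PCDATA)>\n"
        + "<!ATTLIST book\n"
        + "  id        ID                     #IMPLIED\n"
        + "  isbn      CDATA                  #REQUIRED\n"
        + "  booktype  (Hardcover|Paperback)  \"Paperback\"\n"
        + "  storeloc  CDATA                  \"5th Avenue\"\n"
        + "  year      CDATA                  #FIXED \"2000\"\n"
        + "  comment   CDATA                  #IMPLIED\n"
        + ">\n"
        + "]>\n"
        + "<book isbn=\"1-56592-709-5\" storeloc=\"Times Square\">XML Pocket Reference</book>";

    // Parses DOCUMENT with validation on and returns the book element.
    static Element parseBook() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            return builder.parse(new InputSource(new StringReader(DOCUMENT)))
                          .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("sample document failed to parse", e);
        }
    }

    public static void main(String[] args) {
        Element book = parseBook();
        for (String name : new String[] {"isbn", "booktype", "storeloc", "year"}) {
            System.out.println(name + " = " + book.getAttribute(name));
        }
        System.out.println("comment present: " + book.hasAttribute("comment"));
    }
}
```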


Attribute Value Processing


XML processors perform two steps when processing attributes:
1. Attribute value normalization
   - Replace all entity references by their replacement text.
   - Replace all space, tab, carriage-return, and line-feed characters with spaces.
2. Whitespace normalization for tokenized* types
   - Discard any leading or trailing spaces.
   - Replace any sequence of space characters (#x20) by a single space character.
* Described on the following chart.

Figure D-3. Attribute Value Processing

Notes:
In order to understand attribute value types, you need to understand how attribute values are processed by an XML processor. When an XML processor processes a document, it performs two operations on the values of attributes:
1. Attribute value normalization
   a. Replace entity references with their replacement text (more on entities later).
   b. Convert each space, tab, carriage-return, and line-feed character to a space.
2. Whitespace crunching
   The processor trims leading and trailing whitespace and replaces multiple #x20 characters (spaces) by a single one.
* NMTOKEN (name token) and NMTOKENS are two examples of a tokenized type. This is discussed in greater detail later.
The URL of the XML specification is: http://www.w3.org/TR/1998/REC-xml-19980210
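The two steps can be seen by parsing a document whose attribute values contain tabs, line feeds, and runs of spaces. This is a sketch under stated assumptions: the item element, its label and tags attributes, and the class name AttributeNormalizationSketch are invented for illustration. The CDATA attribute should keep its leading and trailing spaces, with the tab and line feed each turned into a space, while the NMTOKENS attribute should also be trimmed and collapsed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class AttributeNormalizationSketch {
    // Hypothetical document: label is a string type, tags is a tokenized type.
    static final String DOCUMENT =
          "<!DOCTYPE item [\n"
        + "<!ELEMENT item EMPTY>\n"
        + "<!ATTLIST item\n"
        + "  label CDATA    #IMPLIED\n"
        + "  tags  NMTOKENS #IMPLIED>\n"
        + "]>\n"
        + "<item label=\"  two\twords\nhere \" tags=\"  red   green  \"/>";

    // Returns the value of the named attribute as the parser reports it.
    static String normalized(String attributeName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            Element item = builder.parse(new InputSource(new StringReader(DOCUMENT)))
                                  .getDocumentElement();
            return item.getAttribute(attributeName);
        } catch (Exception e) {
            throw new IllegalStateException("sample document failed to parse", e);
        }
    }

    public static void main(String[] args) {
        // Brackets make leading and trailing spaces visible.
        System.out.println("label = [" + normalized("label") + "]");
        System.out.println("tags  = [" + normalized("tags") + "]");
    }
}
```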

Aside. . .Whitespace, Names, and Attributes


In XML, a name is a token that
- Begins with a letter, _ or : (names beginning with "xml", in any combination of upper and lower case, are reserved)
- Is followed by any of the name characters specified by ISO/IEC 10646
Whitespace consists of one or more
- Space (#x20) characters
- Carriage returns
- Line feeds
- Tabs
An Nmtoken (name token) is any mixture of name characters
An Nmtokens or NMTOKENS is an Nmtoken followed by whitespace followed by another Nmtoken followed by . . .
A Names is a Name followed by whitespace followed by another Name followed by . . .
AttributeType can be one of these
- String type, which is CDATA, or
- Tokenized type, or
- Enumerated type

Figure D-4. Aside. . .Whitespace, Names, and Attributes

Notes:
The URL for the ISO/IEC 10646 reference in the XML specification is http://www.w3.org/TR/1998/REC-xml-19980210#ISO10646
Look at these terms as part of the vocabulary of XML. The precision is necessary, in part, because of the international nature of XML. The 1998 XML 1.0 specification literally defines whitespace as S, where S ::= (#x20 | #x9 | #xD | #xA)+. The | represents "or" and the + means "at least one".
These terms permeate the XML vocabulary. We'll have more to say about them throughout this course. The specification uses these categories to catalog attribute types. The categories and their contents are covered on subsequent charts.
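The definitions above translate naturally into regular expressions. The sketch below is deliberately simplified: the whitespace pattern follows the S production exactly, but the Name and Nmtoken patterns cover ASCII letters, digits, and punctuation only, whereas the specification admits a much larger set of ISO/IEC 10646 characters. The class and method names are assumptions for illustration.

```java
import java.util.regex.Pattern;

public class XmlTokensSketch {
    // S ::= (#x20 | #x9 | #xD | #xA)+
    static final Pattern WHITESPACE = Pattern.compile("[\\x20\\x09\\x0D\\x0A]+");
    // ASCII-only approximation of NameChar: letters, digits, '.', '-', '_', ':'
    static final Pattern NMTOKEN = Pattern.compile("[A-Za-z0-9._:-]+");
    // A Name starts with a letter, '_' or ':' and continues with name characters.
    static final Pattern NAME = Pattern.compile("[A-Za-z_:][A-Za-z0-9._:-]*");

    static boolean isWhitespace(String text) {
        return WHITESPACE.matcher(text).matches();
    }

    static boolean isNmtoken(String text) {
        return NMTOKEN.matcher(text).matches();
    }

    static boolean isName(String text) {
        return NAME.matcher(text).matches();
    }

    // Nmtokens: one or more Nmtoken values separated by single spaces, which
    // is what remains after attribute value normalization.
    static boolean isNmtokens(String text) {
        for (String token : text.split(" ")) {
            if (!isNmtoken(token)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isName("serialNumber"));
        System.out.println(isNmtoken("0abc"));
        System.out.println(isName("0abc"));
        System.out.println(isNmtokens("red green blue"));
    }
}
```

Note the difference between the two token kinds: a digit can start an Nmtoken but cannot start a Name.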


Attribute Types: String Type: CDATA Example


Syntax:
<!ATTLIST elementName attributeName CDATA defaultDecl>

Declaration:
<!ELEMENT box EMPTY>
<!ATTLIST box height CDATA #REQUIRED>
<!ATTLIST box width CDATA #REQUIRED>

Valid XML fragment:
<box height="32" width="12"/>

or
<box height="#32" width="1#@%^ 2"/>

Not well-formed XML fragment (use of a bare & inside the attribute value):
<box height="#32" width="1#@%&^ 2"/>

Figure D-5. Attribute Types: String Type: CDATA Example

Notes:
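A short sketch of how the rejected fragment could be repaired: inside an attribute value a literal & has to be written as &amp; (and a literal < as &lt;). The box declarations are the ones from the chart; wrapping them in a complete document is an assumption made so the example stands alone.

```xml
<?xml version="1.0"?>
<!DOCTYPE box [
  <!ELEMENT box EMPTY>
  <!ATTLIST box height CDATA #REQUIRED>
  <!ATTLIST box width  CDATA #REQUIRED>
]>
<!-- The reference &amp; is reported to the application as a single
     ampersand, so the width value it receives is 1#@%&^ 2 -->
<box height="#32" width="1#@%&amp;^ 2"/>
```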


Attribute Types: Tokenized Types: NMTOKEN Example


Syntax:
<!ATTLIST elementName attributeName NMTOKEN defaultDecl>

Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee serialNumber NMTOKEN #REQUIRED>

Valid XML fragment:
<employee serialNumber="e00001">Joe Smith</employee>
<employee serialNumber="e00001">Bill Smith</employee>
<employee serialNumber="e00001">John Smith</employee>

Invalid XML fragments:
<employee serialNumber="#00001">Joe Smith</employee>
<employee serialNumber="e 00001">John Smith</employee>

Figure D-6. Attribute Types: Tokenized Types: NMTOKEN Example

Notes:
This foil shows a declaration for a required attribute of type NMTOKEN. The valid example is valid because all the serialNumber values are proper name tokens; that they are identical is irrelevant. The first invalid example is invalid because # is not a name character, so #00001 is not a name token. The second example is invalid because of the whitespace between the first and second characters: a single NMTOKEN cannot contain whitespace.


Attribute Types: Tokenized Types: NMTOKENS Example


Syntax:
<!ATTLIST elementName attributeName NMTOKENS defaultDecl>

Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee serialNumber NMTOKENS #REQUIRED>

Valid XML fragment:
<employee serialNumber="e00001">Joe Smith</employee>
<employee serialNumber="e00001">Bill Smith</employee>
<employee serialNumber="e00001">John Smith</employee>
<employee serialNumber="e 00001">John Smith</employee>

Invalid XML fragments:
<employee serialNumber="# 0 0001">Joe Smith</employee>

Figure D-7. Attribute Types: Tokenized Types: NMTOKENS Example

Notes:
This foil shows a declaration for a required attribute of type NMTOKENS. The valid example is valid because all the serialNumber values are proper name tokens; that they are identical is irrelevant. The difference between this chart and the previous chart is that the plural nature of NMTOKENS now permits interspersing the name tokens with whitespace. The invalid example is invalid because # is not a name character, so the first token in the list, #, is not a valid name token.


Attribute Types: Tokenized Types: ID Example


Syntax:
<!ATTLIST elementName attributeName ID defaultDecl>

Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee serialNumber ID #REQUIRED>

The rules for XML names require that the first character not be a digit.

Valid XML fragment:
<employee serialNumber="e00001">Joe Smith</employee>
<employee serialNumber="e00002">Bill Smith</employee>
<employee serialNumber="_00002">John Smith</employee>

Invalid XML fragment:
<employee serialNumber="e00001">Joe Smith</employee>
<employee serialNumber="e00001">John Smith</employee>
<employee serialNumber="00002">John Smith</employee>

Figure D-8. Attribute Types: Tokenized Types: ID Example

Notes:
This foil shows a declaration for a required attribute of type ID. According to the syntax rules for IDs, an ID value must be an XML name, so it cannot begin with a digit. That is why the serialNumber values begin with a letter or an underscore. The valid example is valid because all the serialNumber values are distinct, even though the second and third values have the same numerical part. Arguably the numerical overlap may be unintentional, perhaps from the merger of two departments each with a similar XML document. The invalid example is invalid because the first two employees have the same serialNumber value and the third employee's serial number starts with an illegal character.


Attribute Types: Tokenized Types: IDREF Example


Syntax:
<!ATTLIST elementName attributeName IDREF defaultDecl>

Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee serialNumber ID #REQUIRED>
<!ATTLIST employee manager1 IDREF #IMPLIED>
<!ATTLIST employee manager2 IDREF #IMPLIED>

Valid XML fragment (the last two lines are alternatives; a single document could not contain both, because they share the ID value e00003):
<employee serialNumber="e00001">Joe Smith</employee>
<employee serialNumber="e00002">Bill Smith</employee>
<employee serialNumber="e00003" manager1="e00001">John Smith</employee>
<employee serialNumber="e00003" manager1="e00001" manager2="e00002">John Smith</employee>

Invalid XML fragment: manager1="e00004" if no element within the document has the ID value e00004.

Figure D-9. Attribute Types: Tokenized Types: IDREF Example

Notes:
This foil shows declarations for optional (#IMPLIED) attributes of type IDREF. According to the syntax rules for IDs, an ID value cannot begin with a digit. That is why the serialNumber values begin with a letter. Aside from naming rules, managerN could have any value as long as some element in the document has an ID attribute with that value. Consequently, an employee could be self-managed! The uniqueness constraint applies to IDs, not to IDREFs, so the employee could be self-managed twice: both manager1 and manager2 could have the same value.
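The points in these notes can be put together in one document that should validate. The staff wrapper element and the fourth employee are assumptions added for illustration; the employee declarations are the ones from the chart.

```xml
<?xml version="1.0"?>
<!DOCTYPE staff [
  <!ELEMENT staff (employee+)>
  <!ELEMENT employee (#PCDATA)>
  <!ATTLIST employee serialNumber ID    #REQUIRED>
  <!ATTLIST employee manager1     IDREF #IMPLIED>
  <!ATTLIST employee manager2     IDREF #IMPLIED>
]>
<staff>
  <employee serialNumber="e00001">Joe Smith</employee>
  <employee serialNumber="e00002">Bill Smith</employee>
  <!-- Two different managers, each an ID value that exists in this document. -->
  <employee serialNumber="e00003" manager1="e00001" manager2="e00002">John Smith</employee>
  <!-- Self-managed twice: both IDREFs point at this element's own ID. -->
  <employee serialNumber="e00004" manager1="e00004" manager2="e00004">Ann Jones</employee>
</staff>
```

Changing any manager value to one that no serialNumber carries, for example e00009, would make the document invalid while leaving it well-formed.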


Attribute Types: Enumerated Types: Enumeration Example


Syntax:
<!ATTLIST elementName attributeName (evalA|evalB| ..) defaultDecl>

Declaration:
<!ELEMENT shirt (#PCDATA)>
<!ATTLIST shirt size (small|medium|large) #REQUIRED>

Valid XML:
<shirt size="small">plaid polyester</shirt>
<shirt size="large">white poplin</shirt>

Invalid XML:
<shirt size="XXL">navy pullover</shirt>

Figure D-10. Attribute Types: Enumerated Types: Enumeration Example

Notes:
This foil shows an example of an attribute with an enumerated value. Valid documents will only have values that appear in the declared list. The valid examples both take their size values from the list of small, medium, or large. The invalid example is invalid because XXL is not in that list.


What Is an Entity?
"Reasonable people may disagree: many of us believe...that entities are simply slightly-constrained storage units with no other special significance..." David Megginson
An entity can be internal (to the XML instance before you) or external (to the XML instance before you)
  If it is external, that is a good reason to use the standalone attribute in the XML Declaration statement (standalone="no")
An entity can be a parsed or unparsed entity
parsed entities
  Reference textual content only
  Get replaced by the actual content during parsing
  Can be used anywhere in the document
unparsed entities
  Can reference any type of content
  Must be associated with a NOTATION
  Parser only passes entity and notation data to the application
  Can only be used as an attribute value, not valid as element content

Figure D-11. What Is an Entity?

Notes:
David Megginson's work in XML is well-known. Another common analogy likens an entity to a macro, on the one hand, and a constant, on the other hand. This arguably better captures the internal/external duality of entities. Go back to the ENTITYattribute.xml example and see what error you receive if you delete the NDATA declaration from the ENTITY statement. (This breaks the tie to the NOTATION statement.) The following charts further develop these concepts.


What Is Allowed. . . Declaring ENTITYs


Syntax:
<!ENTITY entityName "replacementText">

Usage:
To use an ENTITY, place a reference of the form &entityName; in the XML document. Each reference to the entityName is replaced with the replacement text.
ENTITYs can be declared internal to the file or external using a URI in place of the "replacementText".*
<!ENTITY entityName SYSTEM "URI">

Examples follow. . .but first: Parsed entities Unparsed entities

*URI = Uniform Resource Identifier: URLs are examples of URIs.

Figure D-12. What Is Allowed. . . Declaring ENTITYs

Notes:
An XML document can be composed of many storage units. These storage units are called entities. The document is composed by combining all of the storage units. There are different names and rules for entities that are used in the document proper and for entities that appear in the DTD. The next group of charts examines the various combinations available in a DTD to introduce additional information.
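Both declaration forms can be sketched in one small document. The element name letter, the entity names company and legal, and the file legal.txt are assumptions made for illustration; a processor that does not read external entities may leave the legal reference unexpanded.

```xml
<?xml version="1.0"?>
<!DOCTYPE letter [
  <!ELEMENT letter (#PCDATA)>
  <!-- Internal parsed entity: the replacement text is given in place. -->
  <!ENTITY company "International Business Machines Corporation">
  <!-- External parsed entity: the replacement text lives at the URI. -->
  <!ENTITY legal SYSTEM "legal.txt">
]>
<letter>Prepared for &company;. &legal;</letter>
```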


Parsed Entities
Parsed entities
  Reference textual content only
  Get replaced by the actual content during parsing
  Can be used anywhere in the document

Figure D-13. Parsed Entities

Notes:
Generally speaking, an entity is a reference to content. The entity reference is really just a placeholder for the content it refers to. When the parser replaces the entity reference with the replacement text, it does the replacement and then continues parsing as normal. This means that the replacement text is actually parsed AFTER the entity replacement is done. If the replacement text contains XML markup, it must be well-formed (and valid if doing a validating parse). If the replacement text contains the characters < or & as literal data, they must be escaped. As a best practice, >, ', and " should also be escaped. Caution: when escaping characters, always be careful not to lose the ; because of its small size relative to other characters.
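A sketch of replacement text that is parsed after substitution. All element and entity names here (memo, para, emph, warn, rd) are invented for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (para+)>
  <!ELEMENT para (#PCDATA | emph)*>
  <!ELEMENT emph (#PCDATA)>
  <!-- The replacement text contains markup; it is parsed after substitution,
       so the emph element must be well-formed and allowed inside para. -->
  <!ENTITY warn "<emph>Caution:</emph> check every semicolon">
  <!-- A literal ampersand in the replacement text is written as a
       reference to the built-in amp entity. -->
  <!ENTITY rd "R&amp;D">
]>
<memo>
  <para>&warn;</para>
  <para>Sent by &rd;</para>
</memo>
```

Dropping the closing emph tag from the warn declaration would make every document that references warn not well-formed, even though the declaration itself would still be accepted.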


Built-in Entities
The five built-in character entities in XML:
entity    description       character
&lt;      "less than"       <
&gt;      "greater than"    >
&quot;    "quote"           "
&apos;    "apostrophe"      '
&amp;     "ampersand"       &

To reference other characters, use decimal or hex Unicode values
Syntax:
&#nnn; (nnn is a decimal value)
&#xnnn; (nnn is a hex value)

Examples:
&#32; = space
&#x20; = space
&#960; = π
&#x03C0; = π

Figure D-14. Built-in Entities

Notes:
XML also contains five built-in entities used to refer to characters that are reserved by XML. Without the built-in entities, if you write a "<" (less than sign) in a document, the processor may be unable to determine whether you are trying to start a tag or not. There are places where you can use the actual character instead of the built-in entity, but we recommend always using the built-in entity, except in CDATA sections, to avoid problems. As a best practice, these entities should be declared in your DTD, and any use of the characters should be escaped. Strictly speaking, though, only & and < must always be escaped, in element content as well as in attribute values. In an attribute value, whichever of " and ' delimits the value must also be escaped.
This is a good place to mention character references as well. XML is based on Unicode, and there are more characters in the Unicode character set than there are on most keyboards in the world. In order to allow you to enter any Unicode character, XML provides a mechanism for character references. Character references look very similar to entity references, except that they begin with "&#" instead of "&". You can specify the Unicode character you want by providing the base 10 number of the character between the &# and the ;. By using &#x instead of &#, you can provide a hexadecimal number instead of a decimal number. When the XML processor encounters the character reference, it will insert the proper Unicode character. 960 is the Greek letter pi (π), the math character for 3.141...; x03C0 is its hexadecimal equivalent.
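The built-in entities and both forms of character reference can be sketched in one small well-formed document. The element and attribute names (formula, text, symbol, spaced, dec, hex) are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<formula>
  <!-- The lt and amp references keep the parser from seeing markup in the text. -->
  <text>if a &lt; b &amp;&amp; b &lt; c</text>
  <!-- Decimal and hexadecimal references to the same character, pi. -->
  <symbol dec="&#960;" hex="&#x03C0;"/>
  <!-- Decimal 32 and hexadecimal 20 both produce an ordinary space. -->
  <spaced>left&#32;right&#x20;end</spaced>
</formula>
```

An application reading this document sees the text content "if a < b && b < c", the same one-character value in both attributes of symbol, and "left right end" in spaced.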


What Is Allowed. . . Declaring NOTATIONs


NOTATIONs describe unparsed data. A NOTATION specifies an identifier (typically a URI) that the XML processor passes to the application, which can use it to locate a helper application to process data in that notation.

Figure D-15. What Is Allowed. . . Declaring NOTATIONs

Notes:
XML provides notations as a way of describing unparsed data - frequently this is data that is not XML, such as binary data. Notation names appear as the value of NOTATION valued attributes and as part of the specification of an unparsed entity. The declaration for a notation follows in the next chart. Along with the name of the notation, this is really the only support for processing notation data.


NOTATIONs and Unparsed Entities


Definition: NOTATIONs
  Describe how unparsed data should be handled by the application
Syntax:
<!NOTATION notationName PUBLIC "publicURI" "systemURI">
<!NOTATION notationName PUBLIC "publicURI">
<!NOTATION notationName SYSTEM "systemURI">

Definition: Unparsed entities
  Can reference any type of content
  Must be associated with a NOTATION
  Parser only passes entity and notation data to the application
  Can only be used as an attribute value, not valid as element content
  Always external to the document
Syntax:
<!ENTITY entityName SYSTEM "systemURI" NDATA notationName>
<!ENTITY entityName PUBLIC "publicURI" "systemURI" NDATA notationName>

Figure D-16. NOTATIONs and Unparsed Entities

Notes:
Another way to declare an unparsed entity is <!ENTITY entityName PUBLIC "publicURI" "systemURI" NDATA notationName>. In this case, the parser must understand the publicURI. When processing an unparsed entity or notation, the parser does not actually go out and retrieve the information. Rather, it passes the entity and/or notation information to the application. It is the application's responsibility to determine how to use that information to process the unparsed entity data. An unparsed entity is always external to the document.
Unlike the other declaration structures, when using a notation that references a public URI, a system URI does not always need to be provided. This is because the notation information is not actually processed by the parser. It is meant only as a hint for the application to use in order to process the associated data.
One way to look at this is that a notation provides type information, and an unparsed entity references an external file of that type.
An example follows.


Declaring NOTATIONs and Unparsed Entities: Example


Declaration:
<!NOTATION txt SYSTEM "/usr/bin/vi">
<!NOTATION html PUBLIC "http://www.mozilla.org">
<!NOTATION pdf PUBLIC "http://www.adobe.com" "/acrobat.exe">
<!ENTITY company PUBLIC "http://www.ibm.com" "ibm.html" NDATA html>
<!ENTITY location SYSTEM "buildingInfo.txt" NDATA txt>
<!ELEMENT consultant (#PCDATA)>
<!ATTLIST consultant
    companyAtts ENTITIES #REQUIRED
    resume CDATA #REQUIRED
    resumefrmt NOTATION (txt | pdf) #IMPLIED >

Valid XML:
<consultant companyAtts="company location" resume="emp1.pdf" resumefrmt="pdf">Scott Karabin </consultant>

Figure D-17. Declaring NOTATIONs and Unparsed Entities: Example

Notes:
Here is an example. In practice, an ENTITY declaration would most likely appear before its associated NOTATION. However, in a large, complex DTD with many references to unparsed entities (for example, a library-type application), it is advisable to put all the NOTATIONs in a separate section and use comments to mark the section.


NOTATIONs - Another Example


<!NOTATION jpeg SYSTEM "file:///c:/Program Files/Photoshop/photoshop.exe">
<!NOTATION gif SYSTEM "file:///c:/Program Files/Photoshop/photoshop.exe">
<!ELEMENT person (name, picture?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT picture EMPTY>
<!ATTLIST picture
    filename CDATA #REQUIRED
    picformat NOTATION (jpeg | gif) #REQUIRED>

Valid XML:
<person>
  <name>Kelly Brown</name>
  <picture filename="kbrown.jpg" picformat="jpeg"/>
</person>

Figure D-18. NOTATIONs - Another Example

Notes:
In the example, the notation-typed picformat attribute specifies "jpeg" as its value. An application can ask the XML parser for the helper application associated with the jpeg notation, and will get back the "photoshop.exe" value, which it can then use in processing the data. Again, the parser would perform no validation of the picture data itself; it simply passes on the attribute's value. It is up to the application to handle the information it has received. The NOTATION-typed attribute does not provide the actual unparsed data; it provides information on how to process the unparsed data.
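For contrast, here is a sketch of the same person document in which the picture data itself is named through an unparsed entity rather than a CDATA file name. The entity name kbrownPhoto and the attribute name source are assumptions made for illustration; the jpeg notation is the one from the chart.

```xml
<?xml version="1.0"?>
<!DOCTYPE person [
  <!NOTATION jpeg SYSTEM "file:///c:/Program Files/Photoshop/photoshop.exe">
  <!-- Unparsed entity: names the external file and ties it to a notation. -->
  <!ENTITY kbrownPhoto SYSTEM "kbrown.jpg" NDATA jpeg>
  <!ELEMENT person (name, picture?)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT picture EMPTY>
  <!-- An ENTITY-typed attribute takes the bare entity name, not a reference. -->
  <!ATTLIST picture source ENTITY #REQUIRED>
]>
<person>
  <name>Kelly Brown</name>
  <picture source="kbrownPhoto"/>
</person>
```

With this arrangement the application receives both the location of the file and the name of its notation from the entity declaration, so a separate picformat attribute is not needed.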


Welcome to:

Unit 7. XML Schema

Figure D-19. Unit 7. XML Schema

Notes:


1.1 Simple Type (simpleType) Definition (1 of 2)


A simple type definition is a set of constraints on strings and information about the values they encode, applicable to the normalized value of an attribute information item or of an element information item with no element children. Informally, it applies to the values of attributes and the text-only content of elements. Each simple type definition, whether built-in (that is, defined in [XML Schemas: Datatypes]) or user-defined, is a restriction of some particular simple base type definition. For the built-in primitive types, this is the simple version of the ur-type definition, whose name is anySimpleType. This is in turn understood to be a restriction of the ur-type definition. Simple types may also be defined whose members are lists of items themselves constrained by some other simple type definition, or whose membership is the union of the memberships of some other simple type definitions. List and union simple type definitions are also understood as restrictions of the simple ur-type definition.

Figure D-20. 1.1 Simple Type Definition (1 of 2)

Notes:
For additional information as a schema component, see http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/ Section 2.2.1.2 Simple Type Definition.
For detailed information on simple type definitions, see http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/ Simple Type Definitions (3.14) and [XML Schemas: Datatypes]. The latter also defines an extensive inventory of predefined simple types.
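The three ways of defining a simple type described on the chart (restriction, list, union) might look as follows in a schema. This is a sketch only: the type names ShirtSize, ShirtSizeList and SizeOrNumber are invented, and the schema has no target namespace so that the type references need no prefix.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Restriction: a string limited to three enumerated values. -->
  <xsd:simpleType name="ShirtSize">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="small"/>
      <xsd:enumeration value="medium"/>
      <xsd:enumeration value="large"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- List: whitespace-separated ShirtSize items, such as "small large". -->
  <xsd:simpleType name="ShirtSizeList">
    <xsd:list itemType="ShirtSize"/>
  </xsd:simpleType>
  <!-- Union: either a ShirtSize or a positive integer. -->
  <xsd:simpleType name="SizeOrNumber">
    <xsd:union memberTypes="ShirtSize xsd:positiveInteger"/>
  </xsd:simpleType>
</xsd:schema>
```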


1.1 Simple Type (simpleType) Definition (2 of 2)


This is the specification for a simpleType element:
<simpleType
  final = (#all | (list | union | restriction))
  id = ID
  name = NCName
  {any attributes with non-schema namespace . . .}>
  Content: (annotation?, (restriction | list | union))
</simpleType>

Figure D-21. 1.1 Simple Type Definition (2 of 2)

Notes:


Built-in versus User-derived Datatypes


Built-in datatypes are those which are defined in the specification
  Can be either primitive or derived
User-derived datatypes are those derived datatypes that are defined by individual schema designers.
Conceptually there is no difference between the built-in derived datatypes included in the specification and the user-derived datatypes which will be created by individual schema designers.
The built-in derived datatypes are those which are believed to be so common that if they were not defined in the specification many schema designers would end up reinventing them.
A built-in datatype need not be a built-in datatype in any programming language used to implement the specification.
Likewise, a user-derived datatype need not be a user-derived datatype in any programming language used to implement the specification.

Figure D-22. Built-in versus User-derived Datatypes

Notes:
Additional information is in http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/ Section 2.5.3 "Built-in versus user-derived datatypes."


Applicable Facets
Two categories of simple types from the XML perspective:
  Ordered
  Non-ordered
These facets apply to both:
  length, minLength, maxLength, pattern, enumeration, whiteSpace
The ordered simple types are
  byte, unsignedByte, integer, positiveInteger, negativeInteger, nonNegativeInteger, nonPositiveInteger, int, unsignedInt, long, unsignedLong, short, unsignedShort, decimal, float, double, time, dateTime, duration, date, gMonth, gYear, gYearMonth, gDay, gMonthDay
These facets apply only to ordered simple types:
  maxInclusive, maxExclusive, minInclusive, minExclusive, totalDigits, fractionDigits

Figure D-23. Applicable Facets

Notes:
You are advised to refer to the Specifications and Primer for examples of how to implement facets. Our examples are necessarily limited by time and space.
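As a small supplement to the Primer, here is a sketch of three user-derived types that use facets from the chart. The type names SerialNumber, Percentage and Price and all facet values are invented for illustration. Note that not every facet is available on every base type; the comments name the base types these particular facets are used with here.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- length and pattern constrain the lexical form of a string. -->
  <xsd:simpleType name="SerialNumber">
    <xsd:restriction base="xsd:string">
      <xsd:length value="6"/>
      <xsd:pattern value="e[0-9]{5}"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Range facets need an ordered base type such as xsd:integer. -->
  <xsd:simpleType name="Percentage">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="0"/>
      <xsd:maxInclusive value="100"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- totalDigits and fractionDigits are used here with xsd:decimal. -->
  <xsd:simpleType name="Price">
    <xsd:restriction base="xsd:decimal">
      <xsd:totalDigits value="7"/>
      <xsd:fractionDigits value="2"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

Under these definitions e00001 is a valid SerialNumber, 101 is not a valid Percentage, and 19.999 is not a valid Price.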


Element Only and Mixed Complex Types


Element Only
Declare using a model group or compositor
<xsd:complexType name="elementOnlyType">
  <xsd:sequence>
    <xsd:element name="firstName" type="xsd:string"/>
    <xsd:element name="lastName" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

Mixed
Add mixed="true" attribute on complexType element and then declare as with Element Only content model
<xsd:complexType name="mixedType" mixed="true">
  <xsd:sequence>
    <xsd:element name="firstName" type="xsd:string"/>
    <xsd:element name="lastName" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

Figure D-24. Element Only and Mixed Complex Types

Notes:
The Element Only content model can contain only elements as children. To declare this content model, use a model group or compositor as the child of the <complexType> element.
For Mixed content (elements plus character data), add the mixed="true" attribute on complexType and provide a model group/compositor. The Mixed model in Schema is a little different from mixed content in a DTD: the order and number of child elements matter.
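An instance fragment makes the last point concrete. It assumes, hypothetically, that the schema also declares an element named greeting with the type mixedType from the chart; that element declaration is not part of the original material.

```xml
<?xml version="1.0"?>
<!-- Assumed declaration: <xsd:element name="greeting" type="mixedType"/>
     Text may appear anywhere, but firstName must still come before
     lastName, and each must appear exactly once. -->
<greeting>Dear <firstName>Kelly</firstName> <lastName>Brown</lastName>, welcome aboard.</greeting>
```

Swapping the two child elements, or leaving one of them out, would make the instance invalid even though the surrounding text is unconstrained.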


1.2 Complex Type Definition (1 of 4)


A complex type definition is a set of attribute declarations and a content type, applicable to the [attributes] and [children] of an element information item respectively.
The content type may require the [children] to
  Contain neither element nor character information items (that is, to be empty),
  Be a string which belongs to a particular simple type, or
  Contain a sequence of element information items which conforms to a particular model group, with or without character information items as well.
Each complex type definition is either
  A restriction of a complex base type definition or
  An extension of a simple or complex base type definition or
  A restriction of the ur-type definition.

Figure D-25. 1.2 Complex Type Definition (1 of 4)

Notes:
For additional information as a schema component, see http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/ Section 2.2.1.3 Complex Type Definition.
For detailed information on complex type definitions, see http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/ Complex Type Definitions (3.4).


1.2 Complex Type Definition (2 of 4)


A complex type which extends another does so by having additional content model particles at the end of the other definition's content model, or by having additional attribute declarations, or both. Version 1.0 allows only appending, and not other kinds of extensions. This decision simplifies application processing required to cast instances from derived to base type. Future versions may allow more kinds of extension, requiring more complex transformations to effect casting.

Figure D-26. 1.2 Complex Type Definition (2 of 4)

Notes:
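Extension by appending might be sketched as follows. The child elements of Address (street, city) and the derived type UKAddress with its postcode element and exportCode attribute are assumptions made for illustration; they are not taken from the course files.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="Address">
    <xsd:sequence>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- Extension appends: postcode follows street and city, and one
       attribute is added. Nothing can be inserted before street. -->
  <xsd:complexType name="UKAddress">
    <xsd:complexContent>
      <xsd:extension base="Address">
        <xsd:sequence>
          <xsd:element name="postcode" type="xsd:string"/>
        </xsd:sequence>
        <xsd:attribute name="exportCode" type="xsd:positiveInteger" fixed="1"/>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```

Because the new content only ever follows the base content, an application that understands Address can read the leading part of a UKAddress without knowing about the extension.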


1.2 Complex Type Definition (3 of 4)


This is the specification for a complexType element:
<complexType
  abstract = boolean : false
  block = (#all | List of (extension | restriction))
  final = (#all | List of (extension | restriction))
  id = ID
  mixed = boolean : false
  name = NCName
  {any attributes with non-schema namespace . . .}>
  Content: (annotation?, (simpleContent | complexContent | ((group | all | choice | sequence)?, ((attribute | attributeGroup)*, anyAttribute?))))
</complexType>

If you use this form, you will use a name=. For example, in the po2.xsd file: <xsd:complexType name="Address">. In the file we added abstract="false", but from the above you can see that that is the default; it could be omitted.

Figure D-27. 1.2 Complex Type Definition (3 of 4)

Notes:
Any time you see language like "this is the specification. . ." it means that whatever immediately follows is directly from the W3C XML Schema specification. We used bolding to draw attention to the two-part nature of this specification: the top part, which ends with the > at the end of the 8th line, lists the attributes associated with this complexType. The remainder carries the content, which can include subelements, attributes, and the other choices shown.
When a complexType is declared abstract, it cannot be used directly in an instance document. If you change the value of abstract in <xsd:complexType name="Address" abstract="false"> to "true", you will receive an error, but not until you try to validate the associated instance document. And then the complaint will be about billTo: even though there is no element in the instance by the name of Address, its definition is used to define the contents of the billTo element.
Reverting the value to "false" and saving the xsd file does NOT clear the error in the .xml file: the error will not be cleared until you rerun the validator on the instance file (or make some insignificant change like adding, then removing, a blank space and then "save" the result).


1.2 Complex Type Definition (4 of 4)


Studio uses this pattern for an (anonymous) complexType element:
<element name=NCName>
  <complexType>
    Content: (annotation?, (simpleContent | complexContent | ((group | all | choice | sequence)?, ((attribute | attributeGroup)*, anyAttribute?))))
  </complexType>
</element>

The example shown is for "item" as defined in poBetterYet.xsd
<xsd:element name="item">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="productName" />
      ...
      <xsd:choice>
        <xsd:element ref="comment" />
        <xsd:element ref="shipDate" />
      </xsd:choice>
    </xsd:sequence>
    <xsd:attribute name="partNum" type="xsd:string" use="optional" />
  </xsd:complexType>
</xsd:element>

Figure D-28. 1.2 Complex Type Definition (4 of 4)

Notes:
Any reference to "Studio does this or that" indicates that whatever follows came from either the Help plug-in or the help built-in to the source editor(s). Recall that an NCName is a "non-colonized" name. That is, no :. A complexType that has no name= is known as an anonymous complex type definition.


Declaration Components
There are three kinds of declaration component:
  Element
  Attribute
  Notation
Each is described in a section that follows. Also included is a discussion of element substitution groups, a feature provided in conjunction with element declarations.
Recall from the chart that categorized schema components:
  Element and attribute declarations are considered primary schema components
  Notation is a secondary schema component

Figure D-29. Declaration Components

Notes:
This is covered in Section 2.2.2 in Part 1 of the Specification.


1.3 Element Declarations (1 of 2)


Element declarations provide for:
  Local validation of element information item values using a type definition;
  Specifying default or fixed values for element information items;
  Establishing uniqueness and reference constraint relationships among the values of related elements and attributes;
  Controlling the substitutability of elements through the mechanism of element substitution groups.
An element declaration is:
  An association of a name with a type definition (either simple or complex),
  An (optional) default value, and
  A (possibly empty) set of identity-constraint definitions.

Figure D-30. 1.3 Element Declarations (1 of 2)

Notes:
See Part 1 of the Specification, Section 2.2.2.1 (Element Declaration). For detailed information on element declarations, see Element Declarations (Section 3.3).


1.3 Element Declarations (2 of 2)


The association is either global or scoped to a containing complex type definition.
A top-level element declaration with name 'A' is broadly comparable to a pair of DTD declarations as follows, where the associated type definition fills in the ellipses:
<!ELEMENT A . . .>
<!ATTLIST A . . .>
Element declarations contribute to validation as part of model group validation, when their defaults and type components are checked against an element information item with a matching name and namespace, and by triggering identity-constraint definition validation.

Figure D-31. 1.3 Element Declarations (2 of 2)

Notes:
For detailed information on element declarations, see Element Declarations (3.3) in the specification.


1.3 Element Specification


This is the specification for an element element:
<element
  abstract = boolean : false
  block = (#all | List of (extension | restriction | substitution))
  default = string
  final = (#all | List of (extension | restriction))
  fixed = string
  form = (qualified | unqualified)
  id = ID
  maxOccurs = (nonNegativeInteger | unbounded) : 1
  minOccurs = nonNegativeInteger : 1
  name = NCName
  nillable = boolean : false
  ref = QName
  substitutionGroup = QName
  type = QName
  {any attributes with non-schema namespace . . .}>
  Content: (annotation?, ((simpleType | complexType)?, (unique | key | keyref)*))
</element>

How these information items are employed depends on where they occur in the schema . . .:
Figure D-32. 1.3 Element Specification

Notes:


Global Element (1 of 2)
A global element is an element that has been declared directly under schema (that is, the root element) and not as part of a complex type definition.
This is how Studio sees it. Notice that Studio uses two views to help us.

Figure D-33. Global Element (1 of 2)

Notes:
Once we have defined a global element we can refer to it (ref=) as often as necessary.


Global Element (2 of 2)
Additional information (and hints) can be obtained by using a combination of the hover help and opening a pick list.

Hover help: here it defines the term associated with the highlighted item.
Drop-down pick list: identifies the legal choices.

Figure D-34. Global Element (2 of 2)

Notes:


2.2 Identity-Constraint Definitions


Identity-constraint definition components provide for uniqueness and reference constraints with respect to the contents of multiple elements and attributes.
Example XML representations for the three kinds of identity-constraint definitions:
<xs:key name="fullName">
  <xs:selector xpath=".//person"/>
  <xs:field xpath="forename"/>
  <xs:field xpath="surname"/>
</xs:key>
<xs:keyref name="personRef" refer="fullName">
  <xs:selector xpath=".//personPointer"/>
  <xs:field xpath="@first"/>
  <xs:field xpath="@last"/>
</xs:keyref>
<xs:unique name="nearlyID">
  <xs:selector xpath=".//*"/>
  <xs:field xpath="@id"/>
</xs:unique>
Figure D-35. 2.2 Identity-Constraint Definitions

Notes:
Refer to http://www.w3.org/TR/xmlschema-1/#cIdentity-constraint_Definitions for additional details.
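The uniqueness part of the fullName key can be sketched outside a schema processor. The following Python fragment is only an illustration under stated assumptions: the sample document and the helper name duplicate_keys are invented here, ElementTree supports only a limited subset of XPath, and a real xs:key additionally requires every field to be present. It is not a schema validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document, invented for illustration only.
DOC = """
<staff>
  <person><forename>Ann</forename><surname>Lee</surname></person>
  <person><forename>Bob</forename><surname>Lee</surname></person>
  <person><forename>Ann</forename><surname>Lee</surname></person>
</staff>
"""

def duplicate_keys(root, selector, fields):
    """Return the key tuples that occur more than once among the selected elements."""
    seen = set()
    duplicates = []
    for element in root.findall(selector):
        # Build the composite key from the text of each field element.
        key = tuple(element.findtext(field) for field in fields)
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates

root = ET.fromstring(DOC)
# Same selector and fields as the fullName key on the slide.
print(duplicate_keys(root, ".//person", ["forename", "surname"]))
```

An empty list means the selected elements would satisfy the uniqueness requirement of the key; any tuple in the list is a combined forename/surname value that appears more than once.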


The Model Group Schema Component


The model group schema component has the following properties:
Schema Component: Model Group
  {compositor} One of all, choice or sequence.
  {particles} A list of particles.
  {annotation} Optional. An annotation.
{compositor} specifies a sequential (sequence), disjunctive (choice) or conjunctive (all) interpretation of the {particles}. This in turn determines whether the element information item [children] validated by the model group must:
  (sequence) correspond, in order, to the specified {particles};
  (choice) correspond to exactly one of the specified {particles};
  (all) contain all and only exactly zero or one of each element specified in {particles}. The elements can occur in any order. In this case, to reduce implementation complexity, {particles} is restricted to contain local and top-level element declarations only, with {min occurs}=0 or 1, {max occurs}=1.
When two or more particles contained directly or indirectly in the {particles} of a model group have identically named element declarations as their {term}, the type definitions of those declarations must be the same. By 'indirectly' is meant particles within the {particles} of a group which is itself the {term} of a directly contained particle, and so on recursively.
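The three compositor interpretations can be illustrated with a deliberately simplified Python sketch. The function name matches_group is invented for this illustration, and the sketch assumes a flat group in which every particle is a plain element name that must occur exactly once (minOccurs = maxOccurs = 1); nested groups, wildcards and other occurrence ranges are not handled.

```python
def matches_group(compositor, particles, children):
    """Check child element names against a flat model group.

    Simplifying assumption: each particle is an element name that
    must occur exactly once (minOccurs = maxOccurs = 1).
    """
    if compositor == "sequence":
        # Children must correspond, in order, to the particles.
        return children == particles
    if compositor == "choice":
        # Exactly one child, matching one of the particles.
        return len(children) == 1 and children[0] in particles
    if compositor == "all":
        # Every particle exactly once, in any order.
        return sorted(children) == sorted(particles)
    raise ValueError("unknown compositor: " + compositor)

# Element names borrowed from the "item" example in poBetterYet.xsd.
print(matches_group("sequence", ["productName", "quantity"], ["quantity", "productName"]))
print(matches_group("choice", ["comment", "shipDate"], ["shipDate"]))
print(matches_group("all", ["comment", "shipDate"], ["shipDate", "comment"]))
```

The first call fails because a sequence fixes the order; the other two succeed because choice accepts a single alternative and all ignores order.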
Figure D-36. The Model Group Schema Component

Notes:


3.3 Particles
As described in Model Groups, particles contribute to the definition of content models.
Example: XML representations which all involve particles, illustrating some of the possibilities for controlling occurrence:
<xsd:element ref="egg" minOccurs="12" maxOccurs="12"/>
<xsd:group ref="omelette" minOccurs="0"/>
<xsd:any maxOccurs="unbounded"/>

Particles are element declarations, wildcards and model groups.

Figure D-37. 3.3 Particles

Notes:


3.4 Wildcards
In order to exploit the full potential for extensibility offered by XML plus namespaces, more provision is needed than DTDs allow for targeted flexibility in content models and attribute declarations. A wildcard provides for validation of attribute and element information items dependent on their namespace name, but independently of their local name.
Example: XML representations of the four basic types of wildcard, plus one attribute wildcard:
<xsd:any processContents="skip"/>
<xsd:any namespace="##other" processContents="lax"/>
<xsd:any namespace="http://www.w3.org/1999/XSL/Transform"/>
<xsd:any namespace="##targetNamespace"/>
<xsd:anyAttribute namespace="http://www.w3.org/XML/1998/namespace"/>
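How the namespace attribute of a wildcard selects items can be sketched in a few lines of Python. This is a simplified reading, not the normative rule from the Specification: the function name wildcard_allows and the target namespace used in the calls are invented for illustration, None stands for "no namespace", and a missing namespace attribute is treated as ##any, which is its default.

```python
def wildcard_allows(namespace_attr, target_namespace, item_namespace):
    """Decide whether a wildcard's namespace attribute admits an item.

    item_namespace is None for an unqualified (no-namespace) item.
    """
    if namespace_attr == "##any":
        return True
    if namespace_attr == "##other":
        # Simplified reading: any namespace except the target namespace,
        # and not "no namespace".
        return item_namespace is not None and item_namespace != target_namespace
    # Otherwise the attribute is a whitespace-separated list of URIs,
    # which may include the tokens ##targetNamespace and ##local.
    allowed = set()
    for token in namespace_attr.split():
        if token == "##targetNamespace":
            allowed.add(target_namespace)
        elif token == "##local":
            allowed.add(None)
        else:
            allowed.add(token)
    return item_namespace in allowed

TARGET = "http://www.example.com/target"  # hypothetical target namespace
XSLT = "http://www.w3.org/1999/XSL/Transform"

print(wildcard_allows("##other", TARGET, XSLT))
print(wildcard_allows("##targetNamespace", TARGET, XSLT))
print(wildcard_allows(XSLT, TARGET, XSLT))
```

The processContents attribute (skip, lax, strict) is a separate question: it controls how much validation is applied to an item once the wildcard has admitted it.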

Figure D-38. 3.4 Wildcards

Notes:


Putting a Schema in a Namespace (1 of 2)


How do I indicate that a namespace is to be used with elements? The elementFormDefault attribute.
<xsd:schema xmlns="http://www.ibm.com/Schemas/WD03/target"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.ibm.com/Schemas/WD03/target"
            elementFormDefault="qualified">
  <xsd:element name="quantity" type="xsd:integer"/>
</xsd:schema>

Figure D-39. Putting a Schema in a Namespace (1 of 2)

Notes:
We also need to tell the schema processor that the default form for element names is that they be qualified. We do this using the elementFormDefault attribute.
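What "qualified" means for an element name in an instance document can be seen with a namespace-aware parser. The sketch below uses Python's standard ElementTree purely as an illustration; the helper name namespace_of is invented here, and the parser does no schema validation. It only shows that a qualified element carries the target namespace from the slide as part of its name, while an unqualified one does not.

```python
import xml.etree.ElementTree as ET

TARGET_NS = "http://www.ibm.com/Schemas/WD03/target"

# Instance in which a default namespace declaration qualifies the element.
qualified = ET.fromstring('<quantity xmlns="%s">5</quantity>' % TARGET_NS)
# Instance with no namespace at all.
unqualified = ET.fromstring("<quantity>5</quantity>")

def namespace_of(element):
    """Return the namespace URI of an element, or None if it has none."""
    # ElementTree stores qualified names as "{namespace-uri}local-name".
    if element.tag.startswith("{"):
        return element.tag[1:].split("}", 1)[0]
    return None

print(namespace_of(qualified) == TARGET_NS)
print(namespace_of(unqualified))
```

A schema processor working with the schema on the slide would accept only the first form of quantity, because the element is declared in the target namespace.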


Putting a Schema in a Namespace (2 of 2)


How do I indicate that a namespace is to be used with attributes? The attributeFormDefault attribute.
<xsd:schema xmlns="http://www.ibm.com/Schemas/WD03/target"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.ibm.com/Schemas/WD03/target"
            elementFormDefault="qualified"
            attributeFormDefault="unqualified">
  <xsd:element name="quantity" type="xsd:integer"/>
</xsd:schema>

By setting the attributeFormDefault to unqualified we are explicitly stating we recognize that attributes do not need to be namespace-qualified.

Figure D-40. Putting a Schema in a Namespace (2 of 2)

Notes:
We can also tell the schema processor whether attribute names must be qualified by default. We do this using the attributeFormDefault attribute. If we do nothing, attributes will not need to be qualified (unqualified is the default). If we set it to unqualified explicitly, we are indicating that we have thought about it and made this decision. You can test this in the lab exercise for this lecture.


Some Practical Examples (1 of 2)


The XML301 Lectures project folder contains several examples at various levels of complexity in the Namespaces folder of the Unit 7. XML Schema folder.
Here is one example where we imported two existing schemas into a third schema, Calendar.xsd:
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.ibm.com"
        xmlns:Calendar="http://www.ibm.com"
        xmlns:course="http://www.utoronto.ca"
        xmlns:Job="http://www.job.com">
  <import schemaLocation="MyCourse.xsd" namespace="http://www.utoronto.ca"></import>
  <import schemaLocation="Job.xsd" namespace="http://www.job.com"></import>
  <element name="appointment">
    <complexType>
      <choice minOccurs="0" maxOccurs="unbounded">
        <element ref="course:Schedule"></element>
        <element ref="Job:JobInfo"></element>
      </choice>
      <attribute name="startTime" type="string"></attribute>
      <attribute name="endTime" type="string"></attribute>
    </complexType>
  </element>
</schema>

Figure D-41. Some Practical Examples (1 of 2)

Notes:


Some Practical Examples (2 of 2)


Here is the corresponding instance file, Calendar.xml:
<?xml version="1.0" encoding="UTF-8"?>
<Calendar:appointment endTime="" startTime=""
    xmlns:Calendar="http://www.ibm.com"
    xmlns:Job="http://www.job.com"
    xmlns:course="http://www.utoronto.ca"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.ibm.com Calendar.xsd http://www.job.com Job.xsd http://www.utoronto.ca MyCourse.xsd">
  <course:Schedule>
    <course:course>
      <course:courseId>course:courseId</course:courseId>
      <course:description>course:description</course:description>
    </course:course>
    <course:location>course:location</course:location>
  </course:Schedule>
  <Job:JobInfo>
    <Job:description>Job:description</Job:description>
    <Job:salary>Job:salary</Job:salary>
  </Job:JobInfo>
</Calendar:appointment>
xsi:schemaLocation provides direction as to where to look for the associated schema.
  If the address is wrong you will receive a great many errors.
  The agreement of the file names is a good first place to check.
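When the schemaLocation hints are suspected to be wrong, it can help to list the namespace/location pairs the instance actually declares. The Python sketch below is one way to do that with the standard ElementTree parser; the helper name schema_locations is invented for this illustration, and the embedded document is a cut-down copy of Calendar.xml containing only the root element.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

def schema_locations(root):
    """Map each namespace URI to the schema location hinted for it."""
    # The attribute value is a whitespace-separated list of
    # namespace-URI / location pairs.
    tokens = root.get("{%s}schemaLocation" % XSI_NS, "").split()
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation must hold namespace/location pairs")
    return dict(zip(tokens[0::2], tokens[1::2]))

# Cut-down copy of the Calendar.xml root element.
DOC = """<Calendar:appointment
    xmlns:Calendar="http://www.ibm.com"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.ibm.com Calendar.xsd
                        http://www.job.com Job.xsd
                        http://www.utoronto.ca MyCourse.xsd"/>"""

for namespace, location in schema_locations(ET.fromstring(DOC)).items():
    print(namespace, "->", location)
```

Each printed file name can then be compared with the files actually present next to the instance document.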
Figure D-42. Some Practical Examples (2 of 2)

Notes:


What We Didn't Talk About


Character patterns in type definitions
  Using regular expressions to create sophisticated type definitions
    An ISBN type
Type extension
  When your type structure could use inheritance.
    "A seaplane is a kind of airplane"
    "A bus is a kind of vehicle"
Redefinition of imported types
  You want to use some, but not all of a set of imported types.
Nil Values
  When you need to represent the absence of something as a value itself.
  When migrating from a database with null values.
Uniqueness / Identity Constraints
  You need to specify that a set of things is unique.
  You need to make sure that a set of things is keyed to a second set of unique items.
Figure D-43. What We Didn't Talk About

Notes:
Here is a list of the major topics that we didn't have time to cover in this course.
Type derivation: XML Schema supports programming language-like inheritance.
Redefinition of imported types: When importing types from another namespace it is also possible to redefine or extend via inheritance the types being imported.
Nil Values: As we alluded to when we talked about the XML Schema Instance namespace, XML Schema supports a notion of nil (null) values.
Uniqueness / Identity Constraints: XML Schema allows the specification of uniqueness and identity constraints using a subset of the XPath specification.


In addition, we were unable to cover all the details of the features that we presented today.


Status
XMLSchema V1.0 - W3C Recommendation as of 5/2/2001
Three parts to specification:
  Part 0: Primer
  Part 1: Structures
  Part 2: Datatypes
Foundation for revisions of W3C recommendations:
  XSLT 2.0
  XPath 2.0
  XQuery
XMLSchema V1.1 - Work is underway to assemble requirements for this revision, which is intended to remain mostly compatible with 1.0, fix bugs and make minor improvements.
Visit http://www.w3.org/XML/Schema for more information.

Figure D-44. Status

Notes:
XML Schema was accepted as a W3C recommendation on 5/2/2001 (May 2, 2001). This means that the specification should be considered stable, and that implementors will begin producing schema processors compliant with the recommendation. This is the version that you should deploy or consider deploying. All previous working drafts, candidate recommendations, or proposed recommendations are superseded by the recommendation.
The XML Schema recommendation documents consist of three parts:
  Part 0: Primer (a tutorial introduction to the features of XML Schema).
  Part 1: Structures (covers everything except the specific simple types provided by XML Schema).
  Part 2: Datatypes (covers each of the simple types supported by the XML Schema recommendation).


Moreover, XML Schema will be a part of the foundation for the next generation of W3C technologies. XSLT 2.0, XPath 2.0, and XQuery are among the technologies that will use XML Schema as a foundation to build on.


Tooling (1 of 2)
Can use any text editor
  As long as the editor supports Unicode or the chosen encoding
IBM WebSphere Studio 4+ (all editions)
  Guided document editing for documents based on XMLSchemas (or DTDs)
  Generate an XMLSchema from a DTD
  Syntax-aware XMLSchema Editor
  Includes Xerces-J 1.4.2
Apache Software Foundation
  Xerces-J 1.4+ (REC)
University of Edinburgh
  XSV (REC)
Oracle
  Oracle XML Parser (CR)

Figure D-45. Tooling (1 of 2)

Notes:
As of (today) there is limited support for XML Schema in XML parser implementations. Here is a list of the implementations available.
Apache Software Foundation: Xerces-J 1.4+ (REC)
  This parser is written in Java and forms the basis for IBM's XML Parser for Java product. Aside from bugs, Xerces 1.4 supports the recommendation syntax for XML Schema.
University of Edinburgh: XSV (REC)
  This parser is written in Python and supports the Proposed Recommendation syntax for XML Schema.
Oracle: Oracle XML Parser (PR)
  This parser is written in Java and supports the proposed recommendation syntax for XML Schema.
Microsoft: MSXML 4.0 (PR)
  This parser is written in C/C++ and supports the proposed recommendation syntax for XML Schema.


Tooling (2 of 2)
Microsoft
  MSXML 4.0 (REC)
Free IBM alphaWorks tools to help you:
  XML Schema Quality Checker
  Visual DTD
    Editing environment with syntax-directed help.
    Found in the package called "Visual XML Tools"
  Data Descriptor by Example, to generate the Schema from sample XML
    Write sample XML that illustrates all the ways you'll use the data

Figure D-46. Tooling (2 of 2)

Notes:
Here are some other tools that may be helpful as you work with XML Schemas.


Welcome to:

Unit 9. XSL Transformations

Figure D-47. Unit 9 XSL Transformations

Notes:


Variables and Parameters


Variables and parameters are very useful.
Variables
  Use $varName to retrieve content
  <xsl:variable name="varName" select="expr"/>
  <xsl:variable name="varName">...</xsl:variable>
Parameters
  Mainly used to call named templates.
  <xsl:param name="parmName" select="expr"/>
  <xsl:param name="parmName">...</xsl:param>

Figure D-48. Variables and Parameters

Notes:
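Variables - used to declare a local or global variable in a stylesheet.
  <xsl:variable name="country" select="germany"/> sets country to the <germany> child element(s) of the current node, because an unquoted select value is evaluated as an XPath expression.
  <xsl:variable name="country" select="'germany'"/> sets country to the string 'germany', because the inner quotes make it a string literal.
Parameters - used to declare a parameter of the stylesheet (global) or of a template.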


Parameters Example
Input.xml
<Named>
  <AAA repeat="3"/>
  <BBB repeat="2"/>
  <CCC repeat="5"/>
</Named>

Transformation.xsl
<xsl:template match="/Named/*">
  <p>
    <xsl:call-template name="while">
      <xsl:with-param name="test">
        <xsl:value-of select="@repeat"/>
      </xsl:with-param>
    </xsl:call-template>
  </p>
</xsl:template>
<xsl:template name="while">
  <xsl:param name="test"/>
  <xsl:value-of select="name()"/>
  <xsl:text> </xsl:text>
  <xsl:if test="not($test = 1)">
    <xsl:call-template name="while">
      <xsl:with-param name="test">
        <xsl:value-of select="$test - 1"/>
      </xsl:with-param>
    </xsl:call-template>
  </xsl:if>
</xsl:template>

Output.html
<p>AAA AAA AAA </p>
<p>BBB BBB </p>
<p>CCC CCC CCC CCC CCC </p>
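For readers more familiar with a general-purpose language, the recursion in the named template "while" can be mirrored in Python. This is only an analogy, not XSLT: the function name repeat_name is invented here, and the list of (name, repeat) pairs stands in for the children of <Named>.

```python
def repeat_name(name, test):
    """Mirror of the named template "while".

    Emits the name followed by a space, then calls itself with
    test - 1 until test reaches 1, just as the template does.
    """
    if test == 1:
        return name + " "
    return name + " " + repeat_name(name, test - 1)

# Stand-in for the children of <Named> and their repeat attributes.
for name, repeat in [("AAA", 3), ("BBB", 2), ("CCC", 5)]:
    print("<p>" + repeat_name(name, repeat) + "</p>")
```

As in the stylesheet, the stopping test is "equal to 1", so a repeat value of 0 or less would recurse without ever reaching it; a production template would normally guard against that.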

Figure D-49. Parameters Example

Notes:


Variable Example
Input.xml
<list>
  <book ID="666">
    <chapter>First Chapter</chapter>
    <chapter>Second Chapter</chapter>
    <chapter>Third Chapter</chapter>
  </book>
</list>

Transformation.xsl
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version="1.0">
  <xsl:variable name="totalChapters">
    <xsl:value-of select="//chapter[last()]" />
  </xsl:variable>
  <xsl:template match="/">
    <xsl:value-of select="$totalChapters" />
  </xsl:template>
</xsl:stylesheet>

Output.html
Third Chapter
Figure D-50. Variable Example

Notes:


No Reassigning Variables
Variables cannot be reassigned.
  We want pure, side-effect-free functions which are not dependent upon order of execution.
Conditional initialization
  Want to assign a variable based on a condition.
<xsl:variable name="oddOrEven">
  <xsl:choose>
    <xsl:when test="current() mod 2 = 0">even</xsl:when>
    <xsl:otherwise>odd</xsl:otherwise>
  </xsl:choose>
</xsl:variable>
Often variables are used for doing two tasks at once.
  Divide the template into two templates.
  Example: calculate min and max of a list of nodes.
No counters or counted for-loops.
  Write a recursive template instead.

Figure D-51. No Reassigning Variables

Notes:
Functions/templates in XSLT are independent of each other and do not have any external dependencies. They can be called in any order against the same XML file and the same results will be obtained. Variable reassignment is not allowed, to avoid creating dependencies among functions. In a procedural programming language, you would calculate the min and max by creating 2 variables and looping through the nodes, reassigning the min and max based on tests against each node. In XSLT, you should use recursive templates instead of looping.
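The style being recommended, two separate recursive computations and no reassignment, can be sketched in Python as an analogy. The function names minimum and maximum are invented for this illustration, and a plain list of numbers stands in for the list of nodes; each name is bound exactly once per call, as an XSLT variable would be.

```python
def minimum(values):
    """Return the smallest item of a non-empty list, recursively."""
    head, rest = values[0], values[1:]
    if not rest:
        return head
    rest_min = minimum(rest)  # bound once, never reassigned
    return head if head < rest_min else rest_min

def maximum(values):
    """Return the largest item of a non-empty list, recursively."""
    head, rest = values[0], values[1:]
    if not rest:
        return head
    rest_max = maximum(rest)  # bound once, never reassigned
    return head if head > rest_max else rest_max

prices = [7, 3, 9, 4]  # hypothetical node values
print(minimum(prices), maximum(prices))
```

Each function does one task, which corresponds to dividing the work into two templates rather than tracking both results in a pair of mutable variables.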


Common Named Template Example (1 of 2)


You have an order page which is editable at times and read-only at other times.
Add a readOnly attribute to your "page" XML tag.
  Option 1: write 2 separate XSLT files.
  Option 2: have common templates with choose logic that evaluates the readOnly attribute and determines the appropriate output.
<xsl:template name="controlInput">
  <xsl:param name="controlName" />
  <xsl:param name="value" />
  <xsl:param name="isReadOnly"/>
  <!-- test whether element is readOnly -->
  <xsl:choose>
    <xsl:when test="$isReadOnly='true'">
      <!-- output text because is readOnly -->
      <xsl:value-of select="$value" />
    </xsl:when>
    <xsl:otherwise>
      <!-- output input field because is editable -->
      <input class="inputField" name="{$controlName}" size="20" value="{$value}" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
Figure D-52. Common Named Template Example (1 of 2)

Notes:
In this example, a common template has been developed for input fields. The template outputs the value passed in as a simple string if the isReadOnly parameter is true. If isReadOnly is false, an input field with the value shown is output. This example also has the controlName and the value being passed in as parameters. The values for size and class have been hardcoded, so this template is only good for cases where 20 is the correct size for the input field and the class is inputField. To make this template more flexible you would add parameters for class and size.
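The branching in controlInput can be mirrored in Python to make the two outcomes easy to compare. This is an analogy only: the function name control_input is invented here, the parameters correspond to the three xsl:param declarations, and html.escape stands in for the escaping an XSLT processor applies when it serializes output.

```python
from html import escape

def control_input(control_name, value, is_read_only):
    """Mirror of the controlInput named template.

    Returns plain text when is_read_only is the string "true",
    otherwise an input field; size and class are hardcoded as in
    the template.
    """
    if is_read_only == "true":
        return escape(value)
    return '<input class="inputField" name="%s" size="20" value="%s" />' % (
        escape(str(control_name)),
        escape(value),
    )

print(control_input(1, "Dr. Smith", "true"))
print(control_input(2, "Elton John", "false"))
```

Adding parameters for class and size, as suggested above, would correspond to two more function arguments with default values.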


Common Named Template Example (2 of 2)


Input.xml
<list>
  <author readOnly="true">Dr. Smith</author>
  <author readOnly="false">Elton John</author>
</list>

Transformation.xsl
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version="1.0">
  <xsl:include href="CommonTemplates.xsl" />
  <xsl:template match="/list/author">
    <xsl:call-template name="controlInput">
      <xsl:with-param name="controlName" select="position()"/>
      <xsl:with-param name="value" select="."/>
      <xsl:with-param name="isReadOnly" select="@readOnly"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>

CommonTemplates.xsl
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version="1.0">
  <xsl:template name="controlInput"> ... see previous page </xsl:template>
</xsl:stylesheet>

Output.html
Dr. Smith <input class="inputField" name="2" size="20" value="Elton John" />

Figure D-53. Common Named Template Example (2 of 2)

Notes:
The first author, Dr. Smith, is output as a text string since it is readOnly. The second author, Elton John, results in an input field since it is not readOnly. The common named template is kept in a separate file that can be included in other files, and the named templates can then be called by those other templates. Three parameter values are needed by the controlInput template: controlName, value, and isReadOnly. These parameter values are taken from the current author node. The controlName is taken from the position number, because we want it to be unique for each author node. The value is the value of the author element. The value of isReadOnly is taken from the readOnly attribute. You can easily call the same controlInput template for a different element in the XML as long as you can determine values for controlName, value, and isReadOnly.


XSL Is at Home on the Client


Diagram labels: query, Server, XML Stream, Style Sheet, Client, XSL, HTML Stream.
Browser support:
  Netscape 6.x and IE 6.0 have W3C-compliant XSL transformation capability
  IE 5.0 and 5.5 provide support only for the last W3C working draft
  The W3C has a browser/editor (Amaya) that supports rendering of XML via CSS, or via a hierarchical view

Figure D-54. XSL Is at Home on the Client

Notes:
Since we push XML directly to the client, the raw XML is available for further processing on the client (for example, export to a spreadsheet or local database).
To make IE 5.x compliant: You need to download the latest MSXML3. As long as your XML file refers to the appropriate XSLT stylesheet, it should render in IE5. If you are running a side-by-side installation of MSXML, IE will use the old XSLT processor. To unregister the old processor and tell IE to use the new one, type the following four commands at a command prompt:
regsvr32 msxml3.dll
xmlinst
Note: In the final release, xmlinst.exe is a separate download from MSDN, and does not come with the MSXML download. You can download xmlinst.exe here.


More information about installing for replace mode is found in the online documentation that comes with the updated parsers. You may also wish to download the "Internet Explorer Tools for Validating XML and Viewing XSLT Output". IE5 does not validate XML documents and does not let you view the source of the XSL stylesheet. The following link points to an update that validates the XML document and enables viewing the XSL stylesheet:
http://msdn.microsoft.com/code/default.asp?url=/code/sample.asp?url=/msdn-files/027/000/543/msdncompositedoc.xml
Future versions of IE (V6) and Mozilla are supposed to have XSLT support.


And at Work on the Server

Diagram labels: Client, HTML Stream, Middle-tier Server (XSL, Style Sheet, SQL Translator), SQL, Legacy Data Store.
Middle-tier server software: Xerces/Xalan (Apache), Xerces/Jigsaw (Apache/W3C)

Figure D-55. And at Work on the Server

Notes:
XSLT Processor running on the Web Server.


Appendix E. Bibliography and References


Framework Reference Listing
Web Sites
  Framework Information: www.software.ibm.com/ebusiness
  Gartner Group NCF Review: www.gartner.com/webletter/ibminternet/default2.html
  e-business: www.ibm.com/e-business
  more e-business: www.software.ibm.com/ebusiness

Books
  Network Computing Framework Component Guide, IBM Redbook, SG24-2119

Products That Map to the Framework


Clients
Web Sites Lotus Home Page Lotus Notes Netscape Communicator Microsoft Internet Explorer IBM Network Station www.lotus.com Lotus Notes home.netscape.com/browsers/index.html www.microsoft.com/ie www.pc.ibm.com/networkstation

Books
Lotus Notes Release 4.5: A Developer's Handbook IBM Redbook
Appendix E. Bibliography and References E-1

SG24-4876

Copyright IBM Corp. 2001, 2004

Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

Student Notebook

SR23-7804 SR23-8207 SG24-2127 SG24-2016

Lotus Notes and Domino Server 4.5 Unleashed Lotus Notes and Domino Server 4.6 Unleashed IBM Network Station Guide for Windows NT IBM Redbook RS/6000 - IBM Network Station Guide A Companion Guide IBM Redbook

Servers Web Sites


IBM WebSphere Application Server Apache Domino Mail Domino How to Upgrade from Domino Go Webserver to Domino White Paper Domino.Doc Domino Intranet Starter Pack IBM WebSphere Complementary Products

www.software.ibm.com/webservers www.apache.org www.lotus.com/home.nsf/tabs/dms www.lotus.com/home.nsf/rightframe/1domino www.networking.ibm.com/eli/eliwpupg.html www.lotus.com/dominodoc www.lotus.com/home.nsf/tabs/disp www.software.ibm.com/webservers/other.html

Books
The Lotus Domino server (OS/390) IBM Redbook

SG24-2083

E-2

Introduction to XML

Copyright IBM Corp. 2001, 2004


Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

V3.1.0.1
Student Notebook

AP

Development Tools Web Sites


IBM Visual Age Developer's Domain IBM VisualAge for Java Gamelan's Java Directory NetObjects Fusion Domino.Action Lotus eSuite IBM WebSphere Studio IBM AlphaWorks Java API Documentation JavaScript Documentation www7.software.ibm.com/vad/.nsf www.software.ibm.com/ad/vajava www.developer.com/directories www.netobjects.com www.net.lotus.com/action4/action.nsf esuite.lotus.com www.software.ibm.com/webservers/studio/in dex.html www.alphaWorks.ibm.com/Home java.sun.com/products/jdk/1.1/docs/index.ht ml devedge.netscape.com/docs/manuals/js/clie nt/jsref

Books
Developing Web Applications Using Lotus Notes Designer for Domino 4.6 IBM Redbook Programming with VisualAge for Java 1.0 IBM Redbook VisualAge for Java Enterprise Version2: Data Access Beans - Servlets - CICS Connector IBM Redbook

SG24-2183 SG24-2232 SG24-5265

Enterprise Connectors Web Sites


IBM Connectors page Lotus ETMLinks
Copyright IBM Corp. 2001, 2004

www.software.ibm.com/webservers/connectors www.software.ibm.com/ts/lotus_connections
Appendix E. Bibliography and References E-3

Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

Student Notebook

Lotus Enterprise Integrator

www.eicentral.lotus.com www.edge.lotus.com www.software.ibm.com/ts/mqseries

MQSeries MQSeries for Windows NT MQSeries Security White Paper Net.Data eNetwork Host On-Demand Gateway CICS GW for Java & CICS Client IMS IMS Web DCE Encina Lightweight Client Domino.Connect/Notes Pump

www.software.ibm.com/ts/mqseries www.software.ibm.com/ts/mqseries/platforms/nt www.software.ibm.com/ts/mqseries/txppacs/ms06.html www.software.ibm.com/data/net.data www.software.ibm.com/enetwork/hostondemand CICS Internet www.software.ibm.com/webservers/connectors www.software.ibm.com/webservers/connectors www.software.ibm.com/data/ims/whatsnew.html www.software.ibm.com/data/ims/imsweb.html www.transarc.com/Product/TXSeries/DELight2.0/index 11.html www.edge.lotus.com/eibu_knowbase.nsf

Books
Web Gateway Tools: Connecting IBM & Lotus Applications to the Web Summary at www.ibm.com/technology/books/webgate Internet Application Development with MQSeries and Java IBM Redbook MQSeries Security: Example of Using a Channel Security Exit, Encryption and Decryption IBM Redbook

SR23-7862

SG24-4896

SG24-5306-00

E-4

Introduction to XML

Copyright IBM Corp. 2001, 2004


Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

V3.1.0.1
Student Notebook

AP SG24-5241 SG24-5277-00 SG24-2534-01 SG24-4547 SG24-2220 SG24-2217 SR23-7604 SG24-4918

Developing Distributed Transaction Applications with Encina IBM Redbook Revealed! CICS Transaction Gateway with More CICS Clients Unmasked IBM Redbook CICS Clients Unmasked IBM Redbook Accessing CICS Business Applications from the World Wide Web IBM Redbook Connecting IMS to the World Wide Web: A Practical Guide to IMS Connectivity IBM Redbook Lotus Solutions for the Enterprise IBM Redbook 60 Minute Guide to LotusScript 3 Lotus Solutions for the Enterprise, Volume 2, Using DB2 in a Domino Environment IBM Redbook Lotus Solutions for the Enterprise, Volume 1-5 Enterprise Integration with Domino for S/390 IBM Redbook

RDB Connectivity Web Sites


IBM DB2 Universal Database Home Page DB2 UDB - Fact Sheet

www.software.ibm.com/data/db2/udb www.software.ibm.com/data/db2/udb/abo ut.html

eCommerce Web Sites


Net.Commerce www.software.ibm.com/commerce/net.commerce

Copyright IBM Corp. 2001, 2004

Appendix E. Bibliography and References

E-5

Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

Student Notebook

Books
Integrating Net.Commerce with Legacy Applications IBM Redbook

SG24-4933

Systems Management Web Sites


Tivoli Tivoli Enterprise www.tivoli.com www.tivoli.com/o_products/html/body_products. html

Books
Managing Access from Desktop to Datacenter: Introducing TME IBM Redbook Measuring Lotus Notes Response Times with Tivoli's ARM Agents IBM Redbook Managing a Notes Environment w/ TME 10 Module for Notes IBM Redbook TME 10 Deployment Cookbook: Inventory IBM Redbook

SG24-2021 SG24-4787 SG24-2104 SG24-2120

Framework Infrastructure Java Web Sites


IBM Java Home Page IBM Daily Grounds Page Sun Java Home Page www.ibm.com/java/home.html javausers.ihost.com/dailygrounds java.sun.com

E-6

Introduction to XML

Copyright IBM Corp. 2001, 2004


Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

V3.1.0.1
Student Notebook

AP

Java Shareware Java Applets (Gamelan) Lotus eSuite CORBA Sun JavaBeans Page Sun JavaBeans Directory Enterprise JavaBeans San Francisco What IBM Is Doing With Java IBM WebSphere Application Server

www.javashareware.com www.gamelan.com esuite.lotus.com www.omg.org java.sun.com/beans java.sun.com/beans/directory www.javasoft.com/marketing/enterprise/index.ht ml www.ibm.com/java/sanfrancisco www.ibm.com/Java/assistance/ibm-java.html www.software.ibm.com/webservers/appserv

Discussions and Forums


IBM Java Team Digital Espresso (weekly summary of comp.lang.java) [email protected] www.mentorsoft.com/DE

Books and Magazines


Addison Wesley Java page (major Java publisher) Prentice Hall page (major Java publisher) SR23-8023 SR23-8064 SR23-7394-01 SR23-8018

www.awl.com/cseng/javaseries www.prenhall.com Not Just Java, Peter Van der Linden Teach Yourself Java 1.1 in 21 Days, 2nd Edition, Laura Lemay & Charles L. Perkins Java in a Nutshell, 2nd Edition Java Fundamental Classes Reference

Copyright IBM Corp. 2001, 2004

Appendix E. Bibliography and References

E-7

Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

Student Notebook

SR23-7895 SR23-7787 SG24-2247 SG24-2109 SG24-2216 SG24-7006 SG24-2022 JavaWorld magazine online Web Review Articles and Tutorials on Servlets

JavaBeans for Dummies Client/Server Programming with Java and CORBA, Robert Orfali and Dan Harkey From Client/Server to Network Computing, A Migration to Java Java Network Security Creating Java Applications Using NetRexx Cooking with Beans in the Enterprise Component Broker Connector Overview www.javaworld.com webreview.com/97/10/10/feature/index.html

Security/Directory/Performance Web Sites


IBM WebSphere Performance Pack IBM Solutions for ISPs SecureWay IBM Firewall Cryptolopes SET Registry CommercePoint Internet Scale DFS WebTraffic Express www.software.ibm.com/webservers/perfpack/index .html www.ibm.com/isp www.ibm.com/security www.software.ibm.com/enetwork/firewall www.software.ibm.com/security/cryptolope www.software.ibm.com/commerce/payment www.software.ibm.com/commerce/registry www.software.ibm.com/commerce www.transarc.com/Product/EFS/DFS/InterScale.ht ml www.transarc.com/dfs/public www.ics.raleigh.ibm.com/WebTraffic Express

E-8

Introduction to XML

Copyright IBM Corp. 2001, 2004


Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

V3.1.0.1
Student Notebook

AP

eNetwork Dispatcher eNetwork Directory LDAP Roadmap

www.software.ibm.com/enetwork/dispatcher www.software.ibm.com/enetwork/directory www.kingsmountain.com/ldapRoadmap.shtml

Books
SR28-5685 SG24-4978 SG24-4993 SG24-5220 SG24-5233 Web Proxy Servers Managing AFS: Internet Firewalls and Network Security, 2nd edition Secure Electronic Transactions: Credit Card Payments on the Web in Theory and Practice IBM Interactive Network Dispatcher: Load-Balancing Internet Servers Internet Security in the Network Computing Framework IBM WebSphere Performance Pack Usage and Administration Ari Luotonen Prentice Hall The Andrew File System, Richard Campbell, IBM Web Traffic Express for Multiplatforms User's Guide

Related Topics/Technologies IBM Open Blueprint and Other Architectures Web Sites
IBM Open Blueprint Oracle NCA Microsoft DNA www.software.ibm.com/openblue www.oracle.com/nca www.microsoft.com/dna/overview.asp

Copyright IBM Corp. 2001, 2004

Appendix E. Bibliography and References

E-9

Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

Student Notebook

HTML/CGI Web Sites


World Wide Web Consortium HTML Specifications HTML special characters CGI Tutorial IBM XML Page

www.w3.org www.utoronto.ca/webdocs/HTMLdocs www.lightsphere.com/dev/class www.software.ibm.com/xml

Books
SR23-7816 SR23-7979 SR23-7711 HTML Sourcebook, 3rd Edition Platinum Edition Using HTML Java and CGI - Eric Ladd and Jim O'Donnell HTML 3.2 & CGI Unleashed

Security Web Sites


International Computer Security Association (ICSA)

www.icsa.net

Other Vendors

Web Sites
Adaptivity Apache (freeware) www.adaptivity.com www.apache.org www.apacheweek.com

E-10 Introduction to XML

Copyright IBM Corp. 2001, 2004


Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

V3.1.0.1
Student Notebook

AP

BEA (Tuxedo) Borland/Inprise Gemstone Kiva (Netscape Application Server) NCR Net Dynamics Novell Novera Persistence SilverStream SuperCede Sybase WebLogic

www.beasys.com www.inprise.com www.gemstone.com www.netscape.com/appserver/v2.1/index. html www.ncr.com www.netdynamics.com www.novell.com www.novera.com www.persistence.com www.silverstream.com www.supercede.com www.sybase.com www.weblogic.com

Miscellaneous

Web Sites
www.developer.ibm.com/welcome/java/javamap.ht ml www.networking.ibm.com

IBM Solution Developer Program Networking Technologies

Miscellaneous Publications

Web Sites
IBM Redbooks InfoWorld magazine
Copyright IBM Corp. 2001, 2004

www.redbooks.ibm.com www.infoworld.com
Appendix E. Bibliography and References E-11

Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

Student Notebook

Information Week magazine Network World magazine Web Week magazine

www.informationweek.com www.nwfusion.com www.internetworld.com

Internal Pages

e-business/Framework

Web Sites
IBM e-business IBM e-business Insider IBM's e-business Strategy Internet Division's Information Center NCF SWGTechnology Page IBM IT Solution Architect 5 Minute University for NCF w3.ibm.com/e-business w3.software.ibm.com/ebusiness w3.strategy.ibm.com w3.nc.ibm.com ncf.austin.ibm.com/swgtc w3.ncs.ibm.com/cspaper.nsf - choose Topic, choose Architecture Briefs, choose Network Computing Framework

Java
Web Sites IBM Centre for Java Technology Development Java Information Hub The Java Special Interest Group VisualAge for Java ncc.hursley.ibm.com/javainfo/hurindex.html w3.java.ibm.com/Index.html w3.hursley.ibm.com/java/sig ncc.hursley.ibm.com/javainfo/vajava

E-12 Introduction to XML

Copyright IBM Corp. 2001, 2004


Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

V3.1.0.1
Student Notebook

AP

Enterprise Connectors
Web Sites MQ Flashes Hursley Demo page V06DBL02.hursley.ibm.com/m_dir/MQFlash.nsf tsdemoteam.hursley.ibm.com

Miscellaneous

Web Sites
Competitive Information Solution Developer Marketing Cross Platform Integration Test team (CPIT) ITSO Site gdlncntr.endicott.ibm.com/nclibrary/microsft.nsf w3sdo.austin.ibm.com/depts/ssqa/ncteam/ncvirtual. html w3.ncs.ibm.com/cpit/cpithome.nsf w3.itso.ibm.com

Customer Reference Information Web Sites


Internal Reference Web Site

w3.ibm.com/e-business

Books
(Available on the internal ITSO site w3.itso.ibm.com)

Books
GG24-3376-05 TCP/IP Tutorial and Technical Overview

Copyright IBM Corp. 2001, 2004

Appendix E. Bibliography and References

E-13

Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

Student Notebook

ZZ81-0475-00

An Approach to Designing e-business Solutions IBM WebSphere Performance Pack Usage and Administration

E-14 Introduction to XML

Copyright IBM Corp. 2001, 2004


Course materials may not be reproduced in whole or in part without the prior written permission of IBM.

V3.1.0.1
Student Notebook

AP

Appendix F. Acronyms and Abbreviations


A
ACL - Access Control List
ADC - Advanced Data Connector
ADO - ActiveX Data Objects
AFC - Application Foundation Classes
AH - Authentication Header
API - Application Programming Interface
APPC - Advanced Program-to-Program Communication
AS - Application Systems
ASP - Active Server Pages
ATM - Asynchronous Transfer Mode
AWT - Abstract Window Toolkit

B
B2B - Business to Business
B2C - Business to Consumer
BO - Business Object

C
C & S - Calendar and Schedule
CA - Certification Authority
CAE - Client Application Enabler
CARP - Cache Array Routing Protocol
CB - Component Broker
CCF - Common Connector Framework
CDF - Channel Definition Format
CDK - Component Development Kit
CDSA - Common Data Security Architecture
CGI - Common Gateway Interface
CICS - Customer Information Control System
CIG - CICS Internet Gateway
CIO - Chief Information Officer
CMC - Communications Management Configuration
COBOL - Common Business Oriented Language
CORBA - Common Object Request Broker Architecture
CRLF - Carriage Return / Line Feed
CSR - Customer Service Representative
CSS - Cascading Style Sheets

D
DAO - Data Access Objects
DAP - Directory Access Protocol
DBCS - Double Byte Character Set
DB2 - Database 2
DBMS - Database Management System
DCE - Distributed Computing Environment
DCOM - Distributed Component Object Model
DDM - Device Descriptor Module
DECS - Domino Enterprise Connection Services
DHTML - Dynamic Hypertext Markup Language
DII - Dynamic Invocation Interface
DIT - Directory Information Tree
DN - Distinguished Name
DNA - Microsoft Distributed interNet Applications Architecture
DNS - Domain Name System
DO - Data Object
DPL - Distributed Program Link
DRDA - Distributed Relational Database Architecture
DRP - Distribution and Replication Protocol
DSI - Dynamic Skeleton Interface


E
E2E - End-to-End
ECI - External Call Interface
EJB - Enterprise JavaBeans
EJS - Enterprise JavaBeans Server
eND - eNetwork Dispatcher
EPI - External Presentation Interface
ESP - Encapsulating Security Payload
EXCI - External CICS Interface

F
FAT - File Allocation Table
FIN - TCP flag used for connection termination
Framework - Application Framework for e-business
FTP - File Transfer Protocol
FW - Firewall

G
Gbyte - Gigabyte
GIF - Graphics Interchange Format
GIOP - General Inter-ORB Protocol
GSO - Global Sign-On
GUI - Graphical User Interface
GW - Gateway
GWAPI - Go Webserver Application Programming Interface

H
HACMP - High-Availability Cluster Multi-Processing
HP - Hewlett-Packard
HPFS - High Performance File System
HTML - Hypertext Markup Language

I
ICAPI - Internet Connection API
ICSS - Internet Connection Secure Server
IDE - Integrated Development Environment
IDL - Interface Definition Language
IE - Internet Explorer
IETF - Internet Engineering Task Force
IIOP - Internet Inter-ORB Protocol
IMAP - Internet Mail (or Message) Access Protocol
IMS - Information Management System
IP - Internet Protocol
ISAPI - Internet Server API
ISC - Intersystem Communication
ISP - Internet Service Provider
I/T - Information Technology

J
JAR - Java Archive
JDBC - trademark, often referred to as "Java Database Connectivity"
JDK - Java Development Kit
JFC - Java Foundation Classes
JIT - Just In Time
JNDI - Java Naming and Directory Interface
JNI - Java Native Interface
JPEG - Joint Photographic Experts Group
JRE - Java Runtime Environment
JSP - JavaServer Pages
JSQL - Java Structured Query Language
JVM - Java Virtual Machine

L
LDAP - Lightweight Directory Access Protocol
LEI - Lotus Enterprise Integrator
LS:DO - LotusScript Data Object
LSX - LotusScript Extensions
LUM - Logical Unit Manager
LUW - Logical Unit of Work

M
MAPI - Messaging API
MATM - MQSeries Link Application Transaction Map
MB - Megabyte
MCF - Meta Content Framework
MFS - Message Format Services
MHz - Megahertz
MIME - Multipurpose Internet Mail Extensions
MOM - Message-Oriented Middleware
MPR - Message Processing Region
MQ - Message Queue
MQEI - Message Queue Enterprise Integrator
MQI - Message Queue Interface
MQIIH - MQ IMS Information Header
MS - Microsoft
MSMQ - Microsoft Message Queue
MTA - Message Transfer Agent
MVS - Multiple Virtual Storage
MW - Middleware

N
NC - Network Computer or Network Computing
NCF - Network Computing Framework
NCI - Network Communications Interface
NCSA - National Computer Security Association
NDS - Novell Directory Services
NIS - Network Information Services
NNTP - Network News Transfer Protocol
NOF - NetObjects Fusion
NSAPI - Netscape Server API
NSF - Notes database file extension
NSTP - Notification Service Transfer Protocol
NT - Windows NT (New Technology)
NTFS - NT File System

O
OCX - OLE Custom Control
ODBC - Open Database Connectivity
OLTP - On-line Transaction Processing
OMG - Object Management Group
OO - Object-Oriented
ORB - Object Request Broker
OSC - Open System Center
OTMA - Open Transaction Manager Access

P
P&P - Policies and Procedures (security document)
PCMCIA - Personal Computer Memory Card International Association
PD - Problem Determination
PERL - Practical Extraction and Report Language
PICS - Platform for Internet Content Selection
PKI - Public Key Infrastructure
PKIX - Public Key Infrastructure (X.509)
POP - Post Office Protocol

R
RACF - Resource Access Control Facility
RAD - Rapid Application Development
RDB - Relational Database
RDBMS - Relational Database Management System
RDN - Relative Distinguished Name
RDO - Remote Data Objects
RFC - Remote Function Call
RMI - Remote Method Invocation
RMIC - Remote Method Invocation Compiler
RPC - Remote Procedure Call
RSA - Rivest-Shamir-Adleman algorithm

S
SASL - Simple Authentication and Security Layer
SET - Secure Electronic Transaction
SHTTP - Secure Hypertext Transfer Protocol
SMP - Symmetric Multiprocessors
SMTP - Simple Mail Transfer Protocol
SNA - Systems Network Architecture
SPI - Service Provider Interface
SQL - Structured Query Language
SSL - Secure Sockets Layer
SYN - TCP flag used for connection setup

T
TCP - Transmission Control Protocol
TCP/IP - Transmission Control Protocol / Internet Protocol
Telnet - U.S. Dept. of Defense virtual terminal protocol
TLS - Transport Layer Security
TME - Tivoli Management Environment
TP - Transaction Processor
TR - Token-Ring
T-RPC - Transactional Remote Procedure Call
TXSeries - Transaction Series

U
UA - User Agent
URI - Uniform Resource Identifier
URL - Uniform Resource Locator

V
VA - VisualAge
VAJ - VisualAge for Java
VB - Visual Basic
VIM - Vendor Independent Messaging
VM - Virtual Machine
VPN - Virtual Private Network
VTAM - Virtual Telecommunications Access Method

W
WAS - WebSphere Application Server
WDS - WebSphere Development Studio
WTE - Web Traffic Express
WYSIWYG - What You See Is What You Get
W3C - World Wide Web Consortium

X
XCF - Cross-system Coupling Facility
XML - Extensible Markup Language


Appendix G. Glossary
A

abstract class
A class that provides common information for subclasses and cannot itself be instantiated. An abstract class typically declares at least one abstract method.

abstract method
A method with a signature but no implementation. You provide the implementation in a subclass of the abstract class that contains the abstract method.

Abstract Window Toolkit (AWT)
An API that provides a layer between the application and the host's windowing system. It enables programmers to port Java applications from one window system to another. The AWT provides access to basic interface components such as events, colors, fonts, and controls such as buttons, scroll bars, text fields, frames, windows, dialogs, panels, canvases, and check boxes.

Access Builder for SAP R/3
A VisualAge for Java toolkit for developing Java beans, Java applications, or Java applets that access SAP business objects. The Access Builder for SAP R/3 consists of R/3 Access Classes, Business Object Repository Access Classes, Logon Java beans, and the Access Builder tool.

access level
In VisualAge Developer Domain, the level at which you connect to the Web site. The following access levels are provided:
- Registration
- Subscription for Java
- Subscription for Java, CD-ROM version
- Enterprise Download Components

actual parameter list
Parameters specified in a call to a method. See also formal parameter list.

AFC
See Application Foundation Classes.

API
See application programming interface.

applet
A Java program designed to run within a Web browser. Contrast with application.


application
In Java programming, a self-contained, stand-alone Java program that includes a static main method. It does not require an applet viewer. Contrast with applet.

Application Foundation Classes (AFCs)
Microsoft's version of the Java Foundation Classes (JFCs). AFCs deliver similar functions to JFCs but only work on Windows 32-bit platforms.

application programming interface (API)
A software interface that enables applications to communicate with each other. An API is the set of programming language constructs or statements that can be coded in an application program to obtain the specific functions and services provided by an underlying operating system or service program.

ASCII
American Standard Code for Information Interchange. A standard assignment of 7-bit numeric codes to characters. See also Unicode.

AWT
See Abstract Window Toolkit.
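The entries for abstract class, abstract method, actual parameter list, and application can be illustrated together. The following is a minimal sketch; the names ShapeDemo, Shape, and Rectangle are hypothetical and chosen only for illustration.

```java
// Application: a stand-alone program with a static main method.
final class ShapeDemo {
    public static void main(String[] args) {
        // "new Shape(...)" would not compile, because Shape is abstract.
        Shape shape = new Rectangle(2.0, 3.0);   // actual parameter list: 2.0, 3.0
        System.out.println(shape.describe());    // prints "rectangle with area 6.0"
    }
}

// Abstract class: holds what all shapes share, but cannot be instantiated itself.
abstract class Shape {
    private final String name;

    protected Shape(String name) {
        this.name = name;
    }

    // Abstract method: a signature with no implementation.
    abstract double area();

    // Concrete method that relies on the subclass's implementation of area().
    String describe() {
        return name + " with area " + area();
    }
}

// Concrete subclass: supplies the implementation of the abstract method.
final class Rectangle extends Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        super("rectangle");
        this.width = width;
        this.height = height;
    }

    @Override
    double area() {
        return width * height;
    }
}
```

The call to describe() is written once in the abstract class, yet it reports the area computed by whichever subclass the object actually belongs to.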

B

base type
In Java, a type that establishes an interface to anything inherited from itself. See type, derived type.

bean
A definition or instance of a JavaBeans component. See JavaBeans.

BeanInfo
(1) A Java class that provides explicit information about the properties, events, and methods of a bean class. (2) In the VisualAge for Java Integrated Development Environment, a page in the class browser that provides bean information.

browser
(1) In VisualAge for Java, a window that provides information on program elements. There are browsers for projects, packages, classes, methods, and interfaces. (2) An Internet-based tool that lets users browse Web sites.

business object
(1) An object that represents a business function. Business objects contain attributes that define the state of the object, and methods that define the behavior of the object. A business object also has relationships with other business objects. Business objects can be used in combination to perform a desired task. Typical examples of business objects are Customer, Invoice, or Account. (2) In the Enterprise Access Builder, a class that implements the IBusinessObject interface.

bytecode
Machine-independent code generated by the Java compiler and executed by the Java interpreter.

C

casting
Explicitly converting an object or primitive's data type.

C++ Access Builder
A VisualAge for Java, Enterprise Edition tool that generates beans and C++ wrappers that let your Java programs access C++ DLLs.

CD subscription
See Subscription for Java, CD-ROM version.

CICS Client
A server program that processes CICS ECI calls, forwarding transaction requests to a CICS program running on a host.

CICS ECI
An API that provides C and C++ programs with procedural access to transactions.

CICS Gateway for Java
A server program that processes Java ECI calls and forwards CICS ECI calls to the CICS Client.

class
An encapsulated collection of data and methods to operate on the data. A class may be instantiated to produce an object that is an instance of the class.

class hierarchy
The relationships between classes that share a single inheritance. All Java classes inherit from the Object class.

class method
Methods that apply to the class as a whole rather than its instances (also called a static method).


class path
When running a program in VisualAge for Java, a list of directories and JAR files that contain resource files or Java classes that a program can load dynamically at run time. A program's class path is set in its Properties notebook.

CLASSPATH
In your deployment environment, the environment variable keyword that specifies the directories in which to look for class and resource files.

class variable
Variables that apply to the class as a whole rather than its instances (also called a static field).

client
A networked computer in which the IDE is connected to a repository on a team server.

codebase
An attribute of the <APPLET> tag that provides the relative pathname for the classes. Use this attribute when your class files reside in a different directory than your HTML files.

Common Connector Framework
In the Enterprise Access Builder, interface and class definitions that provide a consistent means of interacting with enterprise resources (for example, CICS and Encina transactions) from any Java execution environment.

Common Object Request Broker Architecture (CORBA)
A specification produced by the Object Management Group (OMG) that presents standards for various types of object request brokers (such as client-resident ORBs, server-based ORBs, system-based ORBs, and library-based ORBs). Implementation of CORBA standards enables object request brokers from different software vendors to interoperate.

Common RFC Interface for Java
A set of Java interfaces and classes that defines a middleware-independent layer to access R/3 systems from Java. If applications are built on top of this interface, they can leverage different middleware at run time without recoding. The generated beans are based on this interface and provide the same flexibility.

component model
An architecture and an API that allows developers to define reusable segments of code that can be combined to create a program. VisualAge for Java uses the JavaBeans component model.

composite bean
A bean that can contain both visual and nonvisual components. A composite bean is composed of embedded beans.


connection
In the VisualAge for Java Visual Composition Editor, a visual link between two components that represents the relationship between the components. Each connection has a source, a target, and other properties.

Console
In VisualAge for Java, the window that acts as the standard input (System.in) and standard output (System.out) device for programs running in the VisualAge for Java environment.

constructor
A method called to set up a new instance of a class.

container
A component that can hold other components. In Java, examples of containers include applets, frames, and dialogs. In the Visual Composition Editor, containers can be graphically represented and generated.

cookie
(1) A small file stored on an individual's computer; this file allows a site to tag the browser with a unique identification. When a person visits a site, the site's server requests a unique ID from the person's browser. If this browser does not have an ID, the server delivers one. On the Wintel platform, the cookie is delivered to a file called 'cookies.txt,' and on a Macintosh platform, it is delivered to 'MagicCookie.' Just as someone can track the origin of a phone call with Caller ID, companies can use cookies to track information about behavior. (2) Persistent data stored by the client in the Servlet Builder.

CORBA
See Common Object Request Broker Architecture.

Core API
Part of the minimal set of APIs that form the standard Java Platform. Core APIs are available on the Java Platform regardless of the underlying operating system. The Core API grows with each release of the JDK; the current core API is based on JDK 1.1. Also called core classes.
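Several of the entries above (casting, class method, class variable, constructor) describe language features that are easiest to see side by side. The following is a minimal sketch; the class name Counter and its members are hypothetical and chosen only for illustration.

```java
final class Counter {
    // Class variable (static field): one copy shared by the whole class.
    private static int created = 0;

    // Instance variable: one copy per object.
    private final int id;

    // Constructor: called to set up each new instance.
    Counter() {
        created++;
        this.id = created;
    }

    // Class method (static method): called on the class, not on an instance.
    static int created() {
        return created;
    }

    int id() {
        return id;
    }

    public static void main(String[] args) {
        new Counter();
        Counter second = new Counter();
        System.out.println(Counter.created());   // prints 2
        System.out.println(second.id());         // prints 2

        // Casting a primitive: the fractional part is discarded.
        double price = 7.9;
        int whole = (int) price;
        System.out.println(whole);               // prints 7

        // Casting an object reference back to its actual class.
        Object ref = second;
        Counter same = (Counter) ref;
        System.out.println(same == second);      // prints true
    }
}
```

Note that Counter.created() is called on the class name; no instance is needed to read the shared count.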

D

Data Access Bean
In the VisualAge for Java Visual Composition Editor, a bean that accesses and manipulates the content of JDBC/ODBC-compliant relational databases.


Data Access Builder
A VisualAge for Java Enterprise tool that generates beans to access and manipulate the content of JDBC/ODBC-compliant relational databases.

debugger
A component that assists in analyzing and correcting coding errors.

declaration
Statement that creates an identifier and its attributes, but does not reserve storage or provide an implementation.

definition
Statement that reserves storage or provides an implementation.

deprecation
An obsolete component that may be deleted from a future version of a product.

derived type
In Java, a type that overrides the definitions of a base type to provide unique behavior. The derived type extends the base type.

dipping
A metaphor, introduced by BeanExtender on alphaWorks, for modifying a component by hooking a special kind of Java bean onto it. Dipping lets you add new behavior or modify the Java bean's existing behavior without having to mess around with the Java bean's code. A dip is a special kind of Java bean that can be hooked on to another Java bean; it is the new feature you want to add to the component. Software examples of dips include printing and security. Dippable Java beans can have one or more dips connected to them. Almost any Java bean or class can be made dippable by extending it, a process called morphing.

dip
A special kind of Java bean that can be hooked on to another Java bean; the new feature you want to add to the component. Software examples of dips include printing and security.

distributed processing
Processing that takes place across two or more linked systems.
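The declaration, definition, and derived type entries can be illustrated with a minimal sketch. All class and method names here (Shape, Square, DerivedTypeSketch) are hypothetical and chosen only for illustration; an abstract method stands in for a declaration, and an ordinary method body for a definition.

```java
// Hypothetical names throughout; nothing here comes from the course material.
abstract class Shape {
    // Declaration only: names the method and its signature, no implementation.
    abstract double area();

    // Definition: supplies an implementation.
    String describe() {
        return "shape with area " + area();
    }
}

// Derived type: extends the base type and provides its own behavior.
class Square extends Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    double area() {
        return side * side;
    }
}

class DerivedTypeSketch {
    public static void main(String[] args) {
        Shape shape = new Square(2.0);
        System.out.println(shape.describe()); // prints "shape with area 4.0"
    }
}
```

The variable is typed as the base type, yet the call to area() runs the derived type's code, which is the "unique behavior" the glossary entry refers to.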


DLL (dynamic link library)
A file containing executable code and data bound to a program at load time or run time, rather than during linking. The code and data in a dynamic link library can be shared by several applications simultaneously. The C++ Access Builder generates beans and C++ wrappers that let your Java programs access C++ DLLs. Enterprise Access Builders also generate platform-specific DLLs for the workstation and OS/390 platforms.

double precision
A floating-point number that contains 64 bits. See also single precision.

EAB
See Enterprise Access Builder.

e-business
Either (a) the transaction of business over an electronic medium such as the Internet or (b) a business that uses Internet technologies and network computing in their internal business processes (via intranets), their business relationships (via extranets), and the buying and selling of goods, services, and information (via electronic commerce).

e-commerce
The subset of e-business that involves the exchange of money for goods or services purchased over an electronic medium such as the Internet.

EmbeddedJava
An API and application environment for high-volume embedded devices, such as mobile phones, pagers, process control, instrumentation, office peripherals, network routers and network switches. EmbeddedJava applications run on real-time operating systems and are optimized for the constraints of small-memory footprints and diverse visual displays.

encapsulation
The grouping of both data and operations into neat, manageable units that can be developed, tested, and maintained independently of one another. Such grouping is a powerful technique for building better software. The object manages its own resources and limits their visibility.
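A minimal sketch of encapsulation, assuming a hypothetical Account class that is not part of the course material: the balance is a private field, so outside code can change it only through the methods the class chooses to expose.

```java
// Hypothetical example class; names are illustrative assumptions.
class Account {
    // Hidden state: no code outside this class can touch the field directly.
    private long balanceCents;

    // The only way to change the balance, so the rule below is always enforced.
    void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceCents += cents;
    }

    // Read-only view of the hidden state.
    long balanceCents() {
        return balanceCents;
    }
}

class EncapsulationSketch {
    public static void main(String[] args) {
        Account account = new Account();
        account.deposit(250);
        System.out.println(account.balanceCents()); // prints 250
    }
}
```

Because the field is private, the validation in deposit() cannot be bypassed, which is one way an object "manages its own resources and limits their visibility."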


Enterprise Access Builder (EAB)
Feature of VisualAge for Java, Enterprise Edition, that creates connectors to enterprise server products such as CICS, Encina, IMS TOC, and MQSeries.

Enterprise Download Components
An access level for VisualAge Developer Domain that includes download versions of the latest components for VisualAge for Java, Enterprise Edition. You receive this access level when you purchase VisualAge for Java, Enterprise Edition, Version 2.0.

Enterprise Edition
See VisualAge for Java, Enterprise Edition.

Enterprise Java
Includes Enterprise JavaBeans as well as open API specifications for: database connectivity, naming and directory services, CORBA/IIOP interoperability, pure Java distributed computing, messaging services, managing system and network resources, and transaction services.

Enterprise JavaBeans
A cross-platform component architecture for the development and deployment of multitier, distributed, scalable, object-oriented Java applications.

Enterprise ToolKit
A set of VisualAge for Java Enterprise tools that enable you to develop Java code that is targeted to specific platforms, such as AS/400, OS/390, OS/2, AIX, and Windows.

Entry Edition
See VisualAge for Java, Entry Edition.

event
An action by a user, program, or system that may trigger specific behavior. In the JDK, events notify the relevant listener classes to take appropriate action.

exception
An exception is an object that has caused some sort of new condition, such as an error. In Java, throwing an exception means passing that object to an interested party; a signal indicates what kind of condition has taken place. Catching an exception means receiving the sent object. Handling this exception usually means taking care of the problem after receiving the object, although it might mean doing nothing (which would be bad programming practice).

executable content
Code that runs from within an HTML file (such as an applet).

extends
A subclass or interface extends a class or interface if it adds fields or methods, or overrides its methods. See also derived type.
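The throw, catch, and handle steps described in the exception entry can be sketched as follows. The class ExceptionSketch and its methods are hypothetical names used only for illustration; Integer.parseInt, NumberFormatException, and IllegalArgumentException are standard Java.

```java
// Hypothetical helper class; only the java.lang types are real API.
class ExceptionSketch {
    // Throwing: passes an exception object to whoever called this method.
    static int parsePort(String text) {
        int port = Integer.parseInt(text); // may throw NumberFormatException
        if (port < 0 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return port;
    }

    // Catching and handling: receives the object and recovers with a fallback.
    static int portOrDefault(String text, int fallback) {
        try {
            return parsePort(text);
        } catch (IllegalArgumentException e) {
            // NumberFormatException is a subclass, so it is caught here too.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(portOrDefault("8080", 80)); // prints 8080
        System.out.println(portOrDefault("abc", 80));  // prints 80
    }
}
```

Returning a fallback is only one possible way to handle the condition; logging, retrying, or rethrowing are equally valid choices depending on the program.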


factory
A bean that dynamically creates instances of beans.

field
A data object in a class; for example, a variable.

first tier
The client; the hardware and software with which the end user interacts.

framework
A set of object classes that provide a collection of related functions for a user or piece of software.

free-form surface
In the VisualAge for Java Visual Composition Editor, the large, open area where you can work with visual and nonvisual beans. You add, remove, and connect beans on the free-form surface.

File Transfer Protocol (FTP)
In the Internet suite of protocols, an application layer protocol that uses TCP and Telnet services to transfer bulk-data files between machines or hosts.

form data
A generated class representing the HTML form elements in a visual servlet.

FTP
See File Transfer Protocol.

formal parameter list
Parameters specified in a method's definition. See also actual parameter list.

garbage collection
Java's ability to clean up inaccessible unused memory areas ("garbage") on the fly. Garbage collection slows performance, but keeps the machine from running out of memory.

graphical user interface (GUI)
A type of computer interface consisting of a visual metaphor of a real-world scene, often of a desktop. Within that scene are icons, representing actual objects, that the user can access and manipulate with a pointing device.


hierarchy
The order of inheritance in object-oriented languages. Each class in the hierarchy inherits attributes and behavior from its superclass, except for the top-level Object class.

HotJava
A Java-enabled Web and intranet browser developed by Sun Microsystems, Inc. HotJava is written in Java. (Definition copyright 1996-1999 Sun Microsystems, Inc. All Rights Reserved. Used by permission.)

Hypertext Markup Language (HTML)
A file format, based on SGML, for hypertext documents on the Internet. Allows for the embedding of images, sounds, video streams, form fields and simple text formatting. References to other objects are embedded using URLs, enabling readers to jump directly to the referenced document.

Hypertext Transfer Protocol (HTTP)
The Internet protocol, based on TCP/IP, used to fetch hypertext objects from remote hosts.

IDE
See Integrated Development Environment.

identifier
The name of an item in a program.

IDL (Interface Definition Language)
In CORBA, a declarative language that is used to describe object interfaces, without regard to object implementation.

IDL Development Environment
In VisualAge for Java, an integrated IDL and Java development environment. The IDL Development Environment allows you to work with IDL source code in the multipane IDLs page and generate Java code using an IDL-to-Java compiler.

IDL group
A container used to hold IDL objects in the IDL Development Environment. It is similar to a file system directory.

IIOP (Internet Inter-ORB Protocol)
A communications standard for distributed objects that reside in Web or enterprise computing environments.
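The hierarchy entry says every class inherits from a superclass up to the top-level Object class. A small sketch, using a hypothetical helper named HierarchySketch, walks that chain with the standard Class.getSuperclass() method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; only Class.getSuperclass() and getSimpleName() are real API.
class HierarchySketch {
    // Collects the simple names from the given class up to Object.
    static List<String> superclassChain(Class<?> type) {
        List<String> names = new ArrayList<>();
        // getSuperclass() returns null once Object has been reached.
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            names.add(c.getSimpleName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(superclassChain(Integer.class)); // prints [Integer, Number, Object]
    }
}
```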

InfoBus
A technology for flexible, vendor-independent data exchange which is used by eSuite and can be used by other applications to exchange data with eSuite and other InfoBus-enabled applications. The 100% Pure Java release and the InfoBus specification are available for free download from http://java.sun.com/beans/infobus.

inheritance
The ability to create subclasses that automatically inherit properties and methods from their superclass. See also hierarchy.

Inspector
In VisualAge for Java, a window in which you can evaluate code fragments in the context of an object, look at the entire contents of an object and its class, or access and modify the fields of an object.

instance
The specific representation of a class, also called an object.

instance method
A method that applies and operates on objects (usually called simply a method). Contrast with class method.

instance variable
A variable that defines the attributes of an object. The class defines the instance variable's type and identifier, but the object sets and changes its values.

Integrated Development Environment (IDE)
In VisualAge for Java, the set of windows that provide the user with access to development tools. The primary windows are the Workbench, Log, Console, Debugger, and Repository Explorer.

interface
A list of methods that enables a class to implement the interface itself by using the implements keyword. The Interfaces page in the Workbench lists all interfaces in the workspace.

Internet Protocol (IP)
In the Internet suite of protocols, a connectionless protocol that routes data through a network or interconnected networks. IP acts as an intermediary between the higher protocol layers and the physical network. However, this protocol does not provide error recovery and flow control and does not guarantee the reliability of the physical network.

Internet Inter-ORB Protocol (IIOP) Access Builder
A tool that edits and generates CORBA-compliant Java modules. See Common Object Request Broker Architecture (CORBA).
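The interface, instance, instance method, and instance variable entries fit together in a few lines of Java. The names Greeter and PoliteGreeter are hypothetical and serve only as an illustration.

```java
// Hypothetical interface: a list of methods with no implementation.
interface Greeter {
    String greet(String name);
}

// A class adopts the interface with the implements keyword.
class PoliteGreeter implements Greeter {
    // Instance variable: the class fixes its type and name, each object holds its own value.
    private final String salutation;

    PoliteGreeter(String salutation) {
        this.salutation = salutation;
    }

    // Instance method: operates on the state of one particular object.
    @Override
    public String greet(String name) {
        return salutation + ", " + name + "!";
    }
}

class InterfaceSketch {
    public static void main(String[] args) {
        Greeter formal = new PoliteGreeter("Good evening"); // one instance
        Greeter casual = new PoliteGreeter("Hi");           // another instance
        System.out.println(formal.greet("Ada")); // prints "Good evening, Ada!"
        System.out.println(casual.greet("Ada")); // prints "Hi, Ada!"
    }
}
```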


interpreter
A tool that translates and executes code line-by-line.

introspection
For a JavaBean to be reusable in development environments, there needs to be a way to query what the bean can do in terms of the methods it supports and the types of event it raises and listens for. Introspection allows a builder tool to analyze how a bean works.

IP
See Internet Protocol.

JAE
See Java Application Environment.

JAR file format
JAR (Java Archive) is a platform-independent file format that aggregates many files into one. Multiple Java applets and their requisite components (.class files, images, sounds and other resource files) can be bundled in a JAR file and subsequently downloaded to a browser in a single HTTP transaction.

Java
An object-oriented programming language for portable, interpretive code that supports interaction among remote objects. Java was developed and specified by Sun Microsystems, Incorporated. The Java environment consists of the JavaOS, the Virtual Machines for various platforms, the object-oriented Java programming language, and several class libraries.

Java Application Environment (JAE)
The source code release of the Java (TM) Development Kit. (Definition copyright 1996-1999 Sun Microsystems, Inc. All Rights Reserved. Used by permission.)

JavaBeans
Java's component architecture, developed by Sun, IBM, and others. The components, called Java beans, can be parts of Java programs, or they can exist as self-contained applications. Java beans can be assembled to create complex applications, and they can run within other component architectures (such as ActiveX and OpenDoc).

Java Database Connectivity (JDBC)
In the JDK, the specification that defines an API that enables programs to access databases that comply with this standard.
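As a rough illustration of introspection, the sketch below asks the standard java.beans.Introspector which properties a bean exposes. The Customer bean and the IntrospectionSketch class are hypothetical; the example assumes a JDK that includes the java.beans package (part of the java.desktop module on modular JDKs).

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class IntrospectionSketch {
    // Hypothetical bean following the getX/setX naming convention.
    public static class Customer {
        private String name;
        private int age;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    // Returns the bean's property names, sorted for a stable result.
    static List<String> propertyNames(Class<?> beanClass) {
        try {
            // Stop at Object so the inherited "class" property is left out.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor descriptor : info.getPropertyDescriptors()) {
                names.add(descriptor.getName());
            }
            Collections.sort(names);
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("could not introspect " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(propertyNames(Customer.class));
    }
}
```

A builder tool would go further and read each descriptor's getter, setter, and event sets; listing the names is just the smallest useful query.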


Java Development Kit (JDK)
The Java Development Kit is the set of Java technologies made available to licensed developers by Sun Microsystems. Each release of the JDK contains the following: the Java Compiler, Java Virtual Machine, Java Class Libraries, Java Applet Viewer, Java Debugger, and other tools.

JavaDoc
Sun's tool for generating HTML documentation on classes by extracting comments from the Java source code files.

Java Foundation Classes (JFC)
Developed by Netscape, Sun, and IBM, JFCs are building blocks that are helpful in developing interfaces to Java applications. They allow Java applications to interact more completely with the existing operating systems. Also called Swing Set.

Java IDL
Java IDL is a language-neutral way to specify an interface between an object and its client on a different platform. Provides interoperability and integration with CORBA, the industry standard for distributed computing, allowing developers to build Java applications that are integrated with heterogeneous business information assets.

Java Management Application Programming Interface (JMAPI)
A specification proposed by Sun Microsystems that defines a core set of application programming interfaces for developing tightly integrated system, network, and service management applications. The application programming interfaces could be used in diverse computing environments that encompass many operating systems, architectures, and network protocols.

Java Media and Communications APIs
Allows developers to integrate a wide range of media types into their Web pages, applets, and applications. Includes: Media, Sound, Animation, 2D, 3D, Telephony, Speech and Collaboration.

Java Media Framework (JMF)
The JMF API specifies a unified architecture, messaging protocol and programming interface for media players, capture and conferencing. JMF provides a set of building blocks used by other areas of the Java Media API suite. For example, the JMF provides access to audio devices in a cross-platform, device-independent manner, which is required by both the Java Telephony and the Java Speech APIs. JMF will be published as three APIs: the Java Media Player, Java Media Capture, and Java Media Conference.


Java Naming and Directory Interface (JNDI)
A set of APIs that assist with the interfacing to multiple naming and directory services. (Definition copyright 1996-1999 Sun Microsystems, Inc. All Rights Reserved. Used by permission.)

Java Native Interface (JNI)
A native programming interface that allows Java code running inside a Java Virtual Machine (VM) to interoperate with applications and libraries written in other programming languages, such as C and C++.

JavaObjs
In Remote Method Invocation, the name of the user-defined default file that contains a list of server objects to be instantiated when the Remote Object Instance Manager is started.

JavaOS
A basic, small-footprint operating system that supports Java. JavaOS was originally designed to run in small electronic devices like phones and TV remotes, but it is also being targeted for use in network computers (NCs).

Java Platform
The Java Virtual Machine and the Java Core classes make up the Java Platform. The Java Platform provides a uniform programming interface to a 100% Pure Java program regardless of the underlying operating system. (Definition copyright 1996-1999 Sun Microsystems, Inc. All Rights Reserved. Used by permission.)

Java Record Editor
An editor that allows you to construct and refine dynamic record types.

Java Record Framework
A Java framework that describes and converts record data.

Java Remote Method Invocation (RMI)
Method invocation between peers, or between client and server, when applications at both ends of the invocation are written in Java. Included in JDK 1.1.

Java Runtime Environment (JRE)
A subset of the Java Development Kit for end-users and developers who want to redistribute the JRE. The JRE consists of the Java Virtual Machine, the Java Core Classes, and supporting files. (Definition copyright 1996-1999 Sun Microsystems, Inc. All Rights Reserved. Used by permission.)


JavaScript
A scripting language used within an HTML page. Superficially similar to Java but JavaScript scripts appear as text within the HTML page. Java applets, on the other hand, are programs written in the Java language and are called from within HTML pages or run as stand-alone applications.

Java Security API
A framework for developers to include security functionality in their applets and applications. Includes: cryptography with digital signatures, encryption, and authentication. An intermediate subset of the Security API known as "Security and Signed Applets" is included in JDK 1.1.

Java Server
An extensible framework that enables and eases the development of Java-powered Internet and intranet servers. The APIs provide uniform and consistent access to the server and administrative system resources required for developers to quickly develop their own Java servers.

Java Virtual Machine (JVM)
A software implementation of a central processing unit (CPU) that runs compiled Java code (applets and applications).

jCentral
IBM's powerful Java search engine, accessible from the Search field at the top of every VisualAge Developer Domain page. Simply select jCentral in the entry field, and jCentral searches the entire Web for Java information and Java components such as applets, Java beans, and EJBs. Search results are sorted by relevance.

JDBC
See Java Database Connectivity.

JFC
See Java Foundation Classes.

JIT
See Just-In-Time Compiler.

JMF
See Java Media Framework.

JNDI
See Java Naming and Directory Interface.

JNI
See Java Native Interface.

JRE
See Java Runtime Environment.

Just-In-Time compiler (JIT)
A platform-specific software compiler often contained within JVMs. JITs compile Java bytecodes on-the-fly into native machine instructions, thereby reducing the need for interpretation.

JVM
See Java Virtual Machine.


linker
A computer program for creating load modules from one or more object modules or load modules by resolving cross references among the modules and, if necessary, adjusting addresses. In Java, the linker creates an executable from compiled classes.

listener
In the JDK, a class that receives and handles events.

local variable
A variable declared and used within a method or block.

Log
In the VisualAge for Java IDE, the window that displays messages and warnings during development.

member
(1) In the Java language, an item belonging to a class, such as a field or method. (2) On VADD, a site visitor who has previously registered. See registered member, registration.

method
A fragment of Java code within a class that can be invoked and passed a set of parameters to perform a specific task.

middleware
A layer of software that sits between a database client and a database server, making it easier for clients to connect to heterogeneous databases.

middle tier
The hardware and software that resides between the client and the enterprise server resources and data. The software includes a Web server that receives requests from the client and invokes Java servlets to process these requests. The client communicates with the Web server via industry standard protocols such as HTTP and IIOP.

morphing
The process of extending a Java bean to accept dips. Morphed Java beans are called dippable Java beans and can have one or more dips connected to them. Almost any Java bean or class can be made dippable. See dipping.
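The listener entry, together with the earlier event entry, describes the JDK's event pattern: a source notifies registered listener objects. The sketch below uses the standard java.util.EventObject and java.util.EventListener types; the Counter, ValueChangedEvent, and ValueChangedListener names are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.EventListener;
import java.util.EventObject;
import java.util.List;

// Hypothetical event type carrying the new value.
class ValueChangedEvent extends EventObject {
    private static final long serialVersionUID = 1L;
    private final int newValue;

    ValueChangedEvent(Object source, int newValue) {
        super(source);
        this.newValue = newValue;
    }

    int getNewValue() {
        return newValue;
    }
}

// Hypothetical listener interface: receives and handles the event.
interface ValueChangedListener extends EventListener {
    void valueChanged(ValueChangedEvent event);
}

// Hypothetical event source: notifies every registered listener.
class Counter {
    private final List<ValueChangedListener> listeners = new ArrayList<>();
    private int value;

    void addValueChangedListener(ValueChangedListener listener) {
        listeners.add(listener);
    }

    void increment() {
        value++;
        ValueChangedEvent event = new ValueChangedEvent(this, value);
        for (ValueChangedListener listener : listeners) {
            listener.valueChanged(event);
        }
    }
}

class ListenerSketch {
    public static void main(String[] args) {
        Counter counter = new Counter();
        counter.addValueChangedListener(e -> System.out.println("now " + e.getNewValue()));
        counter.increment(); // prints "now 1"
        counter.increment(); // prints "now 2"
    }
}
```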


multithreaded
A program where different parts can run at the same time without interfering with each other.

native class
Machine-dependent C code that can be invoked from Java. For multi-platform work, the native routines for each platform need to be implemented.

NCF
See Network Computing Framework.

Network Computing Framework (NCF)
An architecture and programming model created to help customer and industry software development teams to design, deploy, and manage e-business solutions across the enterprise.

Network News Transfer Protocol (NNTP)
In the Internet suite of protocols, a protocol for the distribution, inquiry, retrieval, and posting of news articles that are stored in a central database.

nonvisual bean
A bean that is not visible to the end user in the graphical user interface, but is visually represented on the free-form surface of the Visual Composition Editor during development. Developers can manipulate nonvisual beans only as icons; that is, they cannot edit them in the Visual Composition Editor as they can edit visual beans. Examples of nonvisual beans include beans for business logic, communication access, and database queries.

NNTP
See Network News Transfer Protocol.

object
The principal building block of object-oriented programs. Objects are software programming modules. Each object is a programming unit consisting of related data and methods.
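The multithreaded entry speaks of parts of a program running at the same time "without interfering with each other." A minimal sketch of one way to achieve that in Java uses the synchronized keyword on the methods that touch shared state. The class and method names are hypothetical.

```java
// Hypothetical example; only Thread and the synchronized keyword are standard Java.
class SynchronizedCounterSketch {
    private int count;

    // synchronized: at most one thread at a time runs these methods on a given object.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Starts several threads that all increment the same counter, then waits for them.
    static int runWorkers(int threads, int incrementsPerThread) {
        SynchronizedCounterSketch counter = new SynchronizedCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10000)); // prints 40000
    }
}
```

Without synchronized, the count++ updates from different threads could overlap and some increments could be lost, so the final total would not be reliable.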


ORB (Object Request Broker)
In object-oriented programming, software that serves as an intermediary by transparently enabling objects to exchange requests and responses.

object-oriented design
A software design method that models the characteristics of abstract or real objects using classes and objects. Object-oriented design focuses on the data and on the interfaces to it. For instance, an "object-oriented" carpenter would be mostly concerned with the chair he was building, and secondarily with the tools used to make it; a "non-object-oriented" carpenter would think primarily of his tools. Object-oriented design is also the mechanism for defining how modules "plug and play." The object-oriented facilities of Java are essentially those of C++, with extensions from Objective C for more dynamic method resolution.

overloading
The ability to have different methods with the same identifier, distinguished by the number and type of their arguments. In Java, the return type alone does not distinguish overloaded methods.

overriding
Implementing a method in a subclass that replaces a method in a superclass.

package
A program element that contains classes and interfaces.

part
An existing, reusable software component. All parts created with the Visual Composition Editor conform to the JavaBeans component model, and are referred to as beans. See visual bean and nonvisual bean.

persistence
In object models, a condition that allows instances of classes to be stored externally, for example in a relational database.

Persistence Builder
In VisualAge for Java, a persistence framework for object models, which enables the mapping of objects to information stored in relational databases and also provides linkages to legacy data on other systems.

process
A program executing in its own address space, containing one or more threads.
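The difference between overloading and overriding is easiest to see side by side. In this sketch all names (Animal, Dog, OverloadSketch) are hypothetical: the describe methods share one identifier but differ in parameter lists, while Dog.sound replaces the method it inherits.

```java
// Hypothetical classes for illustration only.
class Animal {
    String sound() {
        return "...";
    }
}

class Dog extends Animal {
    // Overriding: same name and parameters as the superclass method, new behavior.
    @Override
    String sound() {
        return "Woof";
    }
}

class OverloadSketch {
    // Overloading: one identifier, different numbers or types of parameters.
    static String describe(int n) {
        return "int " + n;
    }

    static String describe(String s) {
        return "String " + s;
    }

    static String describe(int a, int b) {
        return "two ints " + a + ", " + b;
    }

    public static void main(String[] args) {
        System.out.println(describe(7));      // prints "int 7"
        System.out.println(describe("7"));    // prints "String 7"
        System.out.println(describe(7, 8));   // prints "two ints 7, 8"
        Animal pet = new Dog();
        System.out.println(pet.sound());      // prints "Woof"
    }
}
```

The compiler picks an overload from the argument types at compile time, whereas the overriding method is chosen at run time from the object's actual class.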


Professional Edition
See VisualAge for Java, Professional Edition.

program
In VisualAge for Java, a term that refers to both Java applets and applications.

program element
In VisualAge for Java, a generic term for a project, package, class, interface, or method.

project
In VisualAge for Java, the topmost kind of program element. A project contains Java packages.

property
An initial setting or characteristic of a bean, for example, a name, font, text, or positional characteristic.

reference
An object's address. In Java, objects are accessed through references; when an object is passed to a method, the reference is copied (passed by value) rather than the object itself.

registered member
In VisualAge Developer Domain, a user who has submitted the registration information. See also registration, subscriber, and Subscription for Java.

registration
The process of submitting user information to VisualAge Developer Domain. You must register in order to access technical information from the site library, such as technical articles and IBM "Redbooks". You can also access free downloads such as VisualAge for Java, Entry Edition, and information-viewing utilities such as Netscape Navigator and Lotus Freelance. To register, click Register on the VADD site masthead, and enter the requested information. See also registered member, subscription, Subscription for Java.

remote debugger
A debugging tool that debugs code on a remote platform.

Remote Function Call (RFC)
SAP's open programmable interface. External applications and tools can call ABAP/4 functions from the SAP System. You can also call third party applications from the SAP System using RFC. RFC is a means for communication that allows implementation on all R/3 platforms.
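A short sketch of how references behave when an object is handed to a method. The ReferenceSketch class and its methods are hypothetical; StringBuilder is standard Java. The method receives its own copy of the reference, so it can change the shared object but cannot make the caller's variable point elsewhere.

```java
// Hypothetical helper class for illustration.
class ReferenceSketch {
    // Both caller and method refer to the same object, so this change is visible outside.
    static void append(StringBuilder text) {
        text.append("!");
    }

    // Reassigning the parameter only changes the method's local copy of the reference.
    static void reassign(StringBuilder text) {
        text = new StringBuilder("replaced");
    }

    public static void main(String[] args) {
        StringBuilder greeting = new StringBuilder("hi");
        append(greeting);
        System.out.println(greeting); // prints "hi!"
        reassign(greeting);
        System.out.println(greeting); // still prints "hi!"
    }
}
```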


remote method invocation (RMI)
RMI is a specific instance of the more general term RPC. RMI allows objects to be distributed over the network; that is, a Java program running on one computer can call the methods of an object running on another computer. RMI and java.net are the only 100% pure Java APIs for controlling Java objects in remote systems.

Remote Object Instance Manager
In Remote Method Invocation, a program that creates and manages instances of server beans through their associated server-side server proxies.

remote procedure calls (RPC)
RPC is a generic term referring to any of a series of protocols used to execute procedure calls or method calls across a network. RPC allows a program running on one computer to call the services of a program running on another computer.

repository
In VisualAge for Java, the permanent storage area containing all open and versioned editions of all program elements, regardless of whether they are currently in the workspace. The repository contains the source code for classes developed in (and provided with) VisualAge for Java, and the bytecode for classes imported from the file system. Every time you save a method in the IDE, it is automatically updated in the repository. See also SCM repository and shared repository.

Repository Explorer
In VisualAge for Java, the window from which you can view and compare editions of program elements that are in the repository.

resource file
A non-code file that may be referred to from your Java program in VisualAge for Java. Examples include graphic and audio files.

RMI (Remote Method Invocation)
See Remote Method Invocation.

RMI Access Builder
A VisualAge for Java Enterprise tool that generates proxy beans and associated classes and interfaces so you can distribute code for remote access, enabling Java-to-Java solutions.

RMI compiler
The compiler that generates stub and skeleton files that facilitate RMI communication. This compiler can be automatically invoked by the RMI Access Builder, and can also be invoked from the Tools menu item.


RMI registry
A server program that allows remote clients to get a reference to a server bean.

roll back
The process of restoring data changed by SQL statements to the state at its last commit point.

RPC
See Remote Procedure Calls.

runtime system
The software environment where compiled programs run. Each Java runtime system includes an implementation of the Java Virtual Machine.

sandbox
A restricted environment, provided by the Web browser, in which Java applets run. The sandbox offers them services and prevents them from doing anything naughty, such as doing file I/O or talking to strangers (servers other than the one from which the applet was loaded). The analogy of applets to children led to calling the environment in which they run the "sandbox."

SCM
See software configuration management.

SCM repository
In VisualAge for Java, a generic term for the data store of any external software configuration management (SCM) tool. Some SCM tools refer to this as an archive.

scope
Determines where an identifier can be used. In Java, instance and class variables have a scope that extends to the entire class. All other identifiers are local to the method where they are declared.

Scrapbook
In VisualAge for Java, the window from which you can write, edit, and test fragments of code without having to define an encompassing class or method.

secure socket layer (SSL)
SSL is a security protocol which allows communications between a browser and a server to be encrypted and secure. SSL prevents eavesdropping, tampering or message forgery on your Internet or intranet network.
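The scope entry distinguishes class and instance variables, which are usable throughout the class, from identifiers that are local to one method. A minimal sketch with hypothetical names:

```java
// Hypothetical class for illustration only.
class ScopeSketch {
    private static int callCount = 0; // class variable: visible in every method of the class
    private int total = 0;            // instance variable: visible in every instance method

    int add(int amount) {             // amount: parameter, local to add()
        int doubled = amount * 2;     // doubled: local variable, exists only inside add()
        total += doubled;
        callCount++;
        return total;
    }

    static int calls() {
        // callCount is in scope here; total, amount, and doubled are not.
        return callCount;
    }

    public static void main(String[] args) {
        ScopeSketch sketch = new ScopeSketch();
        System.out.println(sketch.add(3)); // prints 6
        System.out.println(sketch.add(1)); // prints 8
    }
}
```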


security
Features in Java that prevent applets downloaded off the Web from deliberately or inadvertently doing damage. One such feature is the digital signature, which ensures that an applet came unmodified from a reputable source.

serialization
Turning an object into a stream, and back again.

server
The computer that hosts the Web page that contains an applet. The .class files that make up the applet, and the .HTML files that reference the applet reside on the server. When someone on the Internet connects to a Web page that contains an applet, the server delivers the .class files over the Internet to the client that made the request. The server is also known as the originating host.

server bean
The bean that is distributed using RMI services and is deployed on a server.

servlet
Server-side programs that execute on and add function to Web servers. Java servlets allow for the creation of complicated, high-performance, cross-platform Web applications. They are highly extensible and flexible, making it easy to expand from client or single-server applications to multitier applications.

SGML
See Standard Generalized Markup Language.

single precision
A floating-point number that contains 32 bits. See also double precision.

SmartGuide
In IBM software products, an active form of help that guides you through common tasks.

software configuration management (SCM)
The tracking and control of software development. SCM tools typically offer version control and team programming features.

SQL
Structured Query Language. A language used by database engines and servers for data acquisition and definition.

SSL
See secure socket layer.

Standard Generalized Markup Language
An ISO/ANSI/ECMA standard that specifies a way to annotate text documents with information about types of sections of a document.

static field
See class variable.
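Serialization, "turning an object into a stream, and back again," can be sketched with the standard java.io object streams. The Session class and the roundTrip helper are hypothetical; the transient field shows what the transient glossary entry describes, namely a field that is left out of the serial form.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical serializable class.
class Session implements Serializable {
    private static final long serialVersionUID = 1L;

    final String user;
    transient String password; // transient: not written to the stream

    Session(String user, String password) {
        this.user = user;
        this.password = password;
    }
}

class SerializationSketch {
    // Writes the object to an in-memory byte stream and reads a copy back.
    static Session roundTrip(Session original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Session copy = roundTrip(new Session("ada", "secret"));
        System.out.println(copy.user);     // prints "ada"
        System.out.println(copy.password); // prints "null"
    }
}
```

An in-memory byte array keeps the sketch self-contained; the same streams can wrap a file or a network socket.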


static method
See class method.

stored procedure
A procedure that is part of a relational database. The Data Access Builder can generate Java code that accesses stored procedures.

stream
A communication path between a source of information and its destination.

subclass
A class that inherits all the methods and variables of another class (its superclass). Its superclass might be a subclass of another class in the hierarchy.

subscriber
In VisualAge Developer Domain, a user that has purchased a Subscription for Java or received a subscription as part of VisualAge for Java, Enterprise Edition.

subscription
In VisualAge Developer Domain, a paid access level to the Web site. Subscribing to the site entitles you to VisualAge for Java, Professional Edition and the Java Beans and tools, as well as access to all the information and trial downloads available with registration. See also registration, Subscription for Java, and Subscription for Java, CD-ROM version.

Subscription for Java
A subscription level that includes the latest version of VisualAge for Java, Professional Edition, and an ever-increasing supply of JavaBeans and Java-related tools. The Subscription for Java also gives you access to new beans, tools, products, fixes, product updates, and Beta versions as they become available during the one-year subscription period.

Subscription for Java, CD-ROM version
A subscription that includes a set of VisualAge Developer Domain CD-ROMs three times a year, in addition to complete Web access. The CDs include the product code for your subscription level, as well as most of the current information on the Web site. You can view and search the CD information using any Web browser, just as you would on the Web (but with quicker response). See also subscription.

Subscription for Java toolkit
See VADD JavaBeans and tools.

subscription level
See access level.

subtype
A type that extends another type (its supertype).


superclass
A class that defines the methods and variables inherited by another class (its subclass).

supertype
A type that is extended by another type (its subtype).

Swing Set
A group of lightweight, ready-to-use components developed by JavaSoft. The components range from simple buttons to full-featured text areas to tree views and tabbed folders.

synchronized
This Java keyword specifies that only one thread can run inside a method at once.

TCP/IP
See Transmission Control Protocol based on IP.

thin client
Thin client usually refers to a system that runs on a resource-constrained machine or that runs a small operating system. Thin clients don't require local system administration, and they execute Java applications delivered over the network.

third tier
The third tier, or back end, is the hardware and software that provides database and transactional services. These back-end services are accessed through connectors between the middle-tier Web server and the third-tier server. Though this conceptual model depicts the second and third tier as two separate machines, the NCF model supports a logical three-tier implementation in which the software on the middle and third tier are on the same box.

thread
A separate flow of control within a program.

transaction
(1) In a CICS program, an event that queries or modifies a database that resides on a CICS server. (2) In the Persistence Builder, a representation of a path of code execution. (3) The code activity necessary to manipulate a persistent object. For example, a bank application might have a transaction that updates a company account.

transient
This Java keyword specifies that a field is not included in the serial representation of an object. See serialization.


Transmission Control Protocol based on IP
An Internet protocol that provides for the reliable delivery of streams of data from one host to another.

type
In VisualAge for Java, a generic term for a class or interface.

Uniform Resource Locator
The unique address that tells a browser how to find a specific Web page or file.

Unicode
A 16-bit international character set defined by ISO 10646. See also ASCII.

URL
See Uniform Resource Locator.

VADD
See VisualAge Developer Domain.

VADD JavaBeans and tools
A set of beans and bean tools provided with the VisualAge Subscription for Java, which used to be named the WebRunner toolkit.

variable
An identifier that represents a data item whose value can be changed while the program is running. The values of a variable are restricted to a certain data type.

virtual machine
A software or hardware implementation of a central processing unit (CPU) that manages the resources of a machine and can run compiled code. See Java Virtual Machine.

visual bean
In the Visual Composition Editor, a bean that is visible to the end user in the graphical user interface.

Visual Composition Editor
In VisualAge for Java, the tool you can use to create graphical user interfaces from prefabricated beans, and to define relationships (called connections) between beans. The Visual Composition Editor is a page in the class browser.


visual servlet
A servlet that is designed to be built using the VisualAge for Java Visual Composition Editor.

VisualAge Developer Domain
A Web site and Web-based subscription offering for VisualAge for Java, providing downloads and CDs of VisualAge for Java products. The Web site also provides a wealth of supporting components, tools, and how-to information to help programmers easily develop Java applications.

VisualAge for Java, Enterprise Edition
An edition of VisualAge for Java that is designed for building enterprise Java applications, and has all of the Professional Edition features plus support for developers working in large teams, developing high-performance or heterogeneous applications, or needing to connect Java programs to existing enterprise systems.

VisualAge for Java, Entry Edition
An edition of VisualAge for Java suitable for learning and building small projects of 500 classes or fewer. It is available as a no-charge download from VisualAge for Java and VisualAge Developer Domain Web sites.

VisualAge for Java, Professional Edition
A complete Java development environment, including easy access to JDBC-enabled databases for building Java applications.

Web subscription
Synonym for Subscription for Java.

WebRunner
See VADD JavaBeans and tools.

WebSphere
WebSphere is the cornerstone of IBM's overall Web strategy, offering customers a comprehensive solution to build, deploy and manage e-business Web sites. The product line provides companies with an open, standards-based, Web server deployment platform and Web site development and management tools to help accelerate the process of moving to e-business.

world readable files
A permission level on Web servers specifying that files can be read by any user.


World Wide Web
A network of servers that contain programs and files. Many of the files contain hypertext links to other documents available through the network.

Workbench
In VisualAge for Java, the main window from which you can manage the workspace, create and modify code, and open browsers and other tools.

workspace
The work area that contains the Java code that you are developing and the class libraries on which your code depends. Program elements must be added to the workspace from the repository before they can be modified.

wrapper
Code that provides an interface for one program to access the functionality of another program.

WWW
See World Wide Web.

Numerics

100% Pure Java

Sun Microsystems initiative to certify that applications and applets are purely Java-written.


Appendix H. Checkpoint Answers


Unit 2
1. Which of the following will reduce the coupling related to Electronic Information Exchange? (Select all that apply)
   a. Create messages that are context-free.
   b. Use system interfaces to hide implementation details. [CORRECT]
   c. Combine view information and data in each message.
   d. Use messages that are vendor-neutral and implementation-independent. [CORRECT]
   e. All of the above

2. Text based messages are preferred because: (Select all that apply)
   a. They are implementation-neutral. [CORRECT]
   b. All software technologies can read/write them. [CORRECT]
   c. It's easier to debug messaging problems.
   d. They can be spell checked.
   e. All of the above

3. In general, the properties that messages should exhibit are: (Select all that apply)
   a. Self-describing [CORRECT]
   b. Predictable structure [CORRECT]
   c. Conformance to industry wide standards
   d. All of the above

Unit 3
1. Basic XML can be described as:
   a. A hierarchical structure of tagged elements, attributes and text. [CORRECT]
   b. All the HTML tags plus a set of new XML only tags.
   c. Object-oriented structure of rows and columns.
   d. Processing instructions (PIs) for text data.
   e. Textual data with tags for visual presentation.

2. Which of these XML fragments is not well formed?
   a. <root><class>XML</class></root>
   b. <class><root>XML</root></class>
   c. <root><class id="XML"></root> [CORRECT]
   d. <root>XML<class id="XML"/>XML</root>
   e. <root class="XML"><class id="root"/>XML</root>

3. XML comments are allowed (Select all that apply):
   a. Before the XML declaration
   b. Anywhere
   c. Between element tags [CORRECT]
   d. Before the root element [CORRECT]
   e. All of the above

4. Which of these XML elements with attributes is invalid?
   a. <name first='Tony' LAST="Romeo" />
   b. <name name="Tony" NAME="ROMEO" />
   c. <_name_ first-name="Tony" last-name="Romeo" />
   d. <name="Tony Romeo" /> [CORRECT]
   e. <name name="first='Tony' last='Romeo'" />
   f. All of the above

5. Which of these comments regarding HTML and XML is not true?
   a. HTML markup is focused on presentation.
   b. XML markup is based on defining the data.
   c. XML is based on HTML. [CORRECT]
   d. HTML tags are not case sensitive.
   e. XML tags are case sensitive.
   f. Both XML and HTML support attributes.
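
The well-formedness rules tested in Unit 3 can be checked mechanically. The following is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree parser and assuming double-quoted attribute values in the sample fragments. The helper name is_well_formed and the fragment labels are hypothetical and not part of the course material.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the parser accepts the text as a complete XML document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


fragments = {
    "nested": '<root><class>XML</class></root>',
    "unclosed": '<root><class id="XML"></root>',         # <class> is never closed
    "mixed": '<root>XML<class id="XML"/>XML</root>',     # text around an empty element
    "unquoted": '<root><class id=XML/></root>',          # attribute value lacks quotes
    "case": '<name name="Tony" NAME="ROMEO" />',         # attribute names are case sensitive
    "no_attr_name": '<name="Tony Romeo" />',             # a value with no attribute name
    "comment_in_prolog": '<?xml version="1.0"?><!-- note --><root/>',
    "comment_first": '<!-- note --><?xml version="1.0"?><root/>',
}

for label, fragment in fragments.items():
    print(label, is_well_formed(fragment))
```

A non-validating parser such as this one reports only well-formedness errors; it says nothing about whether a document conforms to a DTD or schema.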

Unit 5
1. Which DTD entry correctly depicts a phone number, with optional area code?
   a. <!ELEMENT phone ( (areaCode)*, prefix, body ) >
   b. <!ELEMENT phone ( areaCode?, prefix, body ) > [CORRECT]
   c. <!ELEMENT phone? ( areaCode, prefix, body ) >
   d. <!ELEMENT phone ( areaCode, (prefix, body)+ ) >

2. Which of the following is not a limitation of DTDs?
   a. Non-XML syntax.
   b. Does not allow range of values (that is, 5 to 10 elements).
   c. Does not provide proper typing of values (that is, integer versus string).
   d. Does not permit Parameter Entity references. [CORRECT]
   e. All of the above.

3. Which DTD entry correctly depicts an optional attribute named type for a pet element, that defaults to the value "dog"?
   a. <!ATTLIST pet type CDATA #IMPLIED>
   b. <!ATTLIST type dog CDATA #FIXED "dog">
   c. <!ATTLIST pet type CDATA "dog"> [CORRECT]
   d. <!ATTLIST pet (dog)? CDATA #REQUIRED>
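
The content model ( areaCode?, prefix, body ) reads as "an optional areaCode, then exactly one prefix, then exactly one body". The sketch below illustrates that reading in Python by translating the model into a regular expression over the child element names. It is an illustration only, not a DTD validator: the standard-library parser used here does not validate against DTDs, and the names PHONE_MODEL and matches_phone_model are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# ( areaCode?, prefix, body ) expressed over a space-terminated list of child names
PHONE_MODEL = re.compile(r"(areaCode )?prefix body ")


def matches_phone_model(xml_text: str) -> bool:
    """Check the children of a <phone> element against the content model."""
    phone = ET.fromstring(xml_text)
    child_names = "".join(child.tag + " " for child in phone)
    return phone.tag == "phone" and PHONE_MODEL.fullmatch(child_names) is not None


with_area = "<phone><areaCode>919</areaCode><prefix>555</prefix><body>0100</body></phone>"
without_area = "<phone><prefix>555</prefix><body>0100</body></phone>"
two_areas = ("<phone><areaCode>919</areaCode><areaCode>212</areaCode>"
             "<prefix>555</prefix><body>0100</body></phone>")

for sample in (with_area, without_area, two_areas):
    print(matches_phone_model(sample))
```

Replacing the ? with * in the regular expression would accept the third sample as well, which is why option a does not describe an optional (zero or one) area code.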

Unit 6
1. Which is true of XML namespaces?
   a. They are stored in an internet based registry.
   b. They are associated with URIs. [CORRECT]
   c. They are integrated with DTDs.
   d. They are integrated with XML Schema. [CORRECT]

2. An XML namespace prefix (select all that apply):
   a. Links to a Schema definition.
   b. Is scoped to the element where it is defined. [CORRECT]
   c. Is shorthand for a URI. [CORRECT]
   d. Can stand for more than one namespace. [CORRECT]

3. Default namespaces apply to:
   a. Elements [CORRECT]
   b. Attributes
   c. Elements and attributes
   d. Neither elements nor attributes
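
The namespace answers above can be observed with a namespace-aware parser. The following minimal sketch uses Python's standard-library xml.etree.ElementTree, which reports qualified names in {URI}local form. The document and the example.com URIs are made up for illustration.

```python
import xml.etree.ElementTree as ET

doc = """
<order xmlns="http://example.com/orders"
       xmlns:p="http://example.com/products"
       id="42">
  <p:item p:sku="A1">Widget</p:item>
</order>
"""

root = ET.fromstring(doc)
item = root[0]

print(root.tag)     # the unprefixed element takes the default namespace
print(root.attrib)  # the unprefixed attribute stays in no namespace
print(item.tag)     # the prefix p is replaced by the URI it stands for
print(item.attrib)  # only a prefixed attribute is namespace-qualified
```

The parser discards the prefix itself and keeps only the URI, which is consistent with the prefix being shorthand rather than part of the name.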


Unit 8
1. Which of the following are part of the XPath step syntax?
   a. Predicate [CORRECT]
   b. AxisName [CORRECT]
   c. Ancestor
   d. Ceiling
   e. NodeTest [CORRECT]

2. The Axis shorthand notation of // indicates what?
   a. Ancestor
   b. Parent
   c. Ancestor-or-self
   d. Descendant-or-self [CORRECT]

3. Which XPath statement will return the number of questions on a test?
   a. count(/test/question) [CORRECT]
   b. /test/question/count()
   c. /test[count(question)]
   d. None of the above

4. The predicate function starts-with("XML is Great", "XML") will return:
   a. XML
   b. true [CORRECT]
   c. is Great
   d. false
   e. XML is Great

5. The following XPath statement will result in:
   /news/story[@year='2001']/self::node()[contains(text,'IBM')]
   a. All 2001 news stories that contain IBM inside the text element. [CORRECT]
   b. All new stories with a year element = 2001 and a text element of IBM.
   c. Any news story with either IBM or 2001 in its text.
   d. All 2001 news stories that contain the letters IBM in any order.
   e. Error, as this is an invalid XPath statement.
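
The XPath answers above can be approximated in Python. This is a rough sketch under stated assumptions: the standard-library xml.etree.ElementTree supports only a subset of XPath, so the attribute predicate is evaluated by the library while count(), contains() and starts-with() are emulated with ordinary Python. The sample news document is invented for illustration.

```python
import xml.etree.ElementTree as ET

news = ET.fromstring("""
<news>
  <story year="2001"><text>IBM ships new XML tools</text></story>
  <story year="2001"><text>Weather update</text></story>
  <story year="2000"><text>IBM annual report</text></story>
</news>
""")

# /news/story[@year='2001'] (paths here are relative to the <news> root element)
stories_2001 = news.findall("story[@year='2001']")

# [contains(text,'IBM')] emulated on the content of the <text> child element
ibm_2001 = [s for s in stories_2001 if "IBM" in (s.findtext("text") or "")]

# count(/news/story) emulated with len()
story_count = len(news.findall("story"))

# //text, the descendant-or-self shorthand, is written .//text in this library
all_text = news.findall(".//text")

# starts-with("XML is Great", "XML") emulated with str.startswith
starts = "XML is Great".startswith("XML")

print(len(stories_2001), len(ibm_2001), story_count, len(all_text), starts)
```

A full XPath 1.0 engine would evaluate all of these expressions directly; the emulation only mirrors their meaning on this small document.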


Unit 9
1. How can XML documents be transformed?
   a. XPath
   b. XSLT [CORRECT]
   c. Notepad
   d. Xatran

2. Is an XSL stylesheet an XML document?
   a. Yes [CORRECT]
   b. No
   c. Depends on the header
   d. Only if it is applied to an XML document

3. Which XSLT element would you use to extract a specific value from the source tree?
   a. <xsl:choose...
   b. <xsl:copy...
   c. <xsl:value-of select=... [CORRECT]
   d. <xsl:text

Appendix A
1. How can an XML document be stored in an RDB? (Select all that apply.)
   a. In a Table column (CLOB) [CORRECT]
   b. SGML
   c. Decomposed into different columns/tables [CORRECT]
   d. Into a DTD file
   e. Compressed into an integer column

2. While RDBs are row-based, XML documents are:
   a. Record based
   b. Hierarchical [CORRECT]
   c. Obsolete
   d. Rectangular

3. I should use an RDB to store my XML if:
   a. I have lots of proprietary file formats
   b. I need to retrieve a large number of documents based on a specific element [CORRECT]
   c. I need to exchange data with a business partner
   d. I need to represent my data in Esperanto
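
The two storage approaches marked correct in Appendix A, question 1, can be sketched side by side. The example below is a minimal illustration in Python using the standard-library sqlite3 module with an in-memory database. The table names, column names and order documents are hypothetical, and SQLite's TEXT type stands in for a CLOB column.

```python
import sqlite3
import xml.etree.ElementTree as ET

documents = [
    "<order id='1'><customer>Acme</customer><total>10.50</total></order>",
    "<order id='2'><customer>Globex</customer><total>99.00</total></order>",
    "<order id='3'><customer>Acme</customer><total>5.25</total></order>",
]

conn = sqlite3.connect(":memory:")
# Approach a: keep each whole document in one large-text column
conn.execute("CREATE TABLE order_doc (id INTEGER PRIMARY KEY, body TEXT)")
# Approach c: decompose selected elements into ordinary columns
conn.execute("CREATE TABLE order_row (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")

for xml_text in documents:
    order = ET.fromstring(xml_text)
    order_id = int(order.get("id"))
    conn.execute("INSERT INTO order_doc VALUES (?, ?)", (order_id, xml_text))
    conn.execute(
        "INSERT INTO order_row VALUES (?, ?, ?)",
        (order_id, order.findtext("customer"), float(order.findtext("total"))),
    )

# Retrieve documents based on a specific element by querying the decomposed column
acme_ids = [row[0] for row in conn.execute(
    "SELECT id FROM order_row WHERE customer = ? ORDER BY id", ("Acme",))]
acme_docs = [row[0] for row in conn.execute(
    "SELECT body FROM order_doc WHERE id IN "
    "(SELECT id FROM order_row WHERE customer = ?) ORDER BY id", ("Acme",))]

print(acme_ids, len(acme_docs))
```

Combining the two tables this way keeps the original document intact while letting the database select on a specific element with ordinary SQL, which is the situation described in question 3, answer b.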
