XM 301 Stud
Student Notebook
ERC 4.1
Trademarks
IBM is a registered trademark of International Business Machines Corporation. The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both: AFS, AIX, alphaWorks, AS/400, CICS, ClearCase, Database 2, DB2, DB2 Universal Database, DFS, Distributed Relational Database Architecture, Domino, DRDA, Encina, Everyplace, IMS, Lotus, Lotus Enterprise Integrator, Lotus Notes, MQSeries, MVS, NetRexx, Network Station, Notes, Open Blueprint, OS/2, OS/390, RACF, RDN, RS/6000, S/390, SecureWay, Tivoli, Tivoli Enterprise, Tivoli Management Environment, TME, TME 10, TXSeries, VisualAge, VTAM, WebSphere.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. SET and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC. Other company, product and service names may be trademarks or service marks of others.
V3.1
Contents
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi Course Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii Agenda . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv Unit 1. Introduction to XML and Related Technologies . . . . . . . . . . . . . . . . . . . . 1-1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2 Course Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3 Audience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4 Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5 Course Objectives (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6 Course Objectives (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-7 Agenda - Day 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-8 Agenda - Day 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9 Agenda - Day 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10 Unit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-11 Unit 2. Issues in Electronic Information Exchange . . . . . . . . . . . . . . . . . . . . . . . 2-1 Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2 Electronic Information Exchange (1 of 2) . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . 2-3 Electronic Information Exchange (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4 Intra-Application Information Exchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5 Agile Views - Multiple Client/Device Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6 Inter-Application Information Exchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7 Context-free Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8 B2B Intercompany Information Exchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-9 Need to Establish Common Ground for Communication . . . . . . . . . . . . . . . . . . . 2-10 Inter-system Information Exchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-11 Exchanging Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-12 The Semantic Web . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-13 A Common Solution? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-14 Checkpoint Questions (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-15 Checkpoint Questions (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-16 Unit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-17 Unit 3. What Is XML? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1 Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2 What Is XML? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-3 Example Tree Representation of XML . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4 A Simple XML Document - Basic Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-5 A Simple XML Document - Basic Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6 Basics of Well-formed XML (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7 Basics of Well-formed XML (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
Element Rules - Rule 1. Single Root Element . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-9 Element Rules - Rule 2. Element Tag Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-10 Element Rules - Rule 3. Element Nesting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-11 Element Nesting Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-12 Element Rules - Rule 4. XML Naming Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-13 Rule 4... Tag Naming - Samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-14 Rule 4... Element Content (1 of 2): General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-15 Rule 4... Element Content (2 of 2): Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-16 Rule 4... PCDATA - Parsed Character Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-17 Rule 4... CDATA - Character Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-18 Rule 4... CDATA Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-20 Element Rules - Rule 5. Element Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-21 Element Rules - Rule 6. XML Declaration (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . .3-22 Element Rules - Rule 6. XML Declaration (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . .3-23 Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-24 Internationalization and Encoding (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-25 Internationalization and Encoding (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-26 Processing Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-27 Well-formed versus Valid . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-28 HTML versus XML (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-29 HTML versus XML (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-30 HTML and XML Key Differences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-31 Checkpoint Questions (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-32 Checkpoint Questions (2 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-33 Checkpoint Questions (3 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-34 Unit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-35 Unit 4. WebSphere Studio Application Developer Overview . . . . . . . . . . . . . . . 4-1 Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-2 Roles-based Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-3 Development Environment Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-4 IBM WebSphere Studio Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-5 Family Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-6 WebSphere Studio Workbench . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-7 WebSphere Studio Workbench Rationale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-8 WebSphere Studio Application Developer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-9 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . .4-10 Perspectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-11 Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-12 Editors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-13 Online Help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-14 Cheat Sheets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-15 Application Developer Design Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16 Tooling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-17 Java IDE (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-18 Java IDE (2 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-19 Java IDE (3 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-20 J2EE Tooling (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-21
iv Introduction to XML Copyright IBM Corp. 2001, 2004
Course materials may not be reproduced in whole or in part without the prior written permission of IBM.
J2EE Tooling (2 of 2) . . . . . 4-22
Portlet Tooling . . . . . 4-23
Data Tooling (1 of 2) . . . . . 4-24
Data Tooling (2 of 2) . . . . . 4-25
Web Tooling (1 of 2) . . . . . 4-26
Web Tooling (2 of 2) . . . . . 4-27
XML Tooling (1 of 3) . . . . . 4-28
XML Tooling (2 of 3) . . . . . 4-29
XML Tooling (3 of 3) . . . . . 4-30
Performance/Trace Tooling . . . . . 4-31
Team Development . . . . . 4-32
Web Services Tooling (1 of 2) . . . . . 4-33
Web Services Tooling (2 of 2) . . . . . 4-34
Standards Support . . . . . 4-35
Review . . . . . 4-36
Unit Summary . . . . . 4-37
Unit 5. Document Type Definition (DTD) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1 Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2 Review: Well-Formed XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3 Why Do We Need DTDs? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4 What Is a DTD? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5 What is Allowed in a DTD? (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6 What is Allowed in a DTD? (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7 XML and DTD Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8 What Is Allowed. . .Declaring Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9 Element Content Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10 EMPTY Content Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11 ANY Content Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12 Elements Content Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-13 Elements Content Examples (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-14 Elements Content Examples (2 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-15 Elements Content Examples (3 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-16 Mixed Content Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-17 What Is Allowed. . .Declaring Attributes . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . 5-18 Organizational Note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-19 Attribute Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-20 Attribute Default Declarations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-22 Attribute Default Declaration Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-23 Attribute Alternate Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-24 Attribute Types: Tokenized Types: IDREFS Example . . . . . . . . . . . . . . . . . . . . . . 5-25 Attribute Types: Tokenized Types: ENTITY Example . . . . . . . . . . . . . . . . . . . . . . 5-26 Attribute Types: Tokenized Types: ENTITIES Example . . . . . . . . . . . . . . . . . . . . 5-27 DTDs Part II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-28 Declaring ENTITYs: an Internal, Parsed ENTITYs Example . . . . . . . . . . . . . . . . . 5-29 Declaring ENTITYs: an External, Parsed ENTITYs Example . . . . . . . . . . . . . . . . 5-30 Unparsed Entity Declarations: a Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-32 Parameter ENTITYs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-33
Parameter ENTITYs - Another Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-34 What Is Allowed. . . Declaring Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-35 Joining a DTD to an XML Instance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-36 External DTD Subset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-37 Internal DTD Subset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-39 Split DTD Subsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-41 Whitespace and DTDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-42 Ignorable Whitespace Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-43 Validating versus Non-validating Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-44 Example DTDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-45 What's Wrong with DTDs? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-47 Status of DTDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-48 Tooling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-49 Checkpoint Questions (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-50 Checkpoint Questions (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-51 Unit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-52 Unit 6. XML Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1 Unit Objectives . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-2 Problem: Element and Attribute Names can be Ambiguous . . . . . . . . . . . . . . . . . . .6-3 Elaboration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-4 Namespaces: The Big Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-5 XML Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-6 Qualified Names (QNames) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-7 Declaring Namespaces (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-8 Declaring Namespaces (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-9 Namespace Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-10 Default Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-11 Example - Default Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-12 Documents with Multiple Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-13 Elements with No Namespace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-14 Attributes and Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-15 Namespace Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-16 Example: Use of Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-17 Problems with Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-18 Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . .6-19 Status of Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-21 More Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-22 Checkpoint Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-23 Unit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-24 Unit 7. XML Schema . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1 Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-2 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-3 What Is an XML Schema? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-4 Why Do We Need XML Schema? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-6 DTD versus XML Schema (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-8 DTD versus XML Schema (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-9
Requirements Applied to the XSD Language (1 of 3) . . . . . 7-10
Requirements Applied to the XSD Language (2 of 3) . . . . . 7-12
Requirements Applied to the XSD Language (3 of 3) . . . . . 7-13
Anatomy of an XML Schema . . . . . 7-14
A Simple XML Document - Return to Basics . . . . . 7-15
A Simple XML Document - Basic Nomenclature . . . . . 7-16
A Simple XML Document - Basic Schema Concepts . . . . . 7-17
Simple Types Built-in to XML Schema . . . . . 7-18
A Simple XML Document - How Studio Sees It . . . . . 7-19
A Simple XML Document - XSD Part 1 (1 of 7) . . . . . 7-20
A Simple XML Document - XSD Part 1 (2 of 7) . . . . . 7-22
A Simple XML Document - XSD Part 1 (3 of 7) . . . . . 7-23
A Simple XML Document - XSD Part 1 (4 of 7) . . . . . 7-24
A Simple XML Document - XSD Part 1 (5 of 7) . . . . . 7-25
A Simple XML Document - XSD Part 1 (6 of 7) . . . . . 7-26
A Simple XML Document - XSD Part 1 (7 of 7) . . . . . 7-27
A Simple XML Document - XSD Part 2 (1 of 4) . . . . . 7-28
A Simple XML Document - XSD Part 2 (2 of 4) . . . . . 7-29
A Simple XML Document - XSD Part 2 (3 of 4) . . . . . 7-30
A Simple XML Document - XSD Part 2 (4 of 4) . . . . . 7-31
A Simple XML Document - XSD Part 3 (1 of 2) . . . . . 7-32
A Simple XML Document - XSD Part 3 (2 of 2) . . . . . 7-33
A Simple XML Document - XSD Part 4 (1 of 2) . . . . . 7-34
A Simple XML Document - XSD Part 4 (2 of 2) . . . . . 7-35
A Simple XML Document - Connecting the Schema to the Instance . . . . . 7-36
What's Next? . . . . . 7-38
. . .but first. . . . . . . . 7-39
XML Schema Part II . . . . . 7-40
Before We Begin: Some Notes about Studio . . . . . 7-41
Complex Type Definitions, Element and Attribute Declarations . . . . . 7-42
Parts of XSD Speech (1 of 2) . . . . . 7-43
Parts of XSD Speech (2 of 2) . . . . . 7-44
Resetting Expectations . . . . . 7-45
1.1 Simple Type (simpleType) Definition . . . . . 7-46
All the Built-in Simple Types . . . . . 7-47
Creating New Simple Types . . . . . 7-49
Facets . . . . . 7-50
Example: Constraining Facets . . . . . 7-52
Example: Enumeration Facet . . . . . 7-53
simpleContent and Empty Complex Types . . . . . 7-54
simpleContent Example . . . . . 7-55
1.2 Complex Type Definition (1 of 2) . . . . . 7-56
1.2 Complex Type Definition (2 of 2) . . . . . 7-57
Named versus Anonymous Types (1 of 2) . . . . . 7-58
Named versus Anonymous Types (2 of 2) . . . . . 7-59
Declaring Child Elements in complexType Elements . . . . . 7-60
Element Declaration: Common Usage . . . . . 7-61
minOccurs and maxOccurs . . . . . 7-62
Example: minOccurs and maxOccurs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-63 1.4 Attribute Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-64 Declaring Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-65 Example: Attribute Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-66 Example: An Element with Attributes (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-67 Example: An Element with Attributes (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-68 2.1 Attribute Group Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-69 Anonymous Types in Attribute Declarations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-70 2.1 Attribute Group Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-71 Attribute Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-72 2.3 Model Group Definitions (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-73 2.3 Model Group Definitions (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-74 2.4 Notation Declarations (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-75 2.4 Notation Declarations (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-76 3.1 Annotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-77 3.2 Model Groups (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-78 3.2 Model Groups (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-79 Example: Compositors (Model Groups) . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . .7-80 Model Groups and Compositors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-82 Example: Global Definitions and Declarations . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-84 3.5 Attribute Uses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-85 Part III Associating a .xsd with a .xml . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-86 . . .but first. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-87 Namespaces, Schemas and Qualification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-88 Namespaces, Schemas and Qualification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-89 Putting a Schema in a Namespace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-90 XML Schemas and Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-91 Target Namespace and Schema Location . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-92 Finding the Schema . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-93 Best Practices (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-95 Best Practices (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-97 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-98 Unit Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-99 Unit 8. XPath - XML Path Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1 Unit Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. .8-2 What Is XPath? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-3 Why Is It Called XPath? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-4 Example Tree Representation of XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-5 XPath Expression Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-6 XPath Current Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-7 XPath Step Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-8 XPath Address Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-9 Example: Absolute Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-10 Example: Absolute Addressing with Predicates . . . . . . . . . . . . . . . . . . . . . . . . . . .8-11 Relative Addressing using Studio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-12 Example: Relative Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-13 XPath - The Thirteen Axes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-15
viii Introduction to XML Copyright IBM Corp. 2001, 2004
Course materials may not be reproduced in whole or in part without the prior written permission of IBM.
  Abbreviated Step Notation
  XPath - Partitioning the Document
  Example: Addressing with Axes
  XPath Axis Node Type and Node Tests
  Sample Node Tests
  XPath - Predicates (1 of 2)
  XPath - Predicates (2 of 2)
  Predicate Core Functions
  Predicate String Functions (1 of 2)
  Predicate String Functions (2 of 2)
  Predicate Number and Boolean Functions
  Reference Information
  Checkpoint Questions (1 of 3)
  Checkpoint Questions (2 of 3)
  Checkpoint Questions (3 of 3)
  Unit Summary
Unit 9. eXtensible Stylesheet Language: Transformations (XSLT)
  Unit Objectives
  Why Do We Need XSL Transformations?
  Why Do We Need XSL?
  XSL: Three Parts
  XSLT Language Characteristics (1 of 2)
  XSLT Language Characteristics (2 of 2)
  XSLT Features
  XSL Transformations (XSLT) Overview
  The XSLT Process
  Anatomy of a Stylesheet
  Elements to Generate Output
  <xsl:stylesheet> Element
  XSL Optional Elements
  <xsl:template> Element
  <xsl:apply-templates> Element
  Pattern Matching (XPath) Examples
  Default of <xsl:apply-templates />
  <xsl:value-of> Element
  Control Elements
  XML Input As a Tree
  Desired HTML Output
  XML to HTML (1 of 5)
  XML to HTML (2 of 5)
  XML to HTML (3 of 5)
  XML to HTML (4 of 5)
  XML to HTML (5 of 5)
  Calling <xsl:apply-templates/>
  Named Templates
  <xsl:for-each> Element
  Time for a Lab
  <xsl:if> Element
  <xsl:choose> Element
  <xsl:choose> Example
  Elements to Generate Output (XML to XML)
  <xsl:element> Element
  <xsl:attribute>
  XML to XML Example (1 of 2)
  XML to XML Example (2 of 2)
  Numbers, Sorting, and Functions
  Working with Numbering in XSLT
  <xsl:number> Element format Attribute Values
  <xsl:number> Example
  <xsl:sort> Element
  <xsl:sort> Attributes
  Sort Example
  XPath/XSLT Functions
  Other Elements
  Attribute Value Templates
  Attribute Value Templates Example
  XSLT Processors
  Xalan
  XSL Resources from IBM
  XSL References
  Checkpoint Questions
  Unit Summary
Appendix A. Introduction to Databases and XML
Appendix B. Additional Information for XML Schema
Appendix C. What's New in WebSphere Studio V5.1.1
Appendix D. Additional Information and Examples
Appendix E. Bibliography and References
Appendix F. Acronyms and Abbreviations
Appendix G. Glossary
Appendix H. Checkpoint Answers
Trademarks
The reader should recognize that the following terms, which appear in the content of this training document, are official trademarks of IBM or other companies: IBM is a registered trademark of International Business Machines Corporation. The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both: AFS AS/400 Database 2 DFS DRDA IMS Lotus NetRexx Open Blueprint RACF S/390 Tivoli Enterprise TME 10 VTAM AIX CICS DB2 Distributed Relational Database Architecture Encina Lotus Enterprise Integrator MQSeries Network Station OS/2 RDN SecureWay Tivoli Management Environment TXSeries WebSphere alphaWorks ClearCase DB2 Universal Database Domino Everyplace Lotus Notes MVS Notes OS/390 RS/6000 Tivoli TME VisualAge
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. SET and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC. Other company, product and service names may be trademarks or service marks of others.
Course Description
Introduction to XML and Related Technologies
Duration: 2.5 days
Purpose
This course provides an introduction to XML (eXtensible Markup Language) and related technologies. Students will gain the conceptual and practical knowledge required to work with XML. The course builds the basic skills that enable architects, designers, analysts, developers, testers, and administrators to use XML and its related technologies in the context of building e-business applications. The course is a 2.5-day classroom course with hands-on lab exercises that reinforce the lecture material.
Audience
This course is designed for information technology individuals, including enterprise application architects, designers, developers, and content modelers and creators.
Prerequisites
Knowledge of Internet technologies is required. Some experience with using HTML would be helpful, but is not necessary.
Objectives
After completing this course, you should be able to:
- Describe the important XML standards and recommend their use in business applications
- Define XML documents using namespaces, DTD, or Schema
- Develop and test XML processing applications
- Use XSLT to transform XML documents as necessary
- Identify open areas in XML, such as security, and emerging technologies such as DB support, XHTML, Web Services, XLink, and so forth. Plan for their incorporation into XML processing applications
- Identify where XML fits in application architectures
Agenda
Day 1
- Unit 1 - Introduction to XML and Related Technologies
- Unit 2 - Issues in Electronic Information Exchange
- Unit 3 - What Is XML?
- XML Basics Lab
- Unit 4 - WebSphere Studio Application Developer Overview
- Introduction to WebSphere Studio Application Developer Lab
- Unit 5 - Document Type Definition (DTD)
- DTD Lab
- Unit 6 - XML Namespaces
- XML Namespaces Lab
Day 2
- Unit 7 - XML Schema
- XML Schema Lab
- Unit 8 - XPath - XML Path Language
- XPath Lab
- Unit 9 - XSL - eXtensible Stylesheet Language Part 1
- XSLT Lab Part 1 - Simple Transforms
Day 3
- Unit 9 - XSL - eXtensible Stylesheet Language Part 2
- XSLT Lab Part 2 - Conditional Transforms
- Introduction to Databases and XML (Optional Unit)
Unit 1. Introduction to XML and Related Technologies
Introduction
XM301 Introduction to XML and Related Technologies Instructor:
Course Description
This course is designed to introduce students to the fundamentals of XML and its significant derivative companion technologies: XML Schema, Namespaces, XPath, and XSL Transformations. Document Type Definitions (DTDs) are also introduced. The focus of the course is on the creation, specification, and processing of XML documents. The course is 2.5 days in length and provides extensive hands-on labs throughout. It is expected that additional after-class work will be required to adequately understand the material we will introduce.
Audience
The course is targeted to Information Technology professionals involved in the exchange of information using XML as the data transport mechanism.
Prerequisites
Prerequisites: There are no specific prerequisites for this course. However, some familiarity with markup languages is recommended.
Course Objectives (1 of 2)
After completing this course, you should be able to:
- Describe/differentiate the use of HTML and XML
- Enumerate the rules of a well-formed XML document
- Create and maintain XML documents
- Describe the purpose and use of Document Type Definitions (DTDs)
- Create DTDs describing the validation rules for specific XML instances*
- Describe the purpose and use of XML Schema
- Enumerate the benefits of XML Schema over DTDs
- Create XML Schemas describing the validation rules for specific XML instances*
*...using IBM WebSphere Studio Application Developer
Course Objectives (2 of 2)
After completing this course, you should be able to:
- Describe the purpose of XML Namespaces
- Declare and use XML Namespaces in an XML document*
- Describe the use of XPath in the context of XSLT and XML Schema
- Create XPath expressions that locate specific information in an XML instance*
- Describe the use of XSL in the processing of XML documents
- Create an XSL Transformation to transform an XML document into some other instance*
*...using IBM WebSphere Studio Application Developer
Agenda - Day 1
- Welcome and Introductions
- Issues in Information Exchange
- What is XML?
- Lab Exercise
- Overview of IBM WebSphere Studio Application Developer
- Lab Exercise
- Document Type Definitions
- Lab Exercise
- XML Namespaces
- Lab Exercise
Agenda - Day 2
- XML Schema
- Lab Exercise
- XPath
- Lab Exercise
- XSL Transformation - Part 1
- Lab Exercise
Agenda - Day 3
- XSL Transformation - Part 2
- Lab Exercise
Unit Summary
We've looked at the overall course objectives and a day-by-day agenda. Let's get started.
Unit 2. Issues in Electronic Information Exchange
Unit Objectives
After completing this unit, you should be able to:
- Describe the types of information exchange that occur in modern computer systems
- Describe information exchange issues that exist in modern computer systems
- Describe what is needed to address many of the issues that exist in information exchange
[Figure: information exchange among Company 2, Company 3, ... Company n]
In this context, the biggest Information Exchange issues occur with the View
The way that applications communicate should not make assumptions about implementation technology or how information will be used
Context-free Communication
As much as possible, eliminate assumptions from the way in which information is exchanged. This means that the information that flows between applications should not be coupled to a particular technology or to an assumption about how it will be used.
- When possible, send an application domain entity (for example, a Purchase Order) rather than the individual pieces (for example, a total, an item description, and so forth).
- Don't use a message that is bound to an implementation technology, for example, a Serialized Java Object (a Java-specific bit stream).
- Ideally, the communication medium would be based on simple, ubiquitous technology, for example, straight text.
- The message should be structured and self-describing to eliminate the need for context awareness in the receiver.
This requires a structured information (text) format that supports the expression of semantics.
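As a rough illustration of the guidance above, the following sketch sends a whole Purchase Order as self-describing text instead of loose values. It uses Python's standard xml.etree.ElementTree module; the element names (purchaseOrder, item, description, price, total) and the sample data are hypothetical and do not come from the course material.

```python
import xml.etree.ElementTree as ET


def build_purchase_order(po_number, items):
    """Return a purchase order as self-describing XML text.

    The element and attribute names are illustrative only, not a standard.
    """
    order = ET.Element("purchaseOrder", {"number": po_number})
    total = 0.0
    for description, price in items:
        item = ET.SubElement(order, "item")
        ET.SubElement(item, "description").text = description
        ET.SubElement(item, "price").text = f"{price:.2f}"
        total += price
    ET.SubElement(order, "total").text = f"{total:.2f}"
    return ET.tostring(order, encoding="unicode")


# Sender side: one domain entity, expressed as plain text.
po_text = build_purchase_order("PO-1001", [("Polo shirt", 25.00), ("Cap", 10.50)])

# Receiver side: any XML-aware technology can read the same text.
received = ET.fromstring(po_text)
```

The receiver needs no knowledge of the sender's programming language or object model; the tag names carry the meaning of each value.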
[Figure: companies C1, C2, C3 ... Cn linked directly (Scenario 1) and through an intermediary M (Scenario 2)]
Scenario 1: Communicate directly with business partners; potential for 'n' communication protocols.
Scenario 2: Communicate with business partners through an intermediate 'Marketplace' vendor; forced to evolve at the rate of the intermediary.
[Figure: information exchange between System1 and System2]
Exchanging Messages
Exchanging messages between systems has a lot in common with exchanging messages between B2B business partners. The exception is that, although inter-system information exchange requires an established protocol (the interface), the system does not necessarily benefit from that protocol being an accepted standard for B2B communication.
There are other differences, for example, the likely use of Message-Oriented Middleware (MOM) in system integration, but this presentation is focused on the information being exchanged, not on the exchange mechanism.
A Common Solution?
Collecting all the observations together, an information solution addressing each of these issues would be:
a. A view-independent, structured information stream.
b. A structured information (text) format that supports the expression of semantics.
c. An implementation-independent, vendor-neutral markup language for describing information, enabling the creation of domain-specific business languages.
d. Self-describing, decoupled from view details.
In short: "A text-based, vendor-neutral markup language that supports the expression of semantics."
Such a language would address many information exchange issues if it gained significant industry acceptance.
Checkpoint Questions (1 of 2)
1. Which of the following will reduce the coupling related to Electronic Information Exchange? (Select all that apply)
   a. Create messages that are context-free.
   b. Use system interfaces to hide implementation details.
   c. Combine view information and data in each message.
   d. Use messages that are vendor-neutral and implementation-independent.
   e. All of the above.
2. Text-based messages are preferred because: (Select all that apply)
   a. They are implementation-neutral.
   b. All software technologies can read/write them.
   c. It's easier to debug messaging problems.
   d. They can be spell checked.
   e. All of the above.
Checkpoint Questions (2 of 2)
3. In general, the properties a message should exhibit are: (Select all that apply)
   a. Self-describing
   b. Predictable structure
   c. Conformance to one, industry-wide standard
   d. All of the above
Unit Summary
Having completed this unit, you should be able to:
- Describe the types of information exchange that occur in modern computer systems
- Describe the information exchange issues that exist in modern computer systems
- Describe what is needed to address many of the issues that exist in information exchange
Unit 3. What Is XML?
Unit Objectives
After completing this unit, you should be able to:
- Describe the basic rules of XML
- Describe what it means for an XML document to be well-formed
- List the components that make up an XML document
- Differentiate between XML and HTML
- Describe the internationalization support in XML
- Define some best practices for XML
Notes:
Although XML is stable and mature, the supporting technologies are evolving rapidly. Keep up with the changes at: http://www.w3.org/TR.
What Is XML?
At its core, XML is text formatted to follow a well-defined set of rules.
XML documents consist primarily of tags and text. If you've ever seen the source to an HTML document, then the XML structure should look familiar.
This text may be stored/represented in:
- A normal file stored on disk
- A message being sent over HTTP
- A character string in a programming language
- A CLOB (character large object) in a database
- Any other way textual data can be used
XML documents do not need to exist as documents; they may be:
- Byte streams sent between applications
- Fields in a database record
- Collections of XML Infoset information items
For simplicity they will be referred to as though they are documents and files.
Notes:
People usually talk about this 'XML' and that 'XML', or this 'XML file'; what they are really referring to is XML markup text encapsulating specific data. As long as the XML text follows the syntax rules, any data can be represented.
<?xml version="1.0"?>
<book>
  <author> Tom Wolfe </author>
  <title> The Right Stuff </title>
  <price> $6.00 </price>
</book>

ROOT
  <book>
    <author>  "Tom Wolfe"
    <title>   "The Right Stuff"
    <price>   "$6.00"
Notes:
This example shows a typical XML document and how it is represented as a tree of nodes. This conceptual depiction of XML is important to understand. book is the root element but ROOT is the highest point in the tree or hierarchy: think of ROOT as the location of a pointer used to keep track of where you are.
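The tree view can be reproduced with a small sketch. It assumes Python's standard xml.etree.ElementTree parser; the outline helper is a hypothetical name used only for this illustration, and the ROOT line is added by hand to mirror the diagram, since the parser itself returns the root element (book).

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<?xml version="1.0"?>
<book>
  <author>Tom Wolfe</author>
  <title>The Right Stuff</title>
  <price>$6.00</price>
</book>"""


def outline(element, depth=0):
    """Return indented lines describing an element, its text, and its children."""
    lines = ["  " * depth + f"<{element.tag}>"]
    text = (element.text or "").strip()
    if text:
        lines.append("  " * (depth + 1) + f'"{text}"')
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(BOOK_XML)           # root is the <book> element
tree_lines = ["ROOT"] + outline(root, 1)
print("\n".join(tree_lines))
```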
Notes:
Textual data between tags is also referred to as content. Tagged elements of any sort are also known as markup. Sometimes the term body is used to refer to anything between a start tag and an end tag.
Notes:
These definitions will be important when we discuss the XML Schema definition language in a later chapter. We introduce these terms here in preparation for their use then.
3. Elements must be properly nested underneath a parent tag (except for the single, root element):
A nested tag-pair may not overlap another tag.
There is no limit to the nesting level of child elements.
Notes:
As you can see, creating an XML instance will be a rather straightforward task.
5. Attributes, extra information that can be provided for elements, must be properly quoted:
That is, all attribute values must be in quotes.
6. The first line should/must contain the special tag that identifies the version of the XML specification to apply:
XML 1.0 is currently the most common.
Notes:
Version 1.1 is about to emerge. Many of the current XML instances lack this declaration. It is often useful to identify the processing instructions, of which the XML declaration is but one, as the prolog; the actual XML instance material, that between the root element open and closing tags, may then be referred to as the XML document.
Legal:
<?xml version="1.0"?>
<colors>
  <color>red</color>
  <color>green</color>
</colors>

Not legal:
<?xml version="1.0"?>
<color>red</color>
<color>green</color>
Notes:
XML is a markup language. Tags form the basis of all markup languages. The purpose of an element tag is to identify the data and child tags held within it. The root element should have a name that provides a good definition of all the data contained in the document. The first physical line in this sample is there because of Rule 6 (the XML declaration).
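The single-root rule can be checked mechanically. The sketch below assumes Python's standard xml.etree.ElementTree parser as the XML processor; is_well_formed is a hypothetical helper name, and it tests well-formedness only, not validity against a DTD or Schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as a well-formed document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One root element (<colors>) wrapping the two <color> children.
LEGAL = '<?xml version="1.0"?><colors><color>red</color><color>green</color></colors>'

# Two top-level elements: the second <color> is content after the root element.
NOT_LEGAL = '<?xml version="1.0"?><color>red</color><color>green</color>'

print(is_well_formed(LEGAL), is_well_formed(NOT_LEGAL))
```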
Notes:
The empty element notation (< ... />) is unique to XML; SGML was later amended to accommodate this syntax. Empty elements are practical and common when the only associated information is enclosed within the element's attributes. For empty-element tags, a space before the tag's terminator (" />") is optional in XML; it is often included for compatibility with older HTML browsers.
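A short sketch, assuming Python's standard xml.etree.ElementTree parser, shows that the empty-element notation (with or without a space before the terminator) and an explicit start/end tag pair with no content are read as the same element. The image element and its src attribute are hypothetical examples.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<image src="logo.gif"/>')
spaced_form = ET.fromstring('<image src="logo.gif" />')
long_form = ET.fromstring('<image src="logo.gif"></image>')


def summary(element):
    """Collect what the parser reports: name, attributes, text, child count."""
    return (element.tag, dict(element.attrib), element.text, len(element))


print(summary(short_form) == summary(spaced_form) == summary(long_form))
```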
Notes:
There is no limit to the depth of children in XML, but an overly large number may indicate a poor design. If an XML document does not have an associated DTD or Schema, then all whitespace is retained since a processor does not know if it is considered textual data or just for aesthetics. DTDs and Schemas are covered in later sections.
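The whitespace behavior described here can be observed directly. This sketch assumes Python's standard xml.etree.ElementTree, which is a non-validating parser and therefore hands every whitespace character to the application; the shirt and size elements are illustrative.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<shirt>\n  <size> large </size>\n</shirt>")
size = doc.find("size")

raw_text = size.text         # text exactly as written, including the spaces
between_tags = doc.text      # the newline and indentation before <size>
after_size = size.tail       # the newline between </size> and </shirt>
trimmed = size.text.strip()  # the application decides whether to trim
```

Without a DTD or Schema, it is the application, not the parser, that decides which whitespace is data and which is layout.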
Not legal:
<?xml version="1.0"?>
<shirt>
  <style>
    <size>large
    <color>red
    Polo
  </style>
  </size></color>
</shirt>
The element tags are mixed up and not ordered.
Best Practice: Use indentation to represent the document's hierarchy. Important if your document will likely be read by humans. Computers and programs don't usually care.
Notes:
Indentation and other whitespace is only for human readability, but it adds "fat" to a document's size and processing requirements. This is only an issue with huge XML documents. It is important to realize that an XML instance is treated by its processor/parser as one continuous stream of characters, some of which are recognized by the parser as "special." As a consequence, when the parser reports an error, its location is where the parser gave up, which may be far beyond where the actual error occurred.
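The point about error locations can be seen with the badly nested shirt example. The sketch assumes Python's standard xml.etree.ElementTree parser, whose ParseError carries a position attribute (line, column); first_error is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

BAD = """<?xml version="1.0"?>
<shirt>
  <style>
    <size>large
    <color>red
    Polo
  </style>
  </size></color>
</shirt>"""


def first_error(xml_text):
    """Return (line, column) where the parser gave up, or None if well-formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return err.position
    return None


error_line, error_column = first_error(BAD)
print(f"parser stopped on line {error_line}")
```

The parser stops at the </style> end tag, even though the first unclosed element, <size>, was opened three lines earlier.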
Notes:
Element names may not use a W3C reserved Namespace prefix, and may not begin with the letters "XML" in any combination of upper and lower case; such names are reserved. Avoid element names that match keywords of the XML specification, such as: DOCTYPE ELEMENT ATTLIST ENTITY
Colons (":"), while technically legal in tag names, should not be used as they are reserved for use with Namespaces.
Not Legal:
1name, -street, &name
<color> red </COLOR>
<SIZE> small <SiZe>
<f name> John </f name>
<xmlName> John </xmlName>
Comments:
Examples of legal and illegal element names.
Element names are case sensitive, and start and end tags must match.
Element names must not contain spaces.
Elements must not contain any W3C reserved words.
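These naming rules can be exercised against a parser. The sketch assumes Python's standard xml.etree.ElementTree; is_well_formed is a hypothetical helper, and the two legal names (name, first_name) are invented for contrast. Names beginning with "xml" are reserved, but a parser may still accept them, so that case is left out of the table.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as a well-formed document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Each sample maps to the result the naming rules predict.
SAMPLES = {
    "<name>John</name>": True,
    "<first_name>John</first_name>": True,
    "<1name>John</1name>": False,       # may not start with a digit
    "<-street>Main</-street>": False,   # may not start with a hyphen
    "<color>red</COLOR>": False,        # case of start and end tag differs
    "<SIZE>small<SiZe>": False,         # second tag opens a new element; nothing is closed
    "<f name>John</f name>": False,     # names may not contain spaces
}

results = {text: is_well_formed(text) for text in SAMPLES}
```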
Notes:
PCDATA is parsed character data. A "snippet" is a piece of a larger, legitimate XML file.
Notes:
XML differentiates between markup characters and text characters by providing special XML escape characters to be used in XML PCDATA. Only regular parsed character data is allowed inside an attribute's value. The special characters "<" and "&" must always be represented as escape characters. The others (">", the apostrophe, and the double quote) may appear non-escaped in some places in XML, but it is best to just use the escape characters all the time. These escape characters are independent of the encoding chosen.
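A brief sketch of escaping in practice, assuming Python's standard library: xml.sax.saxutils.escape replaces the markup characters in text, quoteattr prepares an attribute value, and xml.etree.ElementTree turns the escapes back into characters when parsing. The note and shirt elements and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

raw = "Profit & Loss: 5 < 6"
escaped = escape(raw)                        # '&' and '<' become entity references
round_trip = ET.fromstring(f"<note>{escaped}</note>").text

# quoteattr escapes the value and chooses a quote style that does not clash with it.
attr = quoteattr('size "L" & up')
style = ET.fromstring(f"<shirt style={attr}/>").get("style")

# All five predefined escapes are resolved by the parser.
five = ET.fromstring("<t>&lt;&gt;&amp;&apos;&quot;</t>").text
```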
Notes:
The 5 XML escape characters will not be interpreted (that is, changed to the non-escaped character) in CDATA sections, so they should not be used. If you put &lt; in the CDATA, you will see &lt; in the output, not "<". So use the actual characters. Encoding refers to the character set for the entire document, so it does apply to CDATA as well. CDATA sections cannot be nested. CDATA will retain spaces. While XML escape characters are not to be used in CDATA, you must be aware of how the 'down-line' applications of the XML will use the CDATA. Common usage: JavaScript in the XML and specialized HTML. Browsers may have problems with some special characters, which must then be represented in hex. Example: micro sign (µ) = &#xB5;
Example: non-breaking space = &#xA0;
Example: ampersand (&) = &#x26;
Link to special HTML characters: http://www.owlnet.rice.edu/~jwmitch/iso8859-1.html
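The difference between CDATA and escaped text can be sketched as follows, assuming Python's standard xml.etree.ElementTree parser. The scripts, script, and literal element names and the script text are hypothetical.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<scripts>"
    "<script><![CDATA[if (a < b && b > 0) { run(); }]]></script>"
    "<script>if (a &lt; b &amp;&amp; b &gt; 0) { run(); }</script>"
    "<literal><![CDATA[&lt;]]></literal>"
    "</scripts>"
)

# Both script elements deliver the same characters to the application.
cdata_form, escaped_form = (s.text for s in doc.findall("script"))

# Inside CDATA the escape is not interpreted: the text stays '&lt;'.
literal = doc.find("literal").text

# A hexadecimal character reference outside CDATA is resolved by the parser.
micro = ET.fromstring("<t>&#xB5;</t>").text
```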
Notes:
Both 'script' element examples are valid. Which one you would use would depend on the behavior of the application/browser which will use the transformed XML and therefore the CDATA. This topic is important to XSLT processing.
Notice the different usage of the attribute "type" in the two elements; semantically they are not the same. Attributes must have a value. Values must be quoted with either double or single quotes. Convention is to stick with one or the other.
Notes:
Attribute naming follows the same rules as element naming. An element may contain zero or more attributes within its start tag. Attributes provide extra information to the meaning of the element. This may include "key" information or other identifying details. Name collisions are common in XML as shown in the attributes of the first example. Using Namespaces resolves these sorts of issues. You cannot use the same style of quote inside the attribute value as the one that delimits it; that is, style="monty's" is valid, style='monty's' is invalid.
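The quoting rules can be tried out with a small Python sketch. The helper well_formed and the element name item are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

def well_formed(fragment):
    """Return True if the parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# A single quote inside a double-quoted value is fine.
mixed_quotes = well_formed("""<item style="monty's"/>""")

# The same style of quote inside the value ends the value early.
same_quotes = well_formed("""<item style='monty's'/>""")

# Attributes must have a value, and the value must be quoted.
no_value = well_formed("<item selected/>")
unquoted = well_formed("<item size=10/>")

print(mixed_quotes, same_quotes, no_value, unquoted)
```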
If this declaration is used, the version attribute is mandatory. The encoding attribute indicates the character encoding used in the document; if UTF-8 or UTF-16 is used it may be omitted. ASCII is a subset of UTF-8 and need not be declared. Comments are not allowed before this statement. The XML Declaration follows the syntax of a Processing Instruction or PI, which is described on a subsequent chart, but it is considered to be unique and is treated separately in the 1.0 XML specification. GENERAL NOTE OF CAUTION: You cannot always rely on a browser or tool to enforce the specifications completely and correctly. Nor are the specifications always written in language that, to a particular reader, is unambiguous. Still, the best advice is: when in doubt, refer to the specification, which for XML is www.w3.org/XML.
Notes:
All XML documents should begin with this declaration, and it MUST be at the first position of the file (that is, no blank lines, comments, or spaces before it). The current version of all XML documents is "1.0", and the version must appear within the "<?xml" declaration if the declaration is used. Note that the name is lowercase. It indicates the version of XML to which the Document Entity must conform. "standalone" is included here for completeness: it is automatically set to the correct value if it is not used; most users do not include it. We will have more to say on this in our discussions of the grammars we can apply to XML instances. A value of "yes" means the document that follows can stand alone; that is, without requiring a grammar document to complete its information.
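The placement rules for the XML Declaration can be checked with a short Python sketch. The helper parses and the element name book are illustrative assumptions; the parser is the one shipped with the Python standard library, and other parsers may word their errors differently.

```python
import xml.etree.ElementTree as ET

def parses(document):
    """Return True if the parser accepts the document bytes."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

# Declaration at the very first position: accepted.
first_position = parses(b'<?xml version="1.0" encoding="UTF-8"?><book/>')

# A space or a comment before the declaration: rejected.
leading_space = parses(b' <?xml version="1.0"?><book/>')
leading_comment = parses(b'<!-- c --><?xml version="1.0"?><book/>')

# The version attribute is mandatory when the declaration is used.
no_version = parses(b'<?xml encoding="UTF-8"?><book/>')

# The declaration itself is optional.
no_declaration = parses(b"<book/>")

print(first_position, leading_space, leading_comment, no_version, no_declaration)
```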
Notes:
The last point may be problematic if, say, the associated DTD file is not readily available for inspection. You will see in later sections that we can override the attribute values in our XML instance from within a DTD or XML Schema file. This may not appear to be a problem at the outset, but over time we may forget that we are overriding some values. As XML instances grow in length and complexity this may become a serious source of confusion. A best practice is to design the XML instance data to contain ALL the data so that, from an internal data perspective, it does stand alone.
Comments
<!-- --> Defines a comment.
A space after the beginning and before the trailing hyphens is recommended but not required.
<?xml version="1.0"?>
<!-- This is a comment. They can go anywhere inside an XML document except within an element tag. -->
<book>
  <chapter>A is the first letter</chapter>
  <!-- Here is another comment. -->
  <chapter>Z is the last letter</chapter>
</book>
Improper usage:
<chapter <!-- comment -->>Some text.</chapter>
...or before the XML Declaration statement.
Notes:
Comments can go anywhere in the XML except:
Before the XML Declaration
Inside the actual element tags
Comments are a good thing. Use them just as you would in a program.
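A quick way to see where comments are accepted is to feed a few fragments to a parser. This is a minimal Python sketch; the helper parses is an assumption made for the illustration.

```python
import xml.etree.ElementTree as ET

def parses(document):
    """Return True if the parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

# Between elements: accepted.
between_elements = parses("<book><!-- ok --><chapter>A</chapter></book>")

# After the XML Declaration and before the root element: accepted.
before_root = parses(b'<?xml version="1.0"?><!-- ok --><book/>')

# Inside an element tag: rejected.
inside_tag = parses("<book><chapter <!-- bad -->>text</chapter></book>")

print(between_elements, before_root, inside_tag)
```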
Notes:
A good way to test that the encoding is correct is by viewing the XML file in IE 5.0 or later. There are two error messages you may receive from IE or from a parser: 1. An invalid character was found in text content. You will get this error message if a character in the document does not match the encoding attribute. 2. Switch from current encoding to specified encoding not supported. You will get this error message if there is a disconnect between the encoding used when saving the file and the encoding specified in the declaration. The common problem is that the file has been saved in a single-byte encoding while the encoding attribute specifies a double-byte encoding, or vice versa.
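The first kind of mismatch can be reproduced without a browser. The following Python sketch saves the same text as ISO-8859-1 bytes twice, once with a matching declaration and once declaring UTF-8. The element name and sample text are made up, and the error text reported by this parser differs from the IE wording quoted above.

```python
import xml.etree.ElementTree as ET

template = '<?xml version="1.0" encoding="{enc}"?><name>Jos\u00e9</name>'

# Declared encoding matches the bytes actually saved.
matching = template.format(enc="ISO-8859-1").encode("iso-8859-1")

# Declared UTF-8, but the bytes were saved as ISO-8859-1.
mismatched = template.format(enc="UTF-8").encode("iso-8859-1")

name = ET.fromstring(matching).text

try:
    ET.fromstring(mismatched)
    mismatch_detected = False
except ET.ParseError:
    mismatch_detected = True

print(name, mismatch_detected)
```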
Processing Instruction
Syntax <? target arg*?> Processing Instruction is often abbreviated as PI in documentation. A feature inherited from SGML. Used to embed application-specific instructions in documents. The target name immediately follows "<?" and is used to associate the PI with an application. May include zero or more arguments. May be preceded by comments.
For example, <?xml-stylesheet href="common.css" type="text/css"?>, which is a generally available stylesheet for simple formatting.
Notes:
If a comment is inserted between the XML Declaration and a PI such as the one shown, Studio will not consider it an error. A demo file is available in the XM301 Lectures folder, Unit 3. This PI, although useful, does NOT define a grammar for the XML document in which it is used: we will talk about grammars in subsequent chapters. To reemphasize: the XML Declaration, while it may look like a PI, is treated as special!
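To see how a parser reports a PI, here is a minimal Python sketch using the standard library DOM. The root element book is an assumption; the stylesheet PI is the one from the slide.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<?xml version="1.0"?>'
    "<!-- a comment before the PI is allowed -->"
    '<?xml-stylesheet href="common.css" type="text/css"?>'
    "<book/>"
)

# Collect the processing instructions found at document level.
pis = [
    node
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

target = pis[0].target  # the name right after "<?"
data = pis[0].data      # everything after the target

print(len(pis), target, data)
```

Only one PI is reported: this parser does not hand the XML Declaration to the application as a PI, which matches the point that the declaration is treated as special.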
Notes:
All XML parsers must check that XML documents are well-formed. XML parsers are classified as either validating or non-validating.
<course> <name>Java Programming</name> <department>EECS</department> <teacher> <name>Paul Thompson</name> </teacher> <student> <name>Ron Jones</name> </student> <student> <name>Uma Abingdon</name> </student> <student> <name>Lindsay Garmon</name> </student> </course>
Notes:
All markup tags in HTML are directed at visual composition. No consideration is given to the actual semantics of the data. XML markup tags are based solely on the data content. Clean separation of data and presentation
XML
<?xml version="1.0"?>
<course>
  <name>Java Programming</name>
  <department>EECS</department>
  <teacher>
    <name>Paul Thompson</name>
  </teacher>
  <student>
    <name>Ron Jones</name>
  </student>
  <student>
    <name>Uma Abingdon</name>
  </student>
  <student>
    <name>Lindsay Garmon</name>
  </student>
</course>
Notes:
These two source listings really show fundamental differences between HTML and XML. While both contain text marked up by tags, their meaning is entirely different. Which would you rather parse and insert into a database?
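To make the question concrete, here is a minimal Python sketch that turns the course listing into rows that could be inserted into a database table. The row layout (course, teacher, student) is an assumption made for this illustration, not part of the course material.

```python
import xml.etree.ElementTree as ET

COURSE = """<?xml version="1.0"?>
<course>
  <name>Java Programming</name>
  <department>EECS</department>
  <teacher><name>Paul Thompson</name></teacher>
  <student><name>Ron Jones</name></student>
  <student><name>Uma Abingdon</name></student>
  <student><name>Lindsay Garmon</name></student>
</course>"""

root = ET.fromstring(COURSE)

# The tags describe the data, so each value can be addressed by name.
course_name = root.findtext("name")
teacher = root.findtext("teacher/name")
students = [s.findtext("name") for s in root.findall("student")]

# One row per enrolled student.
rows = [(course_name, teacher, student) for student in students]

for row in rows:
    print(row)
```

Doing the same with presentation-oriented HTML would require guessing which heading or table cell holds which piece of data.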
XML
Defines its own tags to identify data.
Requires matching end tags: <name>test</name>
XML Parsers will generate a fatal exception for well-formedness violations.
Provides for empty elements: <device type="radio" /> is valid
Is case sensitive.
HTML
Browsers will almost always do a "best guess" on ill-formed HTML.
Does not support empty elements, but allows single start tags: <br> and <hr>
Is not case sensitive: <TABLE> ... </table>
Notes:
HTML has a fixed tag set. In XML there is no predefined tag set. The allowed tags in an XML document are defined in its DTD or Schema. XHTML is an effort to correct the sins of HTML's past. It is a new XML technology that consists of an HTML specific DTD that defines the valid HTML tags. Unfortunately, many of today's browsers will not recognize XHTML documents properly!
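The XML side of the comparison can be checked with a short Python sketch. The helper well_formed is an assumption made for this illustration.

```python
import xml.etree.ElementTree as ET

def well_formed(fragment):
    """Return True if the parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# XML is case sensitive: <TABLE> is not closed by </table>.
mixed_case = well_formed("<TABLE><tr/></table>")

# A single start tag in the HTML style is not allowed in XML.
single_start_tag = well_formed("<p>one<br>two</p>")

# The empty-element form and the start/end pair are equivalent.
short_form = ET.fromstring('<device type="radio" />')
long_form = ET.fromstring('<device type="radio"></device>')
equivalent = ET.tostring(short_form) == ET.tostring(long_form)

print(mixed_case, single_start_tag, equivalent)
```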
Checkpoint Questions (1 of 3)
1. Basic XML can be described as:
A. A hierarchical structure of tagged elements, attributes and text.
B. All the HTML tags plus a set of new XML only tags.
C. Object-oriented structure of rows and columns.
D. Processing instructions (PIs) for text data.
E. Textual data with tags for visual presentation.
2. Which of these XML fragments is not well-formed?
A. <root><class>XML</class></root>
B. <class><root>XML</root></class>
C. <root><class id="XML"></root>
D. <root>XML<class id="XML"/>XML</root>
E. <root class="XML"><class id="root"/>XML</root>
Checkpoint Questions (2 of 3)
3. XML Comments are allowed (Select all that apply):
A. Before the XML Declaration
B. Anywhere
C. Between element tags
D. Before the root element
E. All of the Above
4. Which of these XML elements with attributes is not well-formed?
A. <name first='Tony' LAST="Romeo" />
B. <name name="Tony" NAME="ROMEO" />
C. <_name_ first-name="Tony" last-name="Romeo"/>
D. <name="Tony Romeo" />
E. <name name="first='Tony' last='Romeo'" />
F. All of the Above
Checkpoint Questions (3 of 3)
5. Which of these comments regarding HTML and XML is not true?
A. HTML markup is focused on presentation.
B. XML markup is based on defining the data.
C. XML is based on HTML.
D. HTML tags are not case sensitive.
E. XML tags are case sensitive.
F. Both XML and HTML support attributes.
Unit Summary
Having completed this unit, you should be able to:
Describe the basic rules of XML
Describe what it means for an XML document to be well-formed
List the components that make up an XML document
Describe the differences between XML and HTML
Describe the internationalization support in XML
Describe some best practices in XML
Notes:
The status of various XML technologies (W3C Activities) can be found at: http://www.w3.org/TR.
References
WebSphere Studio Application Developer Help Perspective
Unit Objectives
After completing this unit, you should be able to:
Describe role-based development
Describe the WebSphere Studio family of tools
State the role of WebSphere Studio Workbench in the WebSphere Studio tools
Describe basic features of WebSphere Studio Application Developer
Describe the major sets of tooling provided by WebSphere Studio Application Developer
Roles-based Development
Developing Web Applications requires more than just writing Java code
Role: Workarea
Enterprise Integrator: Connection Data
Bean Provider: Business Logic Data
Application Assembler: Application Flow
Page Producer: Page Layout and Content
Web Master: Operational Environment
(Diagram labels: Products, Tool. Products shown: JavaBeans, EJBs.)
Notes:
There are four distinct development roles shown here: Enterprise Integrator Bean Provider Application Assembler Page Producer Tooling needs to support each of these roles and permit easy management and integration of the developed assets.
Provide a role-based development model where the assets are the focus, not the tool Provide a common repository solution for all assets and tools Provide rapid support for new standards and technologies
For example, Web Services
Notes:
The development environment should support the tasks performed by the developers. It should be configurable and customizable for each individual developer. Tools need to accommodate the rapid change in available technologies.
Notes:
The IBM WebSphere Studio family is applied to a development platform (as opposed to a set of development tools).
Family Contents
WebSphere Studio Products (V5): WebSphere Studio Application Developer (includes all of Site Developer functionality)
Focused on development of Web Services, JSPs, Servlets, XML and J2EE and database applications in a team environment
Notes:
The flagship products in the WebSphere Studio brand (Version 5) are: WebSphere Studio Application Developer WebSphere Studio Enterprise Developer
Notes:
The Workbench is not a tool, that is, it is not in itself a product that is for sale. It is an open and portable tool platform providing an integration technology. The Workbench can be thought of as a set of Java frameworks and a set of development tools geared for tool builders.
Easy construction and deployment platform for tools Open access to source code and tool provider community
Notes:
The Workbench offers its greatest support for tool builders, making it easy to add plug-ins (tools) to the overall IDE. This allows quick "time-to-market" of tools supporting emerging technologies. The underlying framework, which adds to the tool builder's productivity, gives end users a common look and feel.
Start the WebSphere Studio Application Developer Start -> Programs -> IBM WebSphere Studio -> Application Developer 5.1 Workbench opens when you launch Application Developer Within the workbench -- open the perspectives, views, and editors
Terminology
Shortcut Bar
Editor
Outline Pane
Task Sheet
Views
Notes:
The workbench window displays one or more perspectives that contain views and editors. You can quickly switch between perspectives and views using the shortcut buttons which appear on the shortcut bar.
Perspectives
A group of related views and editors To open a Perspective: Select via Window -> Open Perspective Some Perspectives: Java: to develop and test Java programs Server: to configure, run, and manage test servers Debug: to control debug flow, see variables, and so forth
Views
A view displays specialized information. For example: Bookmarks view displays all bookmarks in workbench. A view might appear alone in a single pane, or several views might be stacked within a single tabbed pane. Views can be undocked/docked from the main workbench window. Information updates on a view are saved immediately. View toolbars apply only to the particular view in which they appear.
Notes:
Views support editors and provide alternative presentations or navigation of the information in your workbench. For example, the Navigator displays projects and other resources you are working with. A view might appear by itself, or stacked with other views in a tabbed notebook. On Windows platforms, views can be undocked from the main workbench window and appear as floating windows on the desktop. Undocked views can also be docked back into the main workbench window. More info on the Application Developer menu: Help --> Navigating Workbench
Editors
An editor is used to edit or browse a resource. Modifications made in the editor follow an open-save-close life cycle. An editor can contribute to the Workbench menu bar. Examples: Java Source Editor Web Deployment Descriptor Editor Web Site Configuration Editor JSP Editor WSDL Editor
Notes:
The key thing to note about editors is the Open-save-close life cycle. You must explicitly save the corresponding resource after making changes.
Online Help
To learn more about the Workbench:
Select Help -> Help Contents
Select Application Developer information
Select Getting Started
Select Workbench Fundamentals and Tutorial: Workbench Basics
Notes:
Tips : F1, F1 : info pop on a selected task To hide the navigation frame, click the Hide Navigation button on the Help view's toolbar. Note: Your product may include more than one information set (a collection of documentation topics). When you run a search, only the current information set is searched. The current information set is shown in the drop-down list at the top of the Help view. To search another information set, select it from the list, and run the search again.
Cheat Sheets
Guide developer through an application development process Sequence of documented steps with relevant documentation Displayed in workbench pane Task-related tools are automatically launched or have launch icons in cheat sheet Launched via Help -> Cheat Sheets
Notes:
Reduced learning curve through the consolidation of tooling to one platform. For example, with customizable perspectives, one could customize Application Developer to look similar to other Java IDEs.
Tooling
Java IDE J2EE Tooling Portlet Tooling Data Tooling Web Tooling XML Tooling Performance / Trace Tooling Team Development Tooling Web Services Tooling
Java IDE (1 of 3)
Ships with SDK 1.3 Pluggable JRE Support Defined at project and workbench level Hot Method Replace Dynamically replace Java classes during debug Enabled when Application Server V5 runs in debug mode Java Snippet Support (Scrapbook) Task Sheet (All Problems Page) Code Assist Refactoring Support Rename/move support for method/class/package Fix all dependencies for renamed element With and without preview
Notes:
A default JRE can be selected for the Workbench with Windows-> Preferences. Project specific JRE is selected in the Launch Configuration Dialog. For more on hot method replace, refer to the foil at the end of the unit.
Java IDE (2 of 3)
Faster IDE Smart Compilation No lengthy compile/build/run steps Pluggable Framework, in-place tool launching Running class/code with errors Precise reference searching Text and Java-based JDI-based debugger for local/remote debugging Run code with errors Multiple test environments can be configured J2EE WAR/EAR Deployment
Notes:
JDI: Java Debugging Interface. The JDI is a high-level Java API providing information useful for debuggers and similar systems needing access to the running state of a Java virtual machine.
Java IDE (3 of 3)
UML Class Diagram Editing and Visualization Support for Java classes and EJB components Diagrams generated from existing classes/components New diagrams built and used to develop corresponding component Typical Class Diagram Editor operations: Create classes, packages, and interfaces Create extends and implements relationships Create methods and fields Refactor components Add EJB relationships Add EJBQL queries Add CMP fields to a primary key UML Class Diagrams can be exported
Notes:
Starting with V5.1, Application Developer adds support for UML visualization. You can select existing components and have the system generate the UML diagrams, or you can start with a blank diagram and develop components from the diagram, or use a combination of the two approaches. These features let developers understand existing components better by producing UML that represents the existing components, and also assist them in generating components based on the UML diagrams. The entire class diagram or portions may be exported in bmp, jpg, or gif image formats.
J2EE Tooling (1 of 2)
J2EE 1.3 EJB 2.0 Support Servlet 2.3, JSP 1.2 Support J2EE Perspective provides views and editors for EJB/Servlet/JSP Developer Object-relational Mapping for EJBs Top-down/Bottom-up/Meet-in-the-middle All metadata exposed as XMI No hidden metadata EAR and WEB Deployment Descriptor Editors Forms-based (no need to directly edit XML) Source view also available Struts Support Web Diagram visual editor for application design
J2EE Tooling (2 of 2)
Connector Projects J2EE Connector Architecture (JCA) based EJB Test Client Universal Test Client HTML-based J2EE programming model Built-in JNDI registry Browser Unit Test Environment for J2EE WebSphere Application Server V4 or V5 and Apache Tomcat Create multiple projects with different Server configurations/instances
Allows for versioning of unit test environment Share Unit Test Environment Configuration across developers
Notes:
WebSphere Studio provides a Web-based Universal Test Client where you can test your Enterprise JavaBeans (EJBs) and other objects. Using this test client, you can test the home and remote interface methods of your enterprise beans. By calling the methods and passing user-defined arguments you can test methods to ensure that they work correctly.
Portlet Tooling
Wizards to create Portlet Application Management of Deployment Descriptors web.xml portlet.xml Multiple portlets per application Integrated development and test environment Full use of debugger Test on remote server or integrated unit test environment Export deployable WAR file
Notes:
There are actually two related plug-ins. The first, WebSphere Portal Toolkit, ships with all offerings of WebSphere Portal V4.x. The second, WebSphere Everyplace Toolkit, ships with WebSphere Everyplace Server. The test environment interacts with a developer configuration of WebSphere Portal Server running on WebSphere Application Server Advanced Single Server Edition (AEs). This is facilitated by the Remote WebSphere Server configuration.
Data Tooling (1 of 2)
Data Perspective Provides views geared for DBAs to:
Create Databases Create Tables/Views/Indexes/Keys Generate DDL Connect to and view existing relational database objects
Data Tooling (2 of 2)
DB2 Stored Procedures Create / Build and Register/ Debug / Drop a stored procedure or User Defined Function (UDF) SQL or Java-based SQLJ Files Create / Build / Debug SQLJ Workbench runs SQLJ translator and builds Java files
Web Tooling (1 of 2)
Web Site Designer Provides site-level views of Web project Graphical and detail tabular views of site structure Page Designer Provides page-level view of Web project components HTML and JSP editing WYSIWYG page design, source editing and page preview Choice of static or dynamic Web project Appropriate tool support loaded at project creation time Palette View Provides drawers of useful items for HTML and JSP creation Items are dragged and dropped onto page editor
Notes:
The Web Site Designer is new with 5.1. The configuration of the entire Web site is maintained in the Web Site Configuration object. The choice of static or dynamic web sites and the Palette view are also newly introduced in release 5.1. Examples of the drawer labels in the Palette view are: HTML, Free Layout, JSP, Java Server pages, and Site Parts. The Site Parts include items such as Vertical and Horizontal Navigation Bars, which help to maintain consistency in the look and feel of pages across the site.
Web Tooling (2 of 2)
Multiple markup types (WML, cHTML) and pervasive device support Built-in Servlet, Database, and JavaBean Wizards Built-in JSP Debugging Site Style Sheet and Page Template Support Links View View HTML/JSP and all links referenced in page Parsing and link management updates links when resources are renamed or moved Jakarta JSP Taglibs Specify in project Properties or NewProject to include Available: Standard Tag Library (JSTL), accessing JSP objects, database access, internationalization, utilities
XML Tooling (1 of 3)
XML Tooling provides integrated tools/perspectives to create XML based components: XML Source Editor
DTD/Schema validation Code Assist for building XML documents
DTD Editor
Visual tooling for working with DTDs Create DTDs from existing documents Generate an XML Schema from a DTD Generate JavaBeans for creating/manipulating XML documents Generate an HTML form from a DTD
XML Tooling (2 of 3)
XSL Editor Edit/create and validate XSL XSL Debug and Transformation Tool Trace XSL transformation Examine relationships between the result node, the template rule, and the source node XML to/from Relational Databases Generate XML, XSL, XSD from an SQL Query RDB/XML Mapping Editor Map columns in a table to elements and attributes in an XML document Generate a Database Access Definition (DAD) script to compose/decompose XML documents to/from a database DAD is used with DB2 XML Extender
XML Tooling (3 of 3)
XPath Expressions Wizard Create XPath expressions XML to XML Mapping Editor Map one or more source XML files to a single target
Notes:
XPath expressions can be used to search through XML documents, extracting information from the nodes (such as an element or attribute).
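Outside the wizard, the idea can be tried with a few lines of Python. This is only a sketch: the standard library supports a limited subset of XPath, and the id attributes on the student elements are hypothetical additions to the course example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<course>"
    "<name>Java Programming</name>"
    "<teacher><name>Paul Thompson</name></teacher>"
    '<student id="s1"><name>Ron Jones</name></student>'
    '<student id="s2"><name>Uma Abingdon</name></student>'
    "</course>"
)

# Names of students only (child path from the root element).
student_names = [n.text for n in root.findall("./student/name")]

# Every name element at any depth, in document order.
all_names = [n.text for n in root.findall(".//name")]

# Select an element by the value of one of its attributes.
by_id = root.find("./student[@id='s2']/name").text

print(student_names)
print(all_names)
print(by_id)
```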
Performance/Trace Tooling
Built-in tooling helps developer isolate and fix performance problems with their Web application Profiling and Logging Perspective allows developers to: Attach to local/remote agents for capturing performance data JVM Monitoring
Heap Stack Class/Method details Object References
Resource Monitors
Execution patterns CPU usage Disk usage
Team Development
Workbench integration occurs through a pluggable, adapter-based design: A published framework API allows any SCM provider to add an adapter to integrate their SCM into the Workbench Application Developer ships with CVS Plugin ClearCase LT Plugin
Create / Transform
Create new Web Services from JavaBeans, databases
Build
Wrap existing artifacts such as SOAP and HTTP GET/POST accessible services Generate Java client proxy to Web Services
Test
Built-in test client allows for immediate testing of local and remote Web Services
Publish
Publish Web Services to a UDDI Registry
Standards Support
EJB 2.0
J2EE 1.2 and 1.3
Servlet 2.3
JSP 1.2
JRE 1.3
Web Services Description Language (WSDL) 1.1
Web Services Interoperability (WS-I) Basic Profile 1.0
Apache SOAP 2.3
XML DTD 1.0 10/2000 Revision
XML Namespaces 1/99 Version
XML Schema 5/2001 Version
HTML 4.01 (other levels should work)
CSS2 (PageDesigner displays a subset)
Review
Name some of the roles in Web application development. What is the name of the Application Developer perspective you would usually use for EJB development? Compare and contrast View, Editor, and Perspective. Name the SCM tools that ship with Application Developer.
Unit Summary
Having completed this unit, you should be able to see: The concept of Role-Based Development The WebSphere Studio Family The WebSphere Studio Workbench in the context of WebSphere Studio products Basic features of WebSphere Studio Application Developer Major tooling sets provided by WebSphere Studio Application Developer
Unit Objectives
After completing this unit, you should be able to:
Describe the reasons for using DTDs
Define well-formed versus valid documents
Define the grammar rules for an XML document using DTDs
Describe the difference between non-validating and validating processors
Describe examples of DTDs being used in business
Describe best practices used in DTDs
Define the limitations of DTDs
Describe the status of the DTD in the industry
Notes:
This is a quick review of the important rules for XML well-formedness. It's important to recognize that the well-formedness rules are very simple.
Can only have two specific children (greeting, farewell). The greeting child must precede the farewell child. Message may have an optional urgent attribute. What if we want to define and publish the structure an XML document is to conform to? What if we want the computer to be able to verify that an XML document meets these kinds of constraints? What if we want to have reusable pieces of text between two XML documents?
Notes:
The difficulty with well-formedness is that the rules are very simple. Quite often we want to express more complicated constraints such as:
The element <message> can only have two children, <greeting> and <farewell>, and the two children must appear in that order.
The element <message> may have an optional urgent attribute.
What if we want the computer to be able to verify that an XML document meets these kinds of constraints? What if we want to have reusable pieces of text between two XML documents?
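Without a grammar, every application would have to hand-code checks like the ones below. This Python sketch is an illustration only: the function check_message is hypothetical, and it reads the first constraint as "exactly one <greeting> followed by exactly one <farewell>".

```python
import xml.etree.ElementTree as ET

def check_message(xml_text):
    """Return a list of constraint violations (empty means none found)."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "message":
        problems.append("root element must be <message>")
    # Children must be <greeting> then <farewell>, in that order.
    if [child.tag for child in root] != ["greeting", "farewell"]:
        problems.append("children must be <greeting> then <farewell>")
    # The only attribute allowed is the optional "urgent".
    extra = set(root.attrib) - {"urgent"}
    if extra:
        problems.append("unexpected attributes: " + ", ".join(sorted(extra)))
    return problems

good = check_message(
    '<message urgent="yes"><greeting>Hi</greeting><farewell>Bye</farewell></message>'
)
swapped = check_message(
    "<message><farewell>Bye</farewell><greeting>Hi</greeting></message>"
)

print(good)
print(swapped)
```

Both documents are well-formed; only the hand-written checks tell them apart. A DTD lets us state the same rules once and have a validating parser enforce them.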
What Is a DTD?
Blueprint of a document's structure.
Contains a series of declarations.
DTDs:
  Can be a separate file from the XML document.
  Can be embedded within the XML file.
  Can be split between a separate file and the XML file.
DTDs define:
  The elements that can or must appear.
  How often the elements can appear.
  How the elements can be nested.
  Allowable, required and default attributes.
But note: the use of DTDs is optional.
An XML document that obeys the rules in a DTD is said to be valid.
Notes:
A Document Type Definition is essentially the framework or skeleton of an XML document. It defines which elements are allowed, which attributes are allowed for each element, and whether such elements or attributes are required or optional. XML Schemas (often referred to as Schemas) extend the functionality of the DTD by adding data typing and other enhancements. An XML document that conforms to its specified DTD or XML Schema is said to be valid. The DTD can be a separate file or it can also be embedded in the XML file. In fact, the DTD contents can be split across an external file and the XML file.
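The difference between well-formed and valid shows up as soon as a DTD is embedded in a document. In this Python sketch the DTD is a guess at how the message constraints might be declared, and the standard library parser is used as an example of a non-validating parser: it reads the embedded DTD but does not enforce it.

```python
import xml.etree.ElementTree as ET

# Well-formed, but NOT valid: <greeting> is missing.
DOC = """<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (greeting, farewell)>
  <!ELEMENT greeting (#PCDATA)>
  <!ELEMENT farewell (#PCDATA)>
]>
<message><farewell>Bye</farewell></message>"""

# A non-validating parser accepts the document anyway.
root = ET.fromstring(DOC)
children = [child.tag for child in root]

print(root.tag, children)
```

A validating parser given the same input would be expected to report that the content of <message> does not match its declaration.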
Notes:
Similar material can also be found in the WSAD IE 5.1 help file for DTD. This page and the next list the elements you may use in a DTD file.
Notes:
Here's a simple example of an XML document on the left, and the DTD rules that describe it on the right. We're not going to go into the details of the rules right here -- that's what the rest of this unit is about. We just wanted you to have an idea of how an XML file and its related DTD might look.
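The slide itself is not reproduced in these notes, so the following is only a sketch of how such a document and its DTD might look together. It reuses the message, greeting and farewell elements and the optional urgent attribute described earlier in this unit; the yes/no value list for urgent and the text content are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
  <!-- message: exactly one greeting followed by exactly one farewell -->
  <!ELEMENT message (greeting, farewell)>
  <!ELEMENT greeting (#PCDATA)>
  <!ELEMENT farewell (#PCDATA)>
  <!-- urgent is optional; the yes/no value list is an assumption -->
  <!ATTLIST message urgent (yes | no) #IMPLIED>
]>
<message urgent="yes">
  <greeting>Hello, world!</greeting>
  <farewell>Goodbye, world!</farewell>
</message>
```

Here the DTD is embedded in the document (the internal subset) so that the sketch stays in one file; the same declarations could equally live in a separate file.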
Notes:
Here's our introduction to declaring elements. An element declaration begins with <!ELEMENT, followed by the name of the element being declared and then the content model for the element. Here's a sample declaration for an element called greeting that accepts #PCDATA (text), along with two <greeting> elements that are valid according to this declaration. The second <greeting> element uses a CDATA section to quote its contents.
Remember, element names must start with a letter or underscore. Names beginning with the letters xml (in any combination of case) are reserved by the W3C; prefixes such as xsl, xsi and xsd are conventionally associated with W3C namespaces and are best avoided as well. The colon character is also reserved (see the Namespaces unit). Letters, digits, periods, hyphens and underscores may follow the first character (while technically legal, an underscore-period combination is not recommended).
#PCDATA (parsed character data) indicates that only text and entity references can be included in the element. This data will be examined by the parser for entities and markup. Parsed character data cannot contain the characters "&" or "<" literally; these need to be represented by their respective entities, and ">" is usually escaped too (refer to the slide Built-in Entities).
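As a sketch of the declaration described above (the slide content is not reproduced here, so the greeting text is an assumption), a complete document with a #PCDATA element and a CDATA section could look like this:

```xml
<?xml version="1.0"?>
<!DOCTYPE greeting [
  <!-- greeting holds text only (parsed character data) -->
  <!ELEMENT greeting (#PCDATA)>
]>
<!-- Plain text such as <greeting>Hello, world!</greeting> is also valid. -->
<!-- The CDATA section lets & and < appear without entity references. -->
<greeting><![CDATA[Hello & welcome to <XML>!]]></greeting>
```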
Copyright IBM Corp. 2001, 2004 Unit 5. Document Type Definition (DTD) 5-9
Notes:
The content is the stuff in between the element's start and end tag. There are four types of content models in XML 1.0 DTDs:
EMPTY
ANY
Element only - this includes child elements
Mixed - this includes child elements and text
Notes:
The EMPTY content model is used for an element that will have no content whatsoever. Note that such an element may have as many attributes as it likes. To specify the EMPTY content model, provide the word EMPTY for the content model. The two examples on the foil show two elements that are valid with an empty content model. Empty elements are not much use unless they have attributes. We'll learn more about declaring attributes in a bit. An EMPTY element can be very useful for testing snippets of XML. There is an example of this later in this chapter.
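A minimal sketch of the EMPTY content model follows. The element name line-break and its count attribute are hypothetical, chosen only for illustration; they are not taken from the foil.

```xml
<?xml version="1.0"?>
<!DOCTYPE line-break [
  <!-- No content is allowed between the start and end tag -->
  <!ELEMENT line-break EMPTY>
  <!-- Attributes are still permitted on an EMPTY element -->
  <!ATTLIST line-break count CDATA #IMPLIED>
]>
<line-break count="2"/>
```

Writing the element as <line-break count="2"></line-break> would be equally valid, as long as nothing, not even whitespace, appears between the tags.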
Notes:
Contrary to what you might expect, the ANY content model does not allow you to put anything you like between the start and end tag. When you use the ANY content model, whatever you supply must be well-formed XML if it contains markup. Moreover, the elements that you use must be declared in the DTD as well. So for the third example on the foil, the <galaxy> element must be declared in the DTD for the document. To specify the ANY content model, provide the word ANY for the content model.
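The point about declaring child elements can be sketched as follows. Only the <galaxy> element comes from the notes; the parent element name universe and the text are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE universe [
  <!-- universe may contain text and any declared elements, in any order -->
  <!ELEMENT universe ANY>
  <!-- galaxy must be declared, otherwise the document is not valid -->
  <!ELEMENT galaxy (#PCDATA)>
]>
<universe>Our home is the <galaxy>Milky Way</galaxy>.</universe>
```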
Notes:
If the content of an element consists solely of child elements, the element is said to have element content. The element content model is specified by content model particles that are combinations of either element names or other content model particles. The table describes the operators that can be used to form these combinations. In the table, a or b can be either content particles or element names. To create the content model of a followed by b, use the comma (,). To create the content model of a or b, use the vertical bar (|). To repeat a content particle at least once, use the (+). To repeat a content particle zero or more times, use the (*). To allow a content particle to be absent or present exactly once, use the (?).
Valid XML:
<person> <lname>Smith</lname> <fname>John</fname> </person>
Notes:
The first example specifies that <person> has a content model that accepts an <fname> followed by an <lname> or an <lname> followed by an <fname>. The matches show all the possible permutations.
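The declaration from the foil is not reproduced in these notes; one way to write a content model with the behavior described (a choice between two sequences) is sketched below, using the element names from the valid XML shown above.

```xml
<?xml version="1.0"?>
<!DOCTYPE person [
  <!-- Either fname then lname, or lname then fname -->
  <!ELEMENT person ((fname, lname) | (lname, fname))>
  <!ELEMENT fname (#PCDATA)>
  <!ELEMENT lname (#PCDATA)>
]>
<person>
  <lname>Smith</lname>
  <fname>John</fname>
</person>
```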
Notes:
The second example specifies that an <order> is a sequence of at least one <order-item> followed by a <delivery-address>, followed by an optional <order-date>. The valid XML shows:
1. One <order-item>, a <delivery-address> and no <order-date>.
2. Two <order-item>s, a <delivery-address> and no <order-date>.
3. Two <order-item>s, a <delivery-address> and an <order-date>.
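A declaration matching this description could be sketched as follows. The element names come from the notes; the #PCDATA content models of the children and the sample values are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE order [
  <!-- one or more items, then an address, then an optional date -->
  <!ELEMENT order (order-item+, delivery-address, order-date?)>
  <!ELEMENT order-item (#PCDATA)>
  <!ELEMENT delivery-address (#PCDATA)>
  <!ELEMENT order-date (#PCDATA)>
]>
<order>
  <order-item>widget</order-item>
  <order-item>gadget</order-item>
  <delivery-address>1 Main Street</delivery-address>
  <order-date>2004-01-15</order-date>
</order>
```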
Notes:
This example says that a phone book is at least one <entry>, <column-heading> or <page-number>, but that there may be more than one of any of these three, and that they may appear in any order. The valid XML shows:
1. Three <entry> elements.
2. Two <column-heading> elements.
The invalid example is invalid because page-number cannot have entry as a child.
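A repeated choice group gives the "one or more, in any order" behavior described here. In this sketch the root element name phone-book, the #PCDATA children and the sample data are assumptions; only entry, column-heading and page-number come from the notes.

```xml
<?xml version="1.0"?>
<!DOCTYPE phone-book [
  <!-- at least one child; any mix of the three, in any order -->
  <!ELEMENT phone-book (entry | column-heading | page-number)+>
  <!ELEMENT entry (#PCDATA)>
  <!ELEMENT column-heading (#PCDATA)>
  <!-- text only, so an entry child inside page-number is invalid -->
  <!ELEMENT page-number (#PCDATA)>
]>
<phone-book>
  <column-heading>A</column-heading>
  <entry>Adams, Pat 555-0100</entry>
  <entry>Allen, Lee 555-0101</entry>
  <page-number>1</page-number>
</phone-book>
```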
Notes:
Elements that have the mixed content model can contain (parsed) character data. In addition to the character data, mixed content models may also contain child elements interspersed with the character data. If a mixed content model contains child elements, it can specify which elements may appear, but the child elements can appear in any order, and any number of times. The valid XML shows: 1. An element with character data content only. 2. An element allowing a single child element in addition to the character data content.
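The mixed content model can be sketched as below. The element names para and emphasis are hypothetical; note that #PCDATA must come first in the group and the group must end with the * operator when child elements are listed.

```xml
<?xml version="1.0"?>
<!DOCTYPE para [
  <!-- text interspersed with any number of emphasis children -->
  <!ELEMENT para (#PCDATA | emphasis)*>
  <!ELEMENT emphasis (#PCDATA)>
]>
<para>Mixed content lets <emphasis>child elements</emphasis> appear between runs of text.</para>
```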
Notes:
The syntax for declaring attributes looks like this: <!ATTLIST followed by:
elementName - the name of the element we are declaring the attribute for.
attributeName - the name of the attribute being declared.
attributeType - specifies the data type (see Attribute Type table).
attributeDefault - specifies the attribute's default behavior.
To declare multiple attributes, you can write multiple ATTLIST declarations or repeat the (attributeName attributeType attributeDefault) part as necessary.
Organizational Note
The next several charts identify possible choices for each syntactical piece on the previous chart.
These charts are followed by examples.
We then continue with the concepts identified in the "What is allowed in a DTD" chart: ENTITY, ENTITIES, NOTATION.
Our intent is to provide you with solid, tested examples you can use on your own projects.
The "XM301Lectures" folder on your desktop contains working examples you can try (or use) on your own.
Attribute Types
Attribute Type: Description

String Type
CDATA: Used to declare an attribute whose value may contain arbitrary character data. Whitespace crunching is not done. This is the only attribute type whose values are not restricted to XML name characters.

Tokenized Type
NMTOKEN: Used to declare an attribute whose value must conform to the definition of a name token (Nmtoken) in XML 1.0.
NMTOKENS: Allows multiple NMTOKENs separated by white space.
ID: Used to declare an attribute whose value must be unique within the XML document.
IDREF, IDREFS: The value of the attribute must refer to an ID value declared elsewhere in the document. IDREFS allows multiple IDREF values separated by white space (compare NMTOKENS).
ENTITY: Used to declare an attribute whose value must correspond to the name of a declared ENTITY.
ENTITIES: Allows multiple ENTITY names separated by white space.

Enumerated Type
NOTATION: References a <!NOTATION declaration in the DTD.
ENUMERATION: Attributes have a specified list of acceptable NMTOKEN values.
Figure 5-19. Attribute Types
Notes:
CDATA attributes contain character data. Whitespace crunching is not performed. We covered this on previous charts. The ID data type contains a string value that must be unique to each element. No element type may have more than one ID attribute specified, and a declared ID attribute must be either #IMPLIED or #REQUIRED. ID valued attributes can be combined with IDREF and IDREFS valued attributes to create cross referencing within an XML document. IDREFs must contain values which are specified in an ID-valued attribute elsewhere in the document. IDREFS are a space separated list of ID values. ENTITY and ENTITIES are the name or a space separated list of entity names. (More on entities in a moment). NMTOKENs are strings composed of the legal characters in an XML element name -- they are not the same as XML element names, because an NMTOKEN may begin with characters (such as a digit) that are not legal as the first character of an XML element name.
NMTOKENS are space separated lists of NMTOKENs. These were introduced earlier. NOTATION valued attributes must contain the name of a NOTATION declared elsewhere in the document. (More on NOTATION later). Enumerations are lists of NMTOKENs separated by the vertical bar (|) and enclosed in parentheses. An attribute with an enumeration value must contain one of the NMTOKENs in the list.
Attribute Default: Description
#REQUIRED: The attribute must be present.
#IMPLIED: The attribute does not need to be present and no default value is supplied.
"attribute-value": If the attribute's value is not present, "attribute-value" is supplied as a default value.
#FIXED "attribute-value": If the attribute is present, it must have the value "attribute-value".
Notes:
Every attribute must specify a default type. The possible values for the default type are:
#REQUIRED: Indicates that the attribute must occur; its value may still be restricted to an enumerated list.
#IMPLIED: Indicates that the attribute or the attribute's value can remain unspecified.
#FIXED value: Indicates that this attribute, when used, has a single (fixed) value. This value must appear immediately after the keyword and be in quotes.
Enumerated list: Gives a list of choices in parentheses, each separated by an "or" operator (|). A default value (from the enumerated list) may be given after the list and must be in quotes.
If a default value is declared and the attribute is not present, the element is treated as if the attribute were present with the declared default value.
Valid XML:
<shirt type="short">cotton</shirt>
<shirt type="short" size="large">wool</shirt>
<shirt type="short" manufacturer="Levi">denim</shirt>
<shirt type="short sleeve" collar="button-down"></shirt>
Invalid XML:
<shirt></shirt>
<shirt type="short" size="medium large">cardigan</shirt>
<shirt type="short" manufacturer="Gap">designer</shirt>
Notes:
Here we've declared a few attributes with the various default types. Size has a default value, type is required, and manufacturer is fixed. Let's look at how the examples come out.
For the valid examples:
<shirt type="short"/> will also pick up the default value "large" for size, and the fixed value "Levi" for manufacturer.
<shirt type="short" size="large"/> will pick up the fixed value "Levi" for manufacturer.
For the invalid examples:
<shirt/> is missing the required type attribute.
<shirt type="short" size="medium large"/> is invalid because "medium large" isn't in the enumerated value list for size.
<shirt type="short" manufacturer="Gap"/> is invalid because "Gap" isn't the fixed value for manufacturer.
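The ATTLIST declaration from the foil is not reproduced in these notes. The sketch below is one declaration that would be consistent with the behavior described; the exact enumeration for size, the CDATA types, and the collar attribute are assumptions inferred from the examples rather than the original declaration.

```xml
<?xml version="1.0"?>
<!DOCTYPE shirt [
  <!ELEMENT shirt (#PCDATA)>
  <!ATTLIST shirt
    type         CDATA                    #REQUIRED
    size         (small | medium | large) "large"
    manufacturer CDATA                    #FIXED "Levi"
    collar       CDATA                    #IMPLIED>
]>
<!-- size defaults to "large" and manufacturer to "Levi" for this element -->
<shirt type="short">cotton</shirt>
```

An application reading this document through a validating parser would typically see three attributes on the shirt element: the explicit type plus the defaulted size and manufacturer.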
Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee serialNumber ID #REQUIRED>
<!ATTLIST employee manager1 IDREF #IMPLIED>
<!ATTLIST employee manager2 IDREFS #IMPLIED>
Notes:
This foil shows a declaration for an implied attribute of type IDREFS. According to the syntax rules for IDs, an ID value cannot begin with a digit, so a plain number cannot be an ID. That is why the serialNumber values begin with a letter. Aside from naming rules, manager2 could hold any values as long as each one matches the ID of an element in the document. Consequently, an employee could be self-managed! The uniqueness constraint applies to IDs, not to IDREFs, so the employee could be self-managed twice: both manager1 and manager2 could have the same value.
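To see the cross-referencing in a complete document, here is a sketch that wraps the employee declarations in a root element. The staff wrapper, the serial numbers and the names are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE staff [
  <!ELEMENT staff (employee+)>
  <!ELEMENT employee (#PCDATA)>
  <!ATTLIST employee serialNumber ID #REQUIRED>
  <!ATTLIST employee manager1 IDREF #IMPLIED>
  <!ATTLIST employee manager2 IDREFS #IMPLIED>
]>
<staff>
  <!-- IDs start with a letter and are unique within the document -->
  <employee serialNumber="E100">Pat</employee>
  <!-- E200 lists itself in manager2: legal, since IDREFs need not be unique -->
  <employee serialNumber="E200" manager1="E100" manager2="E100 E200">Lee</employee>
</staff>
```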
Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee companyName ENTITY #REQUIRED>
<!ENTITY company SYSTEM "http://www.IBM.com/company.txt" NDATA txt>
<!NOTATION txt SYSTEM "file:///C:/Windows/System32/notepad.exe">
ENTITY is also used in its own right as another element of a DTD; this is covered in subsequent charts. Here we focus on ENTITY as an attribute. NDATA and NOTATION are concepts we have yet to discuss.
The material above is included here to provide an example for future reference.
Notes:
This foil shows a declaration for a required attribute of type ENTITY. As you can see there are several concepts involved that we have yet to discuss, not the least of which is "what is an 'entity'?" You will find this and the next chart useful on the job when you need to create or understand a DTD that uses these concepts. The concepts themselves are described on subsequent charts.
Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee companyAtts ENTITIES #REQUIRED>
<!ENTITY company "IBM">
<!ENTITY division "19">
<!ENTITY branch "http://www.ibm.com/IGS.txt">
Notes:
ENTITIES provide a mechanism for including data from multiple sources. As you can see there are several concepts involved that we have yet to discuss. You will find this and the next chart useful on the job when you need to create or understand a DTD that uses these concepts. While DTDs may be lacking in several important aspects (listed later), they can still be very complex! Like the ENTITY example, we need to define several concepts for this chart to be understood. The explanations follow. One caution: a validating parser requires every name used in an ENTITY or ENTITIES attribute value to be an unparsed entity, that is, an external entity declared with NDATA.
DTDs Part II
Usage:
&entityName;
Declaration:
<!ENTITY xmlExpert "Ron Smith"> <!ENTITY topic "XML Documents">
Valid XML:
<response>For additional help with &topic;, Please contact &xmlExpert;.</response>
Processed XML:
For additional help with XML Documents, Please contact Ron Smith.
Notes:
Here is an example. But we just told you that entities are related to separate storage units, and the entity declaration that we just saw fit completely into the DTD. This kind of entity is called an internal entity and is not associated with a separate physical storage unit. Let's look at how to declare the same entity as an external entity, in a separate physical storage unit.
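Putting the internal entity declarations from the slide into a complete document gives something like the following sketch. The DOCTYPE wrapper and the element declaration for response are additions for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE response [
  <!ELEMENT response (#PCDATA)>
  <!-- Internal entities: the replacement text lives in the DTD itself -->
  <!ENTITY xmlExpert "Ron Smith">
  <!ENTITY topic "XML Documents">
]>
<response>For additional help with &topic;, please contact &xmlExpert;.</response>
```

After entity expansion the parser reports the text "For additional help with XML Documents, please contact Ron Smith."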
Declaration:
<!ENTITY copyrightInfo SYSTEM "file:///c:/legal/boilerplate.txt">
boilerplate.txt file:
Copyright 2003, IBM. All rights reserved.
Valid XML:
<notices>This application was developed using WebSphere Studio. &copyrightInfo;</notices>
Processed XML:
This application was developed using WebSphere Studio. Copyright 2003, IBM. All rights reserved.
Notes:
In the case where the entity declaration specifies a public identifier, the parser must understand how to handle the "publicURI" identifier. This is traditionally only used when the parser provided was hard-coded to handle it, or if you will be creating your own parser to handle entity replacement. According to section 4.2.2 (External Entities) of the XML 1.0 specification: "Definition: In addition to a system identifier, an external identifier may [emphasis added] include a public identifier. An XML processor attempting to retrieve the entity's content may [emphasis added] use the public identifier to try to generate an alternative URI reference. If the processor is unable to do so, it must [emphasis added] use the URI reference specified in the system literal...."
Here is their example:
<!ENTITY open-hatch SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY open-hatch PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN" "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY hatch-pic SYSTEM "../grafix/OpenHatch.gif" NDATA gif>
Find out more at: http://www.w3.org/TR/2003/PER-xml-20031030/
Be aware that an external entity may not recursively reference itself, either directly or indirectly. More examples follow.
Declaration:
<!NOTATION jpeg SYSTEM "file:///c:/Program Files/Photoshop/photoshop.exe">
<!ENTITY prod17792 SYSTEM "prod17792.jpg" NDATA jpeg>
<!ELEMENT item EMPTY>
<!ATTLIST item picture ENTITY #REQUIRED>
Valid XML:
<item picture='prod17792'/>
Rules:
Unparsed entities can only be external entities.
In order to declare an unparsed entity, you start with a regular external entity declaration and before the closing angle bracket you insert NDATA and the name of a notation. This associates a notation name with the unparsed entity.
To reference an unparsed entity, you can use its name in an ENTITY or ENTITIES valued attribute.
You cannot reference an unparsed entity by &name;.
Notes:
Here's an example of unparsed entity use: First we declare a notation called jpeg and associate it with a photoshop.exe somewhere on the local machine. Then we declare an external unparsed entity called prod17792 and add the NDATA jpeg clause to specify the notation. The rest of the DTD declares an empty element item with an ENTITY valued attribute called picture. You can see in the XML instance document that we supply prod17792 (the name of the entity) as the value of the picture attribute of item. This is how you can associate a piece of unparsed/binary data with a portion of an XML document.
Parameter ENTITYs
Parameter entities:
Can only be used in the DTD.
Allow reuse of attribute lists and complex type definitions.
Syntax:
<!ENTITY % parameterEntityName "replacementText">
Usage:
%parameterEntityName;
Declaration:
<!ENTITY % commonAtts "make CDATA #IMPLIED model CDATA #IMPLIED">
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone %commonAtts; type (rotary | touch-tone) #IMPLIED>
Processed DTD:
<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone make CDATA #IMPLIED model CDATA #IMPLIED type (rotary | touch-tone) #IMPLIED>
Notes:
The parameter entity replacement works like regular entity replacement. The parser will substitute the replacement text, and then continue evaluating the DTD from the point of replacement. Parameter entities are entities that are meant to be used in the DTD. Parameter entities are very useful if you want to reuse portions of an attribute list declaration or if you want to reuse parts of a complex content model specification. Parameter entities are the primary tool that is available to help you structure a complex DTD.
Notes:
In this example the commonAtts parameter entity is used to represent common attributes for the three different elements: car, computer and phone.
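The foil for this example is not reproduced here. A sketch of how one parameter entity might be shared by the three elements follows; the file name products.dtd, the content models, and the doors attribute are assumptions. Because XML 1.0 only allows a parameter entity reference inside a declaration in the external subset, this is written as a separate DTD file rather than an internal subset.

```xml
<!-- products.dtd: hypothetical external DTD subset -->
<!ENTITY % commonAtts "make  CDATA #IMPLIED
                       model CDATA #IMPLIED">

<!ELEMENT car (#PCDATA)>
<!ATTLIST car %commonAtts;
              doors CDATA #IMPLIED>

<!ELEMENT computer (#PCDATA)>
<!ATTLIST computer %commonAtts;>

<!ELEMENT phone (#PCDATA)>
<!ATTLIST phone %commonAtts;
                type (rotary | touch-tone) #IMPLIED>
```

Changing the replacement text of commonAtts in one place then updates the attribute lists of all three elements.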
Notes:
To insert a comment in a DTD (or an XML document for that matter) place the comment text inside <!-- and -->. Comments cannot be nested. A space after the <!-- and before the --> is not required by the specification, but it improves readability. The characters "--" may not be used within the comment. This form of comment is also usable within HTML, XML and XSL documents.
Notes:
Overriding (changing) the data contained in an XML instance may cause confusion for other users of the instance. The application of an XSL transform or a processor program (for example, DOM, SAX, or similar) may be a better alternative.
Filename: message.dtd
<!ELEMENT message (greeting,farewell)>
<!ELEMENT greeting (#PCDATA)>
<!ELEMENT farewell (#PCDATA)>
Notes:
Up until now we've described some of the contents of a DTD without showing how to actually place those declarations in a file so that they can be used to validate a document. Recall that the DTD may be in an external file, embedded directly in an XML file, or split across an external file and the XML file. Let's look at placing the DTD declarations in an external file. The part of the DTD that goes into the external file is called the external DTD subset. The external DTD subset is an entity even though DTD declarations are not elements. Therefore you should supply a text declaration at the beginning of the external DTD subset. This is especially important if the document and the DTD are going to be using different character encodings. In the example, the file message.dtd contains the declarations of three elements: message, greeting and farewell. The DTD may have its own encoding declaration (which may be different from the encoding of documents that reference the DTD file). The file hello.xml references this DTD using a DOCTYPE declaration. The DOCTYPE declaration specifies the name of the root element of the document, message in the example. Note that the DOCTYPE declaration is what specifies the root element, not the DTD subset. This means that potentially any element declaration in the DTD can serve as the root element. It is up to the DOCTYPE writer to specify this. Following the name of the root element is the keyword SYSTEM followed by a URI reference that the local machine can use to locate the actual file containing the external DTD. Using an external file allows you to easily use the same DTD to validate many documents.
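The hello.xml file itself is not reproduced in these notes. A sketch of what it might contain is shown below; the greeting and farewell text is assumed, and the relative URI assumes that message.dtd sits in the same directory as the document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- message names the root element; SYSTEM gives the URI of the external subset -->
<!DOCTYPE message SYSTEM "message.dtd">
<message>
  <greeting>Hello, world!</greeting>
  <farewell>Goodbye, world!</farewell>
</message>
```

If message.dtd carries a text declaration, it goes on the first line of that file, for example <?xml version="1.0" encoding="UTF-8"?>.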
Notes:
The example on this foil shows a DTD with an entity called destination in both the internal and external subsets. The declaration for destination in the internal subset will override the declaration in the external subset, leaving the messages "Hello cruel world" and "good-bye cruel world" after entity expansion has occurred. This allows local entity declarations in the internal subset to override entity declarations in the external subset. A best practice would be to include a comment drawing attention to the intent of this internal subset to override a value set in the external subset.
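The foil is not reproduced here, so the following sketch reconstructs the idea under stated assumptions: the external subset is taken to declare destination as "world", and the element names are borrowed from the earlier message example.

```xml
<?xml version="1.0"?>
<!-- Assumed content of the external subset message.dtd:
       <!ELEMENT message (greeting, farewell)>
       <!ELEMENT greeting (#PCDATA)>
       <!ELEMENT farewell (#PCDATA)>
       <!ENTITY destination "world">
-->
<!DOCTYPE message SYSTEM "message.dtd" [
  <!-- Intentionally overrides the destination entity from message.dtd -->
  <!ENTITY destination "cruel world">
]>
<message>
  <greeting>Hello &destination;</greeting>
  <farewell>good-bye &destination;</farewell>
</message>
```

The override works because the internal subset is read first and, when an entity is declared more than once, the first declaration encountered is the one that is used.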
Notes:
Whitespace is whitespace, isn't it? Not if you are a validating XML processor. There are two kinds of whitespace:
Whitespace in #PCDATA content (between the same start and end tag pair), which is significant.
Whitespace in element-only (non-character data) content, which is ignorable. You only know this if you have a DTD.
Parsers report whitespace and ignorable whitespace differently. The parser does not actually discard the ignorable whitespace -- this is the application's job. But the parser can use different data structures or callback routines in order to report ignorable versus non-ignorable whitespace.
Notes:
This slide shows an example XML document and DTD, and shows which whitespace is ignorable and which whitespace is not. Again, it is up to the application to decide what to do about ignorable whitespace. An XML processor will report all of the whitespace and indicate whether or not it is ignorable.
Notes:
Validating processors are straightforward. The XML spec tells implementors exactly what a validating processor must do (in fact, they must do everything). Non-validating processors have options because the XML spec says that a non-validating processor may do certain things, but is not required to do them. Unfortunately, every parser implementor has chosen a different subset of items from this list to implement, so every non-validating parser behaves just a little differently. A non-validating processor must check the document entity including the internal subset. If there is an external DTD subset, it may or may not:
normalize attribute values from the external subset
replace internal entity text from the external subset
supply attribute defaults from the external subset
Since the behavior of non-validating processors is up to implementors, you need to be careful when working with a non-validating processor if you have complicated attribute values or use entities.
Example DTDs
W3C XHTML
cXML: B2B between procurement applications, e-commerce hubs and suppliers.
RosettaNet: Business processes between trading partners and properties for defining products.
RDF Site Summary (RSS): Syndicating news articles.
DocBook: Production of documentation which can be rendered into multiple output formats.
Open Financial Exchange (OFX): Electronic exchange of financial data.
Notes:
Many organizations are producing DTDs for various applications. Here are some examples:
cXML - http://www.cxml.org - cXML is a streamlined protocol intended for consistent communication of business documents between procurement applications, e-commerce hubs and suppliers. The current standard includes documents for setup (company details and transaction profiles), catalogue content, application integration (including the widely-used PunchOut feature), original, change and delete purchase orders and responses to all of these requests, as well as new order confirmation and ship notice documents (cXML analogues of EDI 855 and 856 transactions).
RosettaNet - http://www.rosettanet.org - RosettaNet Partner Interface Processes (PIPs) define business processes between trading partners. RosettaNet dictionaries provide a common set of properties for PIPs. The RosettaNet Business Dictionary designates the properties used in basic business activities. RosettaNet Technical Dictionaries provide properties for defining products. Product and partner codes in RosettaNet standards expedite the alignment of business processes between trading partners.
RSS - http://www.purl.org/rss/1.0 - The RDF Site Summary format was originally developed by Netscape and is widely used across the World Wide Web for the purpose of syndicating news articles.
DocBook - http://www.docbook.org - DocBook is an XML version of the SGML DocBook DTDs that are widely used in the production of documentation which can be rendered into multiple output formats.
OFX - http://www.ofx.net - Open Financial Exchange is a unified specification for the electronic exchange of financial data between financial institutions, business and consumers via the Internet. Created by CheckFree, Intuit and Microsoft in early 1997, Open Financial Exchange supports a wide range of financial activities including consumer and small business banking; consumer and small business bill payment; bill presentment and investments, including stocks, bonds and mutual funds.
Governs the grammar of the entire document. Can't localize grammars to specific fragments.
Notes:
There are a number of problems with DTDs, which are listed on the chart. These problems have led to the creation of a number of alternate languages for defining the structure of XML grammars. The two leading contenders are W3C's XML Schema and OASIS's RELAX NG.
Status of DTDs
Part of XML 1.0
Widely adopted
XML processors vary according to how they interpret the definition of non-validating
Notes:
DTDs are a part of the XML 1.0 recommendation. They are a stable technology and widely adopted. As we noted earlier, XML processors vary in behavior because they interpret the definition of a non-validating processor differently. Most XML parsers available today come with the capability to use DTDs to validate documents. XML Schema is the W3C-approved replacement for DTDs, but it is a new technology and had not reached broad usage at the time of this writing.
Tooling
Can use any text editor
  As long as the editor supports Unicode or the chosen encoding.
WebSphere Studio Application Developer
  Provides guided editing for DTDs and documents that reference them
  Can generate a DTD from sample XML. Write sample XML that illustrates all the ways you'll use the data
  Supports document validation
Free IBM alphaWorks tools to help you
  http://www.alphaworks.ibm.com/tech/xmlsqc
Many validating parsers:
  Apache's Xerces for Java, C++, Perl
  JAXP (Java API for XML Processing)
Notes:
The tooling for DTDs is pretty simple at its base. You can use the same editor that you use to edit an XML file to edit a DTD; they are the same kind of text. There are also many tools for working with DTDs. IBM's alphaWorks has a number of useful tools. The commercially available XML Spy is a popular graphical tool for working with XML, DTDs and XML Schema. There are many parsers that perform validation using a DTD, including all of the parsers available from the Apache Software Foundation.
Checkpoint Questions (1 of 2)
1. Which DTD entry correctly depicts phone number, with optional area code?
   a. <!ELEMENT phone ((areaCode)*, prefix, body)>
   b. <!ELEMENT phone (areaCode?, prefix, body )>
   c. <!ELEMENT phone?(areaCode, prefix, body )>
   d. <!ELEMENT phone (areaCode, (prefix, body)+)>
2. Which of the following is a limitation of DTD?
   a. Non-XML syntax.
   b. Does not easily allow range of values (that is, 5 to 1000 elements).
   c. Does not provide proper typing of values (that is, integer versus string).
   d. Does not permit Parameter Entity references.
   e. All of the above.
Checkpoint Questions (2 of 2)
3. Which DTD entry correctly depicts an optional attribute named type for a pet element, that defaults to the value "dog"?
   a. <!ATTLIST pet type CDATA #IMPLIED>
   b. <!ATTLIST type dog CDATA #FIXED "dog">
   c. <!ATTLIST pet type CDATA "dog">
   d. <!ATTLIST pet (dog)? CDATA #REQUIRED>
Unit Summary
In this section you have learned:
XML 1.0 DTDs
  Element declarations
  Attribute declarations
  Entity declarations
    General
    Parameter
  Notation declarations
  Comments
The difference between validating and non-validating processors
Example DTDs
Best Practices
Notes:
In this section you have learned about XML 1.0 DTDs: element declarations, attribute declarations, comments, entity declarations (general and parameter), and notation declarations. You have also learned about the difference between validating and non-validating processors, example DTDs, and best practices.
Unit Objectives
After completing this unit, you should be able to:
  Describe the reasons for using namespaces
  Describe the syntax used in namespaces
  Define and illustrate an example using namespaces
  Define problems with namespaces
  List and define the best practices to use when using namespaces
  Describe the status of namespaces in the industry
How does an application know that:
  The first occurrence of title is a book title.
  The second occurrence of title is a person's title.
Need a way to eliminate the ambiguity for the purpose of processing.
Notes:
The double use of title in the example illustrates the need for a namespace solution in XML. We need to be able to tell that the two title elements in this document are not the same element. Even though the elements have the same name, they have different meanings to the application. Using the context to disambiguate the two uses is not a generally applicable solution.
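The ambiguity can be sketched in Python with the standard xml.etree.ElementTree module. The sample document below is an assumption made for illustration (the chart's own example is not reproduced in these notes); it simply uses title for two unrelated things.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "title" names both a book title and a courtesy title.
DOC = """
<book>
  <title>Tom Sawyer</title>
  <author>
    <title>Mr.</title>
    <name>Mark Twain</name>
  </author>
</book>
"""

root = ET.fromstring(DOC)

# A search by name alone returns both elements, with nothing to tell them apart.
titles = [elem.text for elem in root.iter("title")]
print(titles)
```

An application could inspect each element's parent to guess which title it has found, but that depends on knowing the exact document layout, which is why context is not a generally applicable solution.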
Elaboration
Some possibilities:
Adopt industry standard document formats and naming conventions
  This approach works at the document level. A good example is ebXML; refer to http://www.openapplications.org
  Problems:
    No industry is an island; industries interact: who decides?
    Naming standards down to the element/attribute level are too brittle
Use verbose element names, for example, bookTitle, courtesyTitle
  Problem: naming becomes fundamentally difficult. There is no way to know if a name is already in use. Further, the data and/or its model may not belong to the consuming application.
Solution
  Use some name qualifier that is already established as unique, for example, a domain-name-qualified URI (uniform resource identifier).
  Domain names are already managed and maintained as unique.
  This approach was developed into XML Namespaces.
Notes:
URIs are not actually used for lookup, only as a reference. Their only purpose is to give the namespace a unique name. Sometimes the URI is a pointer to a web page that provides information about the namespace, but this is not required. The URI is not looked up as part of XML parsing or processing. The application is responsible for deciding what to do with the names.
Notes:
Recall that URI stands for uniform resource identifier.
XML Namespaces
For the purposes of XML namespaces, URIs are considered identical when they match character for character.
  If URIs are different, they represent different namespaces.
  Note: There is no network lookup associated with the use of URIs in this specification; it is a lexical convention only. URIs are not checked by the processor to ensure they exist.
The Namespace specification deals with the mechanics of associating a URI qualifier (aka namespace) with element and attribute names to create two-part names that are unique and free of ambiguity.
  The Namespace specification refers to these two-part names as Qualified Names or QNames.
[Figure: the element name <books:title>, with books labeled as the prefix and title labeled as the localpart]
At all times, the prefix should be thought of as shorthand for the actual URI/namespace. That is, <books:title> is really <http://www.library.com/books:title>
Notes:
You can think of a QName like <books:title> as being equivalent to the following Clark Notation: {http://www.library.com/books}title
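Python's standard xml.etree.ElementTree module happens to report element names in Clark Notation, so a short sketch can show the expansion. The document string is an assumption that reuses the books prefix and the http://www.library.com/books URI from the charts.

```python
import xml.etree.ElementTree as ET

DOC = ("<books:title xmlns:books='http://www.library.com/books'>"
       "Tom Sawyer</books:title>")

elem = ET.fromstring(DOC)

# ElementTree reports the expanded name; the prefix itself is not kept.
print(elem.tag)

# Split the Clark Notation name into its namespace URI and local part.
uri, local = elem.tag[1:].split("}", 1)
print(uri, local)
```

Note that the prefix does not appear in the result at all, which matches the idea that it is only shorthand for the URI.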
Declaring Namespaces (1 of 2)
The syntax of a namespace declaration is:
<prefix:elementName xmlns:prefix='URI'/>
The following example declares the namespace http://www.library.com/books, assigns it a prefix of 'books' and identifies the book element as a member of that namespace.
<books:book xmlns:books='http://www.library.com/books'/>
Attributes may also be assigned to a namespace. As with elements, attributes are prefixed as follows:
<books:book xmlns:books='http://www.library.com/books' books:hardcover='true'/>
Notes:
Note that you can declare a namespace on any element that you like, not just the root element.
Declaring Namespaces (2 of 2)
Suppose a document without namespaces looked like:
<book hardcover='true'>
  <title>Tom Sawyer</title>
</book>
It is clear that declaring the namespace on every single element becomes unwieldy (and error prone).
Notes:
Now let's look at an example with nested elements.
Namespace Scope
When a namespace prefix is declared, it remains in scope for:
  Attributes of the element where it is declared.
  Child elements (and their attributes) of the element where it is declared.
    Unless the prefix is redefined on a nested element.
QNames are still required; the namespace is not assumed.
Applying this technique, the previous example becomes:
<books:book xmlns:books='http://www.library.com/books' books:hardcover='true'>
  <books:title> Tom Sawyer </books:title>
</books:book>
Notes:
Note that every element or attribute name that is in the namespace has the appropriate namespace prefix in front of it.
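As a minimal sketch, the chart's document can be parsed with Python's standard xml.etree.ElementTree module to confirm that a single declaration on the outer element covers the child element and the prefixed attribute. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

NS = "http://www.library.com/books"

DOC = """
<books:book xmlns:books='http://www.library.com/books' books:hardcover='true'>
  <books:title>Tom Sawyer</books:title>
</books:book>
"""

book = ET.fromstring(DOC)
title = book[0]

# The prefix declared on <books:book> is still in scope on the child element.
print(book.tag)
print(title.tag)

# The prefixed attribute is in the namespace too, so it must be looked up
# by its expanded name rather than by the bare name "hardcover".
hardcover = book.get("{%s}hardcover" % NS)
print(hardcover, book.get("hardcover"))
```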
Default Namespaces
For situations where a majority of elements are associated with the same Namespace, a default namespace may be declared.
Notes:
Once you have specified the default namespace, all unprefixed elements in the scope of the default declaration are assumed to be in the namespace specified as the default. It is very important to note that default namespace declarations only apply to element names, not attribute names. In our example, we set the books namespace to the default and get rid of all the prefixes on element names. We still need the prefix on the attribute names because default namespaces don't apply to attributes.
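The following sketch, using Python's standard xml.etree.ElementTree module, shows what happens to an attribute that is left unprefixed under a default namespace. The document is an assumption that adapts the Tom Sawyer example: the elements pick up the default namespace, while the unprefixed hardcover attribute stays in no namespace.

```python
import xml.etree.ElementTree as ET

NS = "http://www.library.com/books"

DOC = """
<book xmlns='http://www.library.com/books' hardcover='true'>
  <title>Tom Sawyer</title>
</book>
"""

book = ET.fromstring(DOC)
title = book[0]

# Unprefixed elements are placed in the default namespace.
print(book.tag, title.tag)

# The unprefixed attribute keeps its bare name: it is in no namespace.
attribute_names = list(book.attrib)
print(attribute_names)
```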
Notes:
The result of these apparent duplications is to put the hardcover attribute inside a namespace.
Notes:
All that we did to enable this was add two more namespace declarations, and then add the new elements and use the appropriate namespace prefix. In the case of the isbn element, we declared the namespace that it needed on the element itself -- you can declare namespaces on any element that you like, not just the root element. When you do this, the prefix is only in scope for the element it was declared on and that element's content. You can also change the default namespace for a particular element by redefining the default namespace on that element. Again, the scope will be the element that the declaration is attached to, including its children.
The xmlns="" syntax resets the default namespace for the scope in which it occurs. The <isbn> element is not in a namespace.
Notes:
The unprefixed <title> element is in no namespace, because no default namespace is in effect where it appears. To repair this example, we need to prefix title with the books namespace prefix again.
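The effect of xmlns="" can be sketched with Python's standard xml.etree.ElementTree module. The document below is an assumption for illustration (the chart's example is not reproduced here, and the isbn value is a placeholder); it resets the default namespace on the isbn element only.

```python
import xml.etree.ElementTree as ET

DOC = """
<book xmlns='http://www.library.com/books'>
  <title>Tom Sawyer</title>
  <isbn xmlns=''>0000000000</isbn>
</book>
"""

book = ET.fromstring(DOC)
title, isbn = book[0], book[1]

# <title> inherits the default namespace declared on <book>.
print(title.tag)

# <isbn> resets the default namespace, so its name has no namespace part.
print(isbn.tag)
```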
<good xmlns:ns1="http://www.w3.org" xmlns="http://www.w3.org">
  <valid a="1" b="2" />
  <valid a="1" ns1:a="2" />
</good>
Notes:
There are two interacting rules that affect attributes and namespaces:
  Attributes are not affected by a default namespace declaration.
  Attributes on a single element must be unique.
In the example, the <bad> element is invalid because there are two unprefixed att attributes. In the second invalid element, the two attributes are the same because ns1 and ns2 are two prefixes for the same namespace URI; therefore, the two attribute names are identical.
It should be obvious that the first <valid> element is valid -- a and b are unprefixed, and a is not the same as b. The second <valid> element is valid because the unprefixed attribute a is in no namespace (remember that default namespace declarations don't affect attributes), and the ns1:a attribute is in the http://www.w3.org namespace -- they are in different namespaces.
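The four cases can be checked with a namespace-aware parser. This is a sketch using Python's standard xml.etree.ElementTree module; the wrapper element x and the helper name is_accepted are assumptions made for illustration, and the attribute is called a rather than att.

```python
import xml.etree.ElementTree as ET

def is_accepted(doc):
    """Return True if the namespace-aware parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

# Two prefixes bound to one URI, plus the same URI as the default namespace.
DECLS = ('xmlns:ns1="http://www.w3.org" xmlns:ns2="http://www.w3.org" '
         'xmlns="http://www.w3.org"')

same_name_twice = '<x %s><bad a="1" a="2"/></x>' % DECLS
same_expanded_name = '<x %s><bad ns1:a="1" ns2:a="2"/></x>' % DECLS
different_names = '<x %s><valid a="1" b="2"/></x>' % DECLS
different_namespaces = '<x %s><valid a="1" ns1:a="2"/></x>' % DECLS

for doc in (same_name_twice, same_expanded_name,
            different_names, different_namespaces):
    print(is_accepted(doc))
```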
Namespace Processing
How does an XML parser deal with namespaces?
Needs the right API
  SAX2
  DOM Level 2
The parser simply reports the prefix, localName, and URI associated with the element or attribute.
It's up to your application to decide what to do.
There are no validation rules associated with namespaces - it depends on XML Schema, DTD, or whatever grammar description language you are using.
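A minimal sketch of namespace-aware SAX2 processing, using Python's standard xml.sax package rather than the Java parsers named elsewhere in this course. The NameCollector class and the sample document are assumptions made for illustration. With namespace processing switched on, each name arrives as a (URI, local name) pair; this particular parser may pass None for the prefixed form.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

class NameCollector(ContentHandler):
    """Records the expanded names the parser reports."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self.attributes = []

    def startElementNS(self, name, qname, attrs):
        # name is a (namespace URI, local name) pair; the URI is None
        # when the element is in no namespace.
        self.elements.append(name)
        self.attributes.extend(attrs.getNames())

DOC = (b"<books:book xmlns:books='http://www.library.com/books' "
       b"books:hardcover='true'><title>Tom Sawyer</title></books:book>")

parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)  # request namespace processing
collector = NameCollector()
parser.setContentHandler(collector)
parser.parse(io.BytesIO(DOC))

print(collector.elements)
print(collector.attributes)
```

The parser only reports the names. Deciding that an unqualified title is a problem, or is acceptable, is left to the application.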
Notes:
Here's an example of namespaces in use: Here we have an imaginary record that might be used in an airline's airplane fleet inventory. For each airplane, we want to know which manufacturer provided each major part of the airplane. This example shows how we could use namespaces to identify which components came from which manufacturers. An application that processed this document could then use the namespaces to determine which manufacturer's diagnostic equipment would be needed to perform a full maintenance cycle on a particular airplane. While not required, it is a best practice to collect all the namespace definitions in one place; especially in large, composite files.
Notes:
Namespace recommendation after XML 1.0 - because the namespace recommendation came after XML 1.0, it's not really part of the spec. This means there are places where namespaces and XML 1.0 don't fit together.
DTDs don't really integrate well - We've shown you an ad hoc solution for using a fixed set of namespaces with a DTD, but that solution doesn't really satisfy a lot of desires that users have for namespaces.
Testing equality of namespaces is a pain - there's no easy way to test equality of two namespaces except to get the two namespace URIs and compare them character by character.
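The character-by-character comparison can be sketched in Python with the standard xml.etree.ElementTree module. The helper namespace_of and the three sample elements are assumptions made for illustration: two elements use different prefixes for the same URI, and the third uses a URI that differs only in letter case.

```python
import xml.etree.ElementTree as ET

def namespace_of(elem):
    """Return the namespace URI of an element, or None if it has none."""
    if elem.tag.startswith("{"):
        return elem.tag[1:].split("}", 1)[0]
    return None

a = ET.fromstring("<b:title xmlns:b='http://www.library.com/books'/>")
b = ET.fromstring("<lib:title xmlns:lib='http://www.library.com/books'/>")
c = ET.fromstring("<b:title xmlns:b='http://www.library.com/Books'/>")

# Different prefixes, same URI: the same namespace.
print(namespace_of(a) == namespace_of(b))

# Same prefix, URIs differing by one character: different namespaces.
print(namespace_of(a) == namespace_of(c))
```

Comparing prefixes instead of URIs would get both of these cases wrong.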
Best Practices
When to use namespaces
  When the data requires uniqueness for application processing.
  When a schema must be combined with other grammars.
Performance implications
  Namespace processing may slow down the parser and/or increase memory use.
Don't use relative URIs for namespace identifiers.
Pick the default namespace carefully.
Don't declare more than one prefix for a namespace URI.
Be careful with attributes when using namespaces.
Collect the namespace declarations in one place, preferably near the top of the document.
Notes:
When to use namespaces: When you think your DTD/Schema will be used outside your organization. When you think you will need to combine your DTD/Schema with other grammars. As a practical note, this means that anybody doing serious grammar work really ought to be using namespaces.
Performance implications: Namespace processing slows down the parser and increases memory usage. The parser needs to look at all the namespace declarations and QNames. Even if you turn off namespace processing in your parser, there will still be a performance impact because your input document will still be larger (because of namespace declarations and QNames) than if you were not using namespaces.
Don't use relative URIs for namespace identifiers; they were deprecated after the namespaces recommendation was published.
Pick the default namespace for an element carefully - this can save a lot of work if you choose carefully. Don't declare more than one prefix for a namespace URI - there's no reason to do it and it will cause confusion to someone else. Be careful with attributes when using namespaces - remember that default namespaces do not apply to attributes.
Status of Namespaces
XML namespaces became a recommendation of the W3C on January 14, 1999.
Supported by SAX2 and DOM2 parsers relative to DTDs.
Much better support with XML Schema.
Notes:
Namespaces in XML Recommendation 1/1999 - it is a stable recommendation. Supported by most parsers relative to DTDs. Much better support with XML Schema. Namespaces are ready for use, especially now that XML Schema has reached recommendation status.
More Information
Reference - Description
http://www.rpbourret.com/xml/NamespacesFAQ.htm - XML Namespaces FAQ
http://www.xml.com/pub/a/2000/03/08/namespaces/index.html - XML.com article about Namespace Myths
http://www.jclark.com/xml/xmlns.htm - James Clark's notes on XML Namespaces
http://www.w3.org/TR/REC-xml-names/ - The XML Namespaces specification
Checkpoint Questions
1. Which is true of XML namespaces? (Select all that apply)
   a. They are stored in an Internet-based registry.
   b. They are associated with URIs
   c. They are integrated with DTDs
   d. They are integrated with XML Schema.
2. An XML namespace prefix (Select all that apply):
   a. Links to a schema definition.
   b. Is scoped to the element where it is defined.
   c. Is shorthand for a URI.
   d. Can stand for more than one namespace.
3. Default namespaces apply to:
   a. Elements
   b. Attributes
   c. Elements and attributes
   d. Neither elements nor attributes
Unit Summary
Having completed this unit, you should understand:
  The reasons for using namespaces
  The syntax used in namespaces
  The use of default namespaces
  The interaction between namespaces and attributes
  Problems with namespaces
  Best practices regarding namespaces
  Status of namespace technology
Unit Objectives
After completing this unit, you should be able to:
  Understand what an XML Schema represents
  List and describe the reasons for using XML Schema
  List the key features of the XML Schema definition language
  Define the grammar rules of an XML document using the syntax of the XML Schema definition language
  List and define the best practices to use when using XML Schema
  Describe the status of XML Schema in the industry
Approach
We will break this unit into three parts.
Part 1 (this part) will
  Introduce the philosophy behind the XML schema definition language
  Provide an introductory example using the more common constructs
  Motivate the need for more sophisticated additions
Part 2 will
  Introduce the semantics of the more common constructs
    Including the more common options
Notes:
The key points are:
  An XML schema represents something that was constructed according to specific rules, semantics, and so forth;
  A schema is itself compliant with the rules governing the construction of a well-formed XML instance; and
  We can apply all the useful concepts developed to increase the utility of XML against these schemas.
These XML-compliant things are not necessarily persistent, touchable things -- they can be created, produce their effects and then be gone. Consider the wind in a wind tunnel: we can create a wind, feel (and sometimes see) its effects, and then turn it off, whereupon it disappears but leaves things in a quite different state.
An XML Infoset is an abstract data set describing the information available from an XML document. Its purpose is to provide a consistent set of definitions for use in other specifications that need to refer to the information in a well-formed XML document.
Notes:
Business: The arrival of XML as a language for data interchange between applications created the need to be able to specify richer semantics for XML documents. Similarly, to facilitate loosely bound data interchange between applications, there is a need to be able to combine rich grammars from different organizations. A well-defined XML vocabulary can improve communication among organizations that integrate using XML, causing integrations to proceed more smoothly, quickly and at a lower cost.
Technical: We need a way to validate the structure of incoming documents against a detailed specification. The cost of integration can be lowered as certain validations are moved to a validating parser and are not perpetuated into the backend systems.
The non-XML syntax used for DTDs makes it harder to write applications which manipulate grammar data. The symmetry of using XML syntax to represent information about the grammar allows all of the XML technologies to be applied to the grammar information itself. Data typing makes it possible for XML processors to verify more semantic constraints on the contents of XML documents. This will also allow future versions of XML processors to deal directly with typed data. Constraints allow additional validations to occur against the specification. The original XML 1.0 recommendation was followed by a recommendation for a namespace facility. Unfortunately, this meant that namespaces were not directly integrated with DTDs. XML Schema provides a way to reconcile namespaces and grammars. Need a way to represent relationships among data elements similar to what we do with database tables or objects.
The schema declaration would find only the first example valid.
Notes:
This document shows how to declare an element that has a simple type. A schema begins with a <schema> tag. This tag also has a namespace declaration that says that the default namespace is the namespace for XML Schema (denoted by the URI http://www.w3.org/2001/XMLSchema). Much more will be presented on this later; this is only an introduction. An element is declared using the <element> tag. The type attribute on the <element> tag specifies the type that is used for the element. In this example, the type is nonNegativeInteger, which is one of the built-in simple types. We will cover most of the key aspects of the XSD language in this chapter, so do not focus on details at this time.
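Because a schema is itself XML, it can be read with ordinary XML tools. The sketch below uses Python's standard xml.etree.ElementTree module to read a declaration of the kind just described. The element name quantity is an assumption (the chart's schema is not reproduced here), and the regular expression is only a rough stand-in for the nonNegativeInteger rules; the standard library has no XML Schema validator, so this is not real schema validation.

```python
import re
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: the default namespace is the XML Schema namespace.
SCHEMA = """
<schema xmlns='http://www.w3.org/2001/XMLSchema'>
  <element name='quantity' type='nonNegativeInteger'/>
</schema>
"""

schema = ET.fromstring(SCHEMA)
decl = schema.find("{%s}element" % XSD_NS)
declared = (decl.get("name"), decl.get("type"))
print(declared)

# Rough approximation of the lexical form: an optional plus sign and digits.
NON_NEGATIVE_INTEGER = re.compile(r"\+?[0-9]+")

def looks_valid(instance_xml):
    """Hand-written check for this one declaration only."""
    elem = ET.fromstring(instance_xml)
    text = elem.text or ""
    return (elem.tag == declared[0]
            and NON_NEGATIVE_INTEGER.fullmatch(text) is not None)

print(looks_valid("<quantity>3</quantity>"))
print(looks_valid("<quantity>-3</quantity>"))
```

In practice a validating parser that supports XML Schema would perform this check, along with everything the approximation above leaves out.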
Notes:
It is difficult, but not impossible, to master schemas by reading and interpreting the W3C specification. It is much easier, for most of us, to master enough concepts to get started, often from tested examples, and then add complexities as we need them. Most of what you want to implement can be accomplished -- it is usually just a matter of searching the very large body of schema information until you find something that does what you want to do.
Notes:
The information on the charts comes from the XML Schema Requirements document, which is found at http://www.w3.org/TR/1999/NOTE-xml-schema-req-19990215#Requirements.
XML Schema uses XML instance syntax to describe the rules for the grammar of an XML Schema document.
XML Schema documents allow a rich variety of data types to be used to constrain both element and attribute content.
XML Schema documents allow us to specify which namespace declarations and definitions belong to, importation of declarations and definitions from another namespace, and wildcarding of declarations from other namespaces.
In XML Schema, type definitions and element and attribute declarations are separated from each other, allowing reuse of type definitions.
XML Schema documents allow us to specify constraints on the uniqueness of values of a particular type, as well as relationships between those unique values and values of other types.
The terms XML Schema document, XML Schema, and simply schema are used interchangeably. Expect to also see XML Schema definition, XML schema definition document, XML Schema description, and XSD or xsd. The key idea is simply that an XML instance, in order to be valid, has an associated DTD or schema. The XML may be recognized by its .xml extension, the DTD by its .dtd extension, and the schema by its .xsd extension -- assuming one does not have any of these instances open for inspection.
<!-- Namespace declaration -->
<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
  <!-- Complex element declaration -->
  <xsd:element name='elem1'>
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref='elem2'/>
      </xsd:sequence>
      <!-- Attribute declaration with built-in 'simple' type -->
      <xsd:attribute name='attr1' type='xsd:string'/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
Notes:
This chart identifies the most common ideas we will encounter inside an XML schema. Once more, do not focus on the details at this point. We only wish to present the overall picture in these first few introductory charts.
Notes:
The po.xml source file is included in the XM301 Lectures folder inside Unit 7. The vertical bars were added to identify the various scopes. Color and thickness are used to differentiate the three principal levels involved. A more detailed description follows on the next chart.
Notes:
We first saw these definitions in chapter 3 on XML basics. Keep in mind that here we are talking about the XML instance and not the schema that defines it.
Notes:
It is common usage to not include the words in brackets. Some of us find the absence of these words leads to confusion. In an effort to avoid this confusion some of us speak in terms of the content of the element: An element that contains subelements and/or attributes is referred to as an element with complex content; an element that contains only one of the predefined XSD types is referred to as an element with simple content. An attribute is then referred to as a simple type, which is still shorthand that really refers to the value that the attribute carries. Although it is a challenge, try not to get too wrapped up in the details of the nomenclature; instead, focus on the patterns that are involved. The po.xml file in the XM301 Lectures project file has the necessary information to form the association to its associated schema, po.xsd so that when you select po.xml Studio will report it as valid.
Notes:
There is considerably more to the use of simple types than is implied by their name. We shall examine the specifics of this statement later on as we parse the .xsd that a tool like WebSphere Studio can automatically generate from an XML instance. Appendix B to this course has a snippet from the actual table found in the 20010502 version of the Primer. Again, recognize that any/all URLs are subject to change. Always check the basic W3C Web site for the latest version.
Notes:
Well-formed XML is governed by rules; XSDs are governed by rules. So it should be possible to create software that is aware of the rules that govern an XSD such that we can infer a schema that would validate an XML instance. This is not new: we saw this same capability with DTDs: given a well-formed XML it was possible to use a tool such as Studio that knows the DTD rules to generate a DTD file that would validate the XML source on which it was based. We also saw that the generated DTD only reflected what it could understand: we had to edit the DTD file to capture all that we knew about the XML source. This example carries on over a dozen or more pages. Again, focus on patterns not details -- they will come later. Try to keep the big picture in mind to avoid getting lost in the details.
Notes:
Since an XSD is itself XML, best practice dictates that we begin with an XML declaration. Since we built the shell in Studio, it automatically included the UTF-8 value for the encoding attribute. "[A] different prefix could be used. . ." There is an example within Studio in a "help" file where simply 'x' is used as a prefix: there is no normative statement that "xsd" must be used as the prefix. Out of kindness to our peers, it is a good idea to stick with "xsd" since it is easily recognized. The key aspect is that the purpose of the association of the prefix with the URI stated on the chart is to identify the elements and simple types as belonging to the vocabulary of the XML Schema language and not to the vocabulary of the schema author. If the latter were the case, the specification of what it means to be a decimal (for example) could differ from what the XML Schema language defines it to be in its specification. Henceforth we will refer simply to a complex type or a simple type element and leave it to you to do the translation: "element of complex type. . ." and so forth.
The keyword schema is very special; see the XML Schema specification or the W3C associated Primer, which has hot links to the appropriate part(s) of the specification. This snippet is directly from the 5/2/2001 Schema specification Part 1. One pair of parentheses is larger than the others to help you keep track because of the nesting. Consider the snippet in the last bullet to be but a peek into the depth a simple specification can be taken!
Notes:
Refer to the previous notes for direction to additional information. Part of the difficulty in mastering XML schemas is knowing when something is special as opposed to looking like something that should be special. Fortunately, we can rely on Studio to provide quite a lot of syntactic guidance.
Notes:
The order in which definitions occur in the schema for the .xml instance is generally not bound by the order in which the elements occur in the instance itself. One can only surmise that Studio only looks at the first occurrence of a type and picks the most general candidate for the type inasmuch as USPrice occurs twice and both times its value is clearly of type decimal. String clearly includes any kind of numeric type. A tool like Studio can make our XSD construction much faster and easier but it does not do all of our work. Hence, the admonition that we have to edit the machine output to comply with our knowledge of the problem together with the knowledge of our domain/subject matter experts. As a last step we will comply with the three subbullets of the last bullet.
There are four scopes involved here:
1. The element whose name is billTo (associating it to the .xml file)
2. The complexType declaration element
. . .see next chart. . .
Copyright IBM Corporation 2004
Notes:
As you can see, the level of complexity is rapidly increasing. As we proceed through this example, keep in mind that we have an XML instance with its own simple/complex/attribute structure. In the associated XML Schema we now have another XML instance with its own simple/complex/attribute structure plus (as you will see) a growing collection of baggage. That baggage brings precision to what could be myriad instances of the XML structure, of which po.xml is but one.
The declarations are not themselves types; they represent an association between
A name, and
Constraints that govern the appearance of that name in the associated .xml
<xsd:element ref="street2"/>, for example, could have been written as <xsd:element name="street2" type="xsd:string"/>
The sequence element; this comes under the category of Model Group Schema Component in the specification
For our purposes we need three choices:
Sequence, which requires that the subelements (which could themselves be complex) all appear, and in the order in which they are listed as subelements of the sequence element
All, which requires that all the subelements appear, but in no particular order
Choice, which requires that exactly one of the subelements appear
. . .see next chart . .
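The three model groups can be compared side by side. The following is a minimal sketch, not Studio's output: the type names SequenceAddress, AllAddress, and CommentOrShipDate are hypothetical, and the child elements are borrowed from the purchase order example for illustration only.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- sequence: name, street, and city must all appear, in this order -->
  <xsd:complexType name="SequenceAddress">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- all: the same three children must all appear, in any order -->
  <xsd:complexType name="AllAddress">
    <xsd:all>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city" type="xsd:string"/>
    </xsd:all>
  </xsd:complexType>

  <!-- choice: exactly one of comment or shipDate appears -->
  <xsd:complexType name="CommentOrShipDate">
    <xsd:choice>
      <xsd:element name="comment" type="xsd:string"/>
      <xsd:element name="shipDate" type="xsd:date"/>
    </xsd:choice>
  </xsd:complexType>

</xsd:schema>
```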
Studio chose to use a more general construct that uses a reference rather than an inline definition for the elements
That choice requires another statement to associate some built-in type with the ref= name, which here is either name, street, street2, city, state, or zip
On the other hand, if we wanted to create some lengthy/complicated value, we would only have to do it once and then we could ref= it
Notice that the rest of the declarations come immediately before the end of the schema, </xsd:schema>.
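To make the ref= mechanism concrete, here is a sketch of what such a structure might look like. It is an approximation written for these notes, not a copy of Studio's po.xsd: the element names come from the bullets above, and the use of xsd:string for every global declaration is an assumption.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- billTo refers to elements that are declared elsewhere in the schema -->
  <xsd:element name="billTo">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="name"/>
        <xsd:element ref="street"/>
        <xsd:element ref="street2"/>
        <xsd:element ref="city"/>
        <xsd:element ref="state"/>
        <xsd:element ref="zip"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- Global declarations: each one is a direct child of xsd:schema,
       which is what makes it reachable through ref= -->
  <xsd:element name="name" type="xsd:string"/>
  <xsd:element name="street" type="xsd:string"/>
  <xsd:element name="street2" type="xsd:string"/>
  <xsd:element name="city" type="xsd:string"/>
  <xsd:element name="state" type="xsd:string"/>
  <xsd:element name="zip" type="xsd:string"/>

</xsd:schema>
```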
Notes:
Similar enough to be confusing, that is. The po.xsd file represents but one of many possible schemas that would find our po.xml file valid. Refer to the Primer associated with XML Schema 1.0 for many other possibilities. The alternatives, while interesting, would require at least one full week to examine. We should reorder Studio's output into something that is more intuitive to us. We should also change the types if we have better knowledge of what is required. We will do both before we declare that we are finished with the po.xsd file. Last bullet: note the indentation level (assuming your schema is properly formatted) of the element that defines any ref=: it is a child of the root element, schema or xsd:schema.
Notes:
Gotcha!?
Notes:
The first line here is the last line from several pages ago.
The first occurrence also contains a simple type element comment
The second occurrence instead contains a simple type element shipDate
All these elements use ref= so expect to see definitions
Now look at the schema definitions on the preceding page:
Studio again uses a sequence construct
However, when Studio digested the comment-or-shipDate elements it chose to model this idea as a choice element
Notes:
Recall, we can have sequence, choice, or all. We will summarize these choices after we complete our walkthrough of the purchase order example.
Notes:
Later we will also present the various alternatives to "optional."
Studio is guessing that there may be any number of occurrences of item (that is, unbounded) and it is guessing that there will always be at least one.
ref="item" associates this declaration with the specification of item directly above this specification of items
We are beginning to see that there are occurrence constraints (minOccurs and maxOccurs) that can be applied to element declarations. The constraints available are context sensitive.
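A declaration of this kind might look like the sketch below. The items declaration follows the bullets above; the content of item is reduced to a plain string here, which is a placeholder assumption and not what the real purchase order schema contains.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Placeholder: the real item element has a richer structure -->
  <xsd:element name="item" type="xsd:string"/>

  <!-- items holds one or more item elements -->
  <xsd:element name="items">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="item" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

If our domain experts tell us that an empty order is legal, changing minOccurs to "0" is the only edit needed.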
Notes:
The specification of an element is really quite complicated. Unless you wish to become the expert on XML schema, you need only master a "reasonable" subset of all the possible choices. We will add detail later.
Notes:
The first element is the last element from several pages ago. It wouldn't make sense to just copy the last line because we would lack the context it is in. By now you have noticed that element is used quite often in creating a schema. Hence the need to provide enough lines to be able to identify which element in the XML instance we mean. Notice, too, the relationship of these elements to the main element <xsd:schema...>
We also expect to see the attribute orderDate
items, as we have seen, had its own definition.
The remainder of the schema provides the definitions of elements referred to previously
The last two lines define the elements
quantity
shipDate
Notes:
We still need to validate the type choices. We will also want to arrange the definitions in an order more meaningful to us.
Notes:
Again, the first line here is the last line from several pages ago.
Name and city were previously defined
The remaining elements provide definitions/declarations for the remaining referenced elements
Studio uses string for all these definitions
Clearly, there is room for our inputs and those of our domain experts.
The schema file poBetter.xsd in Unit 7 of XM301 Lectures contains our views of how it should be organized and the simple element types defined
All the references are organized at the top of the file.
What we see are the benefits of our knowing the big picture.
Notes:
Another way of looking at it is that we have the benefit of seeing the instance document in its totality: we can apply global optimization. If this were a huge instance document, we too might be looking at only a small part of the total picture. In that case, a CASE tool might be able to do better than we can, except that we may have access to the thought process / specification used to create this document.
Figure 7-31. A Simple XML Document - Connecting the Schema to the Instance
Notes:
You will find this process necessary should you wish to test your own files before we have a chance to describe the theory behind the assignment process. Studio -> Help Search "assign xml schema" will take you to the topic "Assigning an XML file to an XML schema." You will find complete instructions there; the key is to follow the directions literally: our XML file did not use a "namespace" so there is also no "prefix." It is easy to mistake the xsd prefix in the po.xsd file for the "Prefix" requested by the pop-up window. That is incorrect: the "Namespace" and "Prefix" described in the pop-up window refer to the po.xml file. Most of the time, your .xml file will use a namespace and a prefix; in our effort to create a "simple" example, we may have inadvertently made it complex! The additions inside the .xml file are xsi:noNamespaceSchemaLocation="po.xsd" and xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance", placed in what was <purchaseOrder orderDate="2003-10-20">
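After the assignment, the start of po.xml should look roughly like the sketch below. The two attributes are the ones named in the notes above; the children of purchaseOrder are unchanged and are omitted here, and the exact attribute order and line breaks that Studio writes may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<purchaseOrder orderDate="2003-10-20"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="po.xsd">
  <!-- the original child elements of the purchase order go here, unchanged -->
</purchaseOrder>
```

Note that the xsi prefix is declared in the instance document and has nothing to do with the xsd prefix used inside po.xsd.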
7-36 Introduction to XML Copyright IBM Corp. 2001, 2004
Course materials may not be reproduced in whole or in part without the prior written permission of IBM.
The vertical line in the "Namespace Name:" fill-in is the cursor that was visible when the screenshot was made.
What's Next?
That's it for an introduction to XML Schema via a simple example
There is still a lot of additional specificity to be added to the poBetter.xsd file before we would want to use it in the real world; for example:
How many street/street2 entries should there be?
Can we define some expression to force telephone numbers to have an area code, an exchange, and a four digit number?
. . .others???
Our next step is to introduce the syntax and nomenclature for the majority of constructs you will find useful in actually creating and using schemas in most common situations
Again, for emphasis, the W3C Web site is the only normative source
There are two normative parts:
http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/
http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/
And one non-normative primer:
http://www.w3.org/TR/2001/REC-xmlschema-0-20010502/
Notes:
There are many ways of writing a valid schema. The primer on the W3C website provides a considerably different schema from the one we have shown you. The XSD language is as complex as any spoken or written language. ...and like any other language, it is not necessary to know everything about a language in order to use it to communicate. Most of the more advanced concepts will become part of your vocabulary through necessity and practice. Normative implies compliance is required; it defines usage, for example. Non-normative implies that, while the usage is correct not all possibilities are covered, for example. There is also the URL for the requirements we quoted at the beginning of this part.
. . .but first. . .
Break!
Notes:
This chart repeats some information stated at the beginning of the unit. Again we caution you against expecting detailed treatment of these topics. Do not confuse common with basic: some quite advanced concepts are part of many constructs. The XSD language is as extensive as the English language; even though you may know the difference between an appositive and a noun, you must realize the average person does not. Similarly, it is not necessary to master every nuance of the XSD language in order to create useful XSD documents. In creating these notes we make every effort to use the language of the W3C specification. Studio can be of inestimable value to us on our journey to mastering XSD documents, as you shall see both in the lecture and the accompanying lab. So relax, and let us proceed.
Notes:
We could spend hours trying different combinations in class. This is one of those instances where you are encouraged to explore on your own. Note, too, that some very complex concepts are part of many of the choices Studio suggests. We recommend you examine the examples provided here to see which constructs we typically need.
Simple types
Do not allow elements
May not carry attributes
Notes:
It's not as confusing as it sounds. Definitions allow us to define our own types that may apply to our unique situations. For example, things that differ from country-to-country such as phone number and postal codes: we can define a USPostalCode that demands ddddd-dddd, where d = digit and we could even make the last four digits optional; and we could define a CNPostalCode that would apply only to postal codes consistent with Canadian postal codes. Declarations, on the other hand, define what is legal in the document instances; as you can guess, definitions play a key role in defining the declarations to apply!
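The USPostalCode idea can be sketched with a pattern facet. This is one possible definition under stated assumptions: the base type xsd:string and the exact regular expression are our choices, and the zip declaration is included only to show a definition being used by a declaration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Definition: five digits, optionally followed by a hyphen and four digits -->
  <xsd:simpleType name="USPostalCode">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[0-9]{5}(-[0-9]{4})?"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Declaration: zip elements in the instance must conform to the definition -->
  <xsd:element name="zip" type="USPostalCode"/>

</xsd:schema>
```

With this schema, <zip>12345</zip> and <zip>12345-6789</zip> are valid, while <zip>1234</zip> is not. A pattern in XML Schema must match the whole value, so no start or end anchors are needed.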
Notes:
We will use this structure to introduce the details of the 13 kinds of schema component. This will be slightly confusing because different perspectives are best dealt with using different organizations. This is a classic example of the difficulty with a language: there is no beginning and no end -- only middle!
Beware: schema component may be treated like a reserved word in that it may refer to an element in a .xsd file that is a child of schema used as an element (specifically, what we have previously called the root element). In other contexts it may refer to any of the things that can appear in a schema document.
Resetting Expectations
The XML Schema Specification represents several hundred pages
Every one is important for some situation
Not every one will be encountered in routine operations
This is an "Overview" course
We will describe common usages here
Additional, less common constructs are in the Appendix
Refer to the three-part official specification for details.
The chart that follows is an example
Only the 1st of the 3 charts is presented in this unit
The remaining two, specification level charts have been moved to the Appendix
Notes:
For additional information: http://www.w3.org/TR/xmlschema-1/#Simple_Type_Definitions
Notes:
This foil shows all the built-in datatypes in XML Schema. They are divided into two categories: primitive types, and types that are derived from those primitive types. Let's look at the primitive types first. The blue types are the types required for the XML 1.0 specification. Numeric types are shown in red, and time/date types are shown in green. In the first row we have string and three numeric types. Decimal is an arbitrary-precision decimal number. Float and double correspond to the IEEE floating-point types of the same names. In the next row there is a boolean type, and the timeDuration and recurringDuration types, which form the base for all time- and date-related types. The third row covers types from XML 1.0 as well as the QName type from the XML Namespaces recommendation. In the fourth row we see uriReference, which corresponds to URIs, and binary, which corresponds to encoded binary data. Note that several of these names come from a draft of the specification: in the 2001 Recommendation, timeDuration became duration, recurringDuration was replaced by dateTime and related primitive types, uriReference became anyURI, and binary was split into hexBinary and base64Binary.
Now let's look at the derived types. The first two rows of the table show types from XML 1.0 and the Namespaces recommendation. In the third row, token is a tokenized string, and language must be a language identifier string as defined in the XML 1.0 (Second Edition) Recommendation. The next four rows include a variety of convenient numeric types that are restrictions of one of the primitive numeric types. Likewise, the next three rows are convenient date and time types that are restrictions of the primitive date/time types.
Notes:
New simple types can be created by constraining an existing simple type.
Facets
Facets characterize properties of a simple type's value space or lexical space.
Value Space - abstract set of values in the type
Lexical Space - concrete set of literals that you can write down
Fundamental Facets
equal, ordered, bounded, cardinality, numeric
Constraining (non-fundamental) Facets
length, minLength, maxLength
pattern, enumeration
whiteSpace
maxInclusive, maxExclusive, minInclusive, minExclusive
totalDigits, fractionDigits
Notes:
Facets characterize properties of a simple type's value space or lexical space. The value space of a simple type contains the values that the type represents (the set of rationals between 0 and 1 is a value space). Simple types also have a lexical space, which is the set of literals that make up the written/printed representation of the type (an example of a lexical space is the set of base 16 literals). Facets are like attributes on a type. They always exist for a type but may not have been set to any specific value. The facets, which can be set as part of the definition of a type, are listed below.
Fundamental Facets (these facets are abstract and are there for completeness in modeling datatypes; their values cannot be changed):
equal, ordered, bounded, cardinality, numeric
Constraining Facets (you can specify values for these facets to constrain an existing type):
length, minLength, maxLength (these apply to types where the notion of length is meaningful, such as string and binary)
pattern, enumeration (pattern applies to all data types and constrains the lexical space; enumeration applies to nearly all of them, boolean being the notable exception)
whiteSpace (controls whitespace processing behavior; it is only changeable for string)
maxInclusive, maxExclusive, minInclusive, minExclusive (these apply to all the numeric and date/time types)
totalDigits, fractionDigits (these apply to decimal and its derived types: totalDigits specifies the total number of digits, and fractionDigits specifies the number of digits in the fractional part)
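A short sketch may help show how constraining facets are written. The type names priceType and stateCode and the particular limits are hypothetical; they were chosen only to exercise a few of the facets listed above.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- A non-negative amount with at most 7 digits, 2 of them after the decimal point -->
  <xsd:simpleType name="priceType">
    <xsd:restriction base="xsd:decimal">
      <xsd:minInclusive value="0"/>
      <xsd:totalDigits value="7"/>
      <xsd:fractionDigits value="2"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- A string that must be exactly two characters long -->
  <xsd:simpleType name="stateCode">
    <xsd:restriction base="xsd:string">
      <xsd:length value="2"/>
    </xsd:restriction>
  </xsd:simpleType>

</xsd:schema>
```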
The new type quantityType is now used to define the type for the element quantity
Notes:
In this schema, we show a <simpleType> tag in addition to an <element> tag. The <simpleType> tag defines a new simple type called 'quantityType', which is a restriction of the built-in type 'integer'. Integer is the value of the base attribute on the <restriction> sub tag of the <simpleType> tag; the base attribute specifies the base type for the restriction. Inside the <restriction> tag is the tag <minInclusive>. This means that 'quantityType' is giving a new value for the minInclusive facet of 'integer': a 'quantityType' value can range from a minimum of 0 up to the maximum value for integers. The simple type being restricted can be either a built-in type or a previously defined simple type. The <element> tag declares a new element called 'quantity' whose content must match the rules for 'quantityType'.
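The schema these notes describe can be reconstructed along the following lines. This is a sketch based only on the description above (names, base type, and the minInclusive value of 0); the layout on the original chart may differ.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- New simple type: an integer that may not be less than 0 -->
  <xsd:simpleType name="quantityType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="0"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- The element quantity takes its rules from quantityType -->
  <xsd:element name="quantity" type="quantityType"/>

</xsd:schema>
```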
Valid XML:
<color>red</color>
Invalid XML:
<color>mauve</color>
<color>10</color>
Notes:
Nearly all simple types have the enumeration facet (boolean is the notable exception). The enumeration facet allows the definer of a simple type to write out exactly which values of the base type are allowed in the restriction. In this example, the restriction has enumerated a set of color names. The value attribute of the enumeration tag specifies the value to be included in the enumeration.
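An enumeration of this kind might be written as in the sketch below. Only 'red' is confirmed as a legal value by the valid example; the other two color names and the type name colorType are assumptions made for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Only the listed strings are legal values -->
  <xsd:simpleType name="colorType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="red"/>
      <xsd:enumeration value="green"/>
      <xsd:enumeration value="blue"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="color" type="colorType"/>

</xsd:schema>
```

Under this definition <color>mauve</color> and <color>10</color> are rejected because neither value appears in the list.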
Notes:
A complex type needs to indicate what kind of content it is going to contain. This description is called a content model. The content model describes only the content of the complexType; the complex type may include other information such as attribute declarations. There are four kinds of content models available in XML Schema. Here are the four kinds, along with the method for specifying them in a complexType definition. The simple content model contains typed character data. It is always used by extending or restricting some base simple type. The empty content model means that no content at all is allowed. Usually elements with an empty content model carry their data as attributes. The element-only content model allows child elements but no character data, and the mixed content model allows both.
simpleContent Example
Declaration:
<xsd:element name="quantity">
  <xsd:complexType>
    <xsd:simpleContent>
      <xsd:extension base="xsd:nonNegativeInteger">
        <xsd:attribute name="backorderable" type="xsd:boolean" default="false"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:element>
Valid XML fragment:
<quantity backorderable="true">1</quantity>
Invalid XML fragment:
<quantity orderable="true">2</quantity>
The extension is the addition of the attribute.
Notes:
This visual shows the definition of a complex type. It uses only the most basic features of complex types. The schema declares an element called quantity. The definition of the complex type for quantity is embedded in the declaration of quantity. In this case, the complex type has the simpleContent content model, which says that the complex type allows only character data to be present in the element content. That simpleContent is based on the simple type 'nonNegativeInteger' and is going to be extended, which means that we add attributes to the type. We could instead use a restriction, which narrows the values that are allowed. Here the simpleContent content model extends 'nonNegativeInteger' by adding an attribute named 'backorderable'. There is no other extension being performed in this example. In the coming visuals we are going to introduce some concepts that we need in order to show more complicated complex type definitions.
Notes:
For additional information: http://www.w3.org/TR/xmlschema-1/#Complex_Type_Definitions
As you can perceive, complex type definitions will provide a primary means of controlling the content of XML instances.
The example above includes a name (PurchaseOrderType); it is also possible to define an anonymous complexType by dropping the "name='PurchaseOrderType'" and including the result as a child of an element:
<xsd:element name="PurchaseOrderType"> ... </xsd:element>
Notes:
We're now ready to officially look at element declarations. There are two declarations on this foil. Element declaration with simple type that occurs exactly once. The first declaration is for an element called quantity whose type is the simple type nonNegativeInteger and which is allowed to appear exactly once. We'll talk more about the minOccurs and maxOccurs attributes in a few minutes. Element declaration with local simpleType definition. The second declaration is for an element called quantity whose type is a local simple type with the names of three colors as its values.
Named types ARE reusable in other parts of the Schema, provided that they are declared as direct children of the schema element.
There are three different compositors:
xsd:sequence: the elements may occur only in the order specified according to their min/maxOccurs values
xsd:choice: only one of the elements declared may occur but its min/maxOccurs values dictate how many times it may occur (very hard to do in a DTD)
xsd:all: the elements must all occur (in accordance with their min/maxOccurs values) but order doesn't matter (also very hard to do in a DTD)
Figure 7-52. Declaring Child Elements in complexType Elements
minOccurs and maxOccurs are used to indicate the number of elements required/permitted. Note: The typename must be namespace qualified. If the type is not associated with the namespace of the schema element it must be qualified with a namespace prefix representing the proper namespace.
Notes:
The declaration is for an element called quantity whose type is the simple type nonNegativeInteger and which is allowed to appear exactly once. We'll talk more about the minOccurs and maxOccurs attributes in a few minutes. If we haven't already so stated, the xsd: prefix is what tells the processor where to look for validation information. Tying it to the W3C 2001 XML Schema makes that specification normative for the purposes of validation. Remember, the schema element is the root or main element.
Components that can have a minOccurs or maxOccurs attribute are:
Elements.
Groups.
xsd:all, xsd:sequence, and xsd:choice compositors.
Wildcards.
Both values default to 1.
It is an error if minOccurs is greater than maxOccurs; that is, maxOccurs must always be greater than or equal to minOccurs.
maxOccurs may have the non-numeric value unbounded (no upper limit).
Notes:
XML Schema provides the minOccurs and maxOccurs attributes as a way of controlling the number of times that a particular component may appear. The minOccurs attribute specifies the minimum number of times that a component may occur. If the value of minOccurs is greater than zero, then the component is required. If the value of minOccurs is zero, then the component is optional. The default value for minOccurs is 1. The maxOccurs attribute specifies the maximum number of times that a component may occur. The special value 'unbounded' means that the component may appear an unlimited number of times. This corresponds to the asterisk in DTDs when minOccurs is 0, and to the plus sign when minOccurs is 1. The default value for maxOccurs is 1. minOccurs and maxOccurs apply only to the component that carries the minOccurs or maxOccurs attribute.
Valid XML:
<DNASample>
  <sample>GATCTATC</sample>
  <sample>ATAAACG</sample>
</DNASample>
Invalid XML:
<DNASample>
  <sample>ATGCAAT</sample>
</DNASample>
Notes:
Here we define a version of DNASample that can take between 2 and 500 samples. We use minOccurs and maxOccurs to enforce this rule.
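A declaration that enforces the 2-to-500 rule could look like the sketch below. The element names come from the examples above; typing sample as xsd:string and using an inline complex type are assumptions.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="DNASample">
    <xsd:complexType>
      <xsd:sequence>
        <!-- at least 2 and at most 500 sample children -->
        <xsd:element name="sample" type="xsd:string"
                     minOccurs="2" maxOccurs="500"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

The invalid fragment fails because it contains a single sample, which is below the minOccurs value of 2.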
Declaring Attributes
Attributes are declared as children of a complexType element.
<xsd:element name=...>
Some useful (optional) attributes used in an xsd:attribute declaration are:
use: values can be required, prohibited, or optional; optional is the default
default: provides a default value for the attribute when it is absent
fixed: fixes the attribute value to the value specified
Some rules:
The attribute type must be a simpleType (that is, it cannot contain elements)
fixed and default may not be present together on xsd:attribute.
The use attribute must be "optional" or absent when a default is provided.
Notes:
Attributes are declared by placing an <attribute> element inside of the <complexType> element. The name attribute specifies the name of the attribute, the type attribute specifies its type. The use attribute tells whether the attribute is 'optional' or 'required'. The default attribute tells what the default value will be if the attribute is omitted in the XML document. Here are examples of common attribute declarations. Optional attribute with default. The first declaration shows how to declare an optional attribute whose value must be an integer. If the attribute does not appear in the instance document, it will be given a default value of 10.
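An optional attribute with a default might be declared as in this sketch. The integer type and the default of 10 follow the notes above; the element name widget and the attribute name count are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="widget">
    <xsd:complexType>
      <!-- optional integer attribute; treated as 10 when it is omitted -->
      <xsd:attribute name="count" type="xsd:integer"
                     use="optional" default="10"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Because use="optional" is itself the default, the use attribute could be left out without changing the meaning.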
Valid XML:
<blank temperature='32.0'/> (preferred)
<blank/> <!-- temperature='32.0' -->
Invalid XML:
<blank temperature='34.0'/>
Notes:
Optional fixed attribute (using default value for use). The fixed attribute specifies the fixed value that the attribute must have if a value is specified by a document. This declaration shows how to declare an optional attribute with a fixed numeric value. If the attribute does appear in the instance document, it must have the value 32.0; if it is absent, the processor supplies that value.
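A declaration consistent with the blank/temperature examples might look like the sketch below. The names blank and temperature come from the examples; the choice of xsd:decimal as the type is an assumption, since the chart itself is not reproduced here.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="blank">
    <xsd:complexType>
      <!-- optional (use is omitted); any value supplied must equal 32.0 -->
      <xsd:attribute name="temperature" type="xsd:decimal" fixed="32.0"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```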
<xsd:element name="person">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="id" type="xsd:integer"/>
    </xsd:sequence>
    <xsd:attribute name="criminal" type="xsd:boolean" default="false"/>
  </xsd:complexType>
</xsd:element>
"boolean" can be true/false (case sensitive) or 1/0.
Notes:
In this example we declare an element named 'person' whose complex type is defined inline. The type has the element-only content model, as shown by the use of the <sequence> compositor inside the <complexType> element. It also declares a single attribute named 'criminal'. The same type could instead be defined globally under a name such as 'personType' and then referenced from the declaration of 'person'.
Notes:
The first invalid example is invalid because <id> and <name> are in the wrong order. The second invalid example is invalid because there's no attribute called 'friend'.
Notes:
Required attribute with local type definition. There is no type attribute for this <attribute> element because the simple type definition is embedded in the attribute declaration. This declaration shows how to declare a required attribute. This declaration also includes an embedded simple type definition that specifies a restriction of the integers to integers greater than or equal to zero.
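The following sketch shows one way such a declaration could be written. The element name shelf and the attribute name stock are hypothetical; the required use and the restriction of integer to values of zero or more follow the notes above.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="shelf">
    <xsd:complexType>
      <!-- no type= here: the simple type is defined locally instead -->
      <xsd:attribute name="stock" use="required">
        <xsd:simpleType>
          <xsd:restriction base="xsd:integer">
            <xsd:minInclusive value="0"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:attribute>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

A local type like this one cannot be reused elsewhere; if the same restriction were needed in several places, a named simple type would be the better choice.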
Attribute Groups
If a group of attributes is used together often, an attribute group can be created to formalize the relationship and avoid the need to declare the same attributes in several places.
<xsd:attributeGroup name="addressInfo">
  <xsd:attribute name="street" type="xsd:string" use="required"/>
  <xsd:attribute name="city" type="xsd:string" use="required"/>
  <xsd:attribute name="state" type="xsd:string" use="required"/>
  <xsd:attribute name="zip" type="xsd:string" use="required"/>
</xsd:attributeGroup>
A minimal model group is defined and used by reference, first as the whole content model, then as one alternative in a choice.
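The two uses described above can be sketched as follows. All names here (someThing, anotherThing, myModelGroup, trivial, moreSo) are illustrative placeholders, and the group is kept to a single element so that the example stays self-contained.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="someThing" type="xsd:string"/>
  <xsd:element name="anotherThing" type="xsd:string"/>

  <!-- A named model group, defined once as a child of xsd:schema -->
  <xsd:group name="myModelGroup">
    <xsd:sequence>
      <xsd:element ref="someThing"/>
    </xsd:sequence>
  </xsd:group>

  <!-- First use: the group is the whole content model -->
  <xsd:complexType name="trivial">
    <xsd:group ref="myModelGroup"/>
  </xsd:complexType>

  <!-- Second use: the group is one alternative in a choice -->
  <xsd:complexType name="moreSo">
    <xsd:choice>
      <xsd:element ref="anotherThing"/>
      <xsd:group ref="myModelGroup"/>
    </xsd:choice>
  </xsd:complexType>

</xsd:schema>
```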
3.1 Annotations
Annotations provide for human- and machine-targeted annotations of schema components. Example. XML representations of three kinds of annotation:
<xsd:simpleType fn:note="special">
  <xsd:annotation>
    <xsd:documentation>A type for experts only</xsd:documentation>
    <xsd:appinfo>
      <fn:specialHandling>checkForPrimes</fn:specialHandling>
    </xsd:appinfo>
  </xsd:annotation>
  ...
</xsd:simpleType>
Using the wizard in Studio can make using annotation much more user-friendly.
Notes:
Model groups allow you to group a set of element declarations together and use them in multiple places. There are three kinds of model groups, which are distinguished by a compositor element, as illustrated on this foil.
Model Group with sequence compositor. The most straightforward kind of model group uses the sequence compositor, which simply states that all the elements must appear in the same order as the element declarations in the model group.
Model Group with choice compositor. Also straightforward is the choice compositor, which states that any element matching one of the element declarations in the group may appear in the instance document.
The sequence and choice model groups also appear in XML 1.0 DTDs.
Model Group with all compositor. New in XML Schema is the all compositor, which says that all the elements specified in the group must appear in the instance document, but the elements are allowed to appear in any order. The values of compositor were described earlier as varieties. The term compositor occurs in Part 1 of the 2001 XML Schema specification.
Notes:
Element Declaration that uses a global complexType, which itself uses a global model group
On this foil we are building up a set of components to be used by the element declaration at the bottom of the foil. First we define a model group called 'vitals', which contains two element declarations. This group uses a sequence compositor. Next we define the complexType called 'personType', which references the model group 'vitals' to pick up its content type. It also adds an optional Boolean attribute. Finally, we declare the element 'person' and have it reference the global complexType 'personType' that we defined earlier.
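The chain of references described above (element, then complexType, then model group) can be followed programmatically. The sketch below embeds a schema of that shape as a string and walks the references with Python's standard ElementTree module. The names 'vitals', 'personType', and 'person' come from the notes; the two element names inside the group and the attribute name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: name, birthDate, and employed are assumed names.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:group name="vitals">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="birthDate" type="xs:date"/>
    </xs:sequence>
  </xs:group>
  <xs:complexType name="personType">
    <xs:group ref="vitals"/>
    <xs:attribute name="employed" type="xs:boolean" use="optional"/>
  </xs:complexType>
  <xs:element name="person" type="personType"/>
</xs:schema>
"""

schema = ET.fromstring(SCHEMA)


def global_component(kind, name):
    """Find a top-level schema component such as xs:group or xs:complexType."""
    for node in schema.findall(XS + kind):
        if node.get("name") == name:
            return node
    raise KeyError(name)


# Follow the references: element -> complexType -> model group.
person = global_component("element", "person")
person_type = global_component("complexType", person.get("type"))
group = global_component("group", person_type.find(XS + "group").get("ref"))

content = [e.get("name") for e in group.iter(XS + "element")]
attributes = [a.get("name") for a in person_type.findall(XS + "attribute")]
print(content, attributes)
```

The code only reads the schema as an XML document; it does not validate instances against it.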
. . .but first. . .
Break!
The first four attributes plus targetNamespace are a few of the 34 possible attributes reserved specifically for the XML Schema element itself. Note that attributeFormDefault and elementFormDefault both default to unqualified.
Copyright IBM Corporation 2004
Notes:
Because attributeFormDefault and elementFormDefault are both unqualified by default, we do not have to include prefixes on locally declared elements and attributes unless there is some reason to do so.
Now the <quantity> element will be "in" the namespace with the URI: http://www.ibm.com/Schemas/WD03/target
The value of the default namespace, xmlns, and the targetNamespace are the same.
Notes:
targetNamespace attribute
The first piece of XML Schema's namespace support allows us to declare that a set of schema components is associated with a particular namespace.
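As a rough illustration of what being "in" a namespace means, the sketch below parses a small schema and a matching instance with Python's standard ElementTree module. The target namespace URI and the quantity element come from the foil; the xs:integer type and the instance value are assumptions.

```python
import xml.etree.ElementTree as ET

TARGET_NS = "http://www.ibm.com/Schemas/WD03/target"

# The default namespace (xmlns) and the targetNamespace share the same URI.
SCHEMA = f"""
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="{TARGET_NS}"
           targetNamespace="{TARGET_NS}">
  <xs:element name="quantity" type="xs:integer"/>
</xs:schema>
"""

# An instance document places <quantity> in that namespace.
INSTANCE = f'<quantity xmlns="{TARGET_NS}">5</quantity>'

schema = ET.fromstring(SCHEMA)
quantity = ET.fromstring(INSTANCE)

# ElementTree writes a qualified name as {namespace-URI}local-name.
print(schema.get("targetNamespace"))
print(quantity.tag)
```

The printed tag shows that the element name is the pair of namespace URI and local name, not the local name alone.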
where schemaLocation is a legal URI To reference the schema on the previous slide use:
Whitespace
Notes:
schemaLocation attribute
The XML Schema recommendation does not provide a definitive mechanism for an XML processor to use to locate the schema components associated with a namespace. Instead, it provides the schemaLocation attribute in the XML Schema Instance namespace. We'll cover the XML Schema Instance namespace in greater detail in a few moments.
The schemaLocation attribute is a list of pairs of URIs. The first URI in each pair is a namespace URI. The second URI in each pair is a URI that can be resolved to find the schema for the namespace URI specified by the first element of the pair. Thus, a single schemaLocation attribute can specify the locations of the definitions for multiple namespaces. The schemaLocation attribute may appear on any element in the instance document, as long as it appears before any element in the namespaces that it is providing location hints for.
One more important thing to note is that the schemaLocation attribute provides a hint. A particular schema processor is permitted to ignore the schemaLocation hints, or provide its own method for locating the schema components associated with a particular namespace.
Lastly, there is a variant of schemaLocation, called noNamespaceSchemaLocation, which should be used when a set of schema components is not associated with a target namespace. This version does not use a list of pairs, since there is no namespace involved. It simply accepts a URI that can be used to locate the schema. We saw an example back in Part I.
The example on this foil shows a hint that says the schema components for the namespace http://www.ibm.com/Schemas/WD03/target can be found by resolving and processing the URI http://www.ibm.com/Schemas/WD03/target.xsd.
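The "list of pairs of URIs" can be made concrete with a few lines of Python that read the attribute and split it into namespace-to-location pairs. This is a minimal sketch using the standard ElementTree module; the instance document and the helper name schema_location_hints are hypothetical, and a real processor remains free to ignore the hints.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance document carrying a schemaLocation hint.
INSTANCE = """
<quantity xmlns="http://www.ibm.com/Schemas/WD03/target"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.ibm.com/Schemas/WD03/target
                              http://www.ibm.com/Schemas/WD03/target.xsd">5</quantity>
"""


def schema_location_hints(element):
    """Return {namespace URI: schema URI} from xsi:schemaLocation, if present."""
    raw = element.get("{%s}schemaLocation" % XSI, "")
    uris = raw.split()  # the URIs are separated by whitespace
    if len(uris) % 2:
        raise ValueError("schemaLocation must contain pairs of URIs")
    return dict(zip(uris[0::2], uris[1::2]))


root = ET.fromstring(INSTANCE)
hints = schema_location_hints(root)
print(hints)
```

Because the separator is plain whitespace, the two URIs of a pair may sit on separate lines, as they do here.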
Best Practices (1 of 2)
complexTypes versus elements
- Use types if you need to reuse both model and attribute group combinations.
complexTypes versus model groups
- Use model groups if you are only reusing content model fragments.
complexTypes versus attribute groups
- Use attribute groups if you are only reusing sets of attribute definitions.
Local versus global types
- Use global types if you need reuse, local types if you need to keep things nicely scoped.
Notes:
complexTypes versus elements
Use types if you need to reuse model and attribute group combinations. Recall that a complex type has both a content model and a set of attributes associated with it.
complexTypes versus model groups
Use model groups if you just need to reuse content model fragments. Here we're following the principle of using the least powerful tool for getting the job done. This will also have performance implications because there is less processing during schema validation.
complexTypes versus attribute groups
Use attribute groups if you just need to reuse sets of attribute definitions. The reasoning here is similar to the above.
Local versus global types
Use global types if you need reuse, local types if you need to keep things nicely scoped. Using global types is a requirement if you want to reuse the type across element declarations. If you want to keep related components under tighter control, then you should use local types, as that restricts the effect of your definitions.
Best Practices (2 of 2)
Namespaces
- Always put schemas in a namespace.
Import versus wildcards
- Use import when you want to use the imported types in your schema.
- Use wildcards when you just want to allow elements to appear.
Notes:
Namespaces
Always put schemas in a namespace. There's no good reason not to put your schema in a namespace, and if you ever end up interacting with another company/entity, you're going to want to have your schema in its own namespace. This also lets you place your declarations in another file.
Import versus wildcards
Use import when you want to use the imported types in your schema. Although we didn't cover it, you can use XML Schema's inheritance features to derive new types from imported types. Use wildcards when you just want to allow elements to appear. Wildcards have a different effect because they allow you to control individual elements or subtrees of the document.
References
Resource: http://www.w3c.org/XML/Schema
Description: Information on tools, status of the spec., links to useful info
Resource: http://www.w3.org/TR/xmlschema-0
Resource: http://www.xfront.com/xml-schema.html
Description: Schema tutorial
Resource: www.alphaworks.ibm.com
Description: Visual DTD - includes support for XML Schema
Unit Summary
Having completed this unit, you should understand:
- The reasons for using XML Schema
- The important features of XML Schema
- How to define the grammar rules for a document in XML Schema
- Best practices for using XML Schema
- Status of XML Schema at W3C
Notes:
In this section, you have been exposed to:
- The basic functionality of XML Schema
- Simple type definitions
- Complex type definitions
- Attribute definitions
- Model Group definitions
- Attribute Group definitions
- Element declarations
- XML Schema namespace functionality
- Import
- any
- Best practices for using XML Schema
Copyright IBM Corp. 2001, 2004
Course materials may not be reproduced in whole or in part without the prior written permission of IBM.
- Status of XML Schema at W3C
- Current state of tools for working with XML Schema
Unit Objectives
After completing this unit, you should be able to:
- Describe the reasons for using XPath
- Define the components and constructs that make up the XML Path Language
- Write simple XPath expressions
- Identify abbreviated XPath expressions
- Describe how to partition the XPath document
- Define the current status of XPath in industry
What Is XPath?
- A specification for querying an XML document.
- Does not have an implementation independent of other standards/technologies.
- Used by XSLT, XPointer, and other emerging technologies, such as XQuery.
- Often when processing XML, we need to address (locate) a portion of the document, or elements of it, that meet specified criteria.
  Example: In XML for a book on Java, find the chapters with JDBC in the title.
- Provides the ability to address any slice of an XML document in any direction: forwards, backwards, or sideways.
- W3C Recommendation (Nov. 16, 1999).
Notes:
XPath was defined during the development of XSLT (XSL Transformations) and XPointer. It was designed to provide unambiguous traversal of XML documents. Both XPointer and XSLT use XPath's functionality: XSLT uses only a subset of XPath, while XPointer uses additional syntax mechanisms to extend its functionality.
XPointer allows forward and backward addressing to specific XML locations internal to a document and to locations in external XML documents. Think of this as a super-enhanced version of HTML's HREF linking.
XQuery is an emerging technology that is intended to provide standardized query access to XML data, including data held in RDBMS data stores.
Notes:
Paths are a natural way to express a hierarchical structure. DOS and Windows actually use a backslash to represent the path separator. URIs, XPath, and most other path addressing schemes use a forward slash, as the backslash is used to escape special characters. Example: '\t' represents a TAB character.
[Figure: an XML document shown as a tree of nodes. Recoverable labels: ROOT, <book>, <author>, <title>, "Tom Wolfe", "$6.00"; address = "/book/price/text()"]
Notes:
This example shows a typical XML document and how it is represented as a tree of nodes. This conceptual depiction of XML is important to understand. As you can see in the tree diagram, there is a single root node that contains several other types of nodes. There are a total of seven node types in the XPath view of an XML document. They are:
- root nodes
- element nodes
- text nodes
- attribute nodes
- namespace nodes
- processing instruction nodes
- comment nodes
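The tree view can be tried out with Python's standard ElementTree module. This is a sketch under assumptions: the document below is hypothetical (only the "Tom Wolfe" and "$6.00" text values appear on the foil, and their placement under author and price is assumed), and ElementTree implements only a small subset of XPath. In particular it has no text() node test and evaluates paths relative to an element, so the address /book/price/text() is approximated by stepping to price from the book element and reading its text.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the title text is an assumption.
BOOK = """
<book>
  <author>Tom Wolfe</author>
  <title>Example Title</title>
  <price>$6.00</price>
</book>
"""

book = ET.fromstring(BOOK)  # the <book> element node

# Element nodes directly under <book>, in document order.
children = [child.tag for child in book]

# Counterpart of /book/price/text(): step to <price>, then read its text node.
price_text = book.find("price").text
print(children, price_text)
```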
Notes:
Think of the XPath expression as a series of steps through the XML tree. Each step is a rung in the ladder, or layer of the tree. Wildcards permit a single step to represent many layers, much like skipping several rungs when climbing down the ladder.
[Figure: axes relative to the context node (Self). Recoverable labels: .../Ancestor, /Parent, preceding-sibling, following-sibling, /Child, /Descendant/...]
Note: Self is always a single node. It can only have one parent and one root. It may have multiple children, ancestors and so forth.
Notes:
The current context is simply a "you are here" designation within a complete XPath address. As an XPath expression is evaluated, the current context is likely to shift. Relative paths don't make sense as stand-alone entities; they must be combined with some other context that is itself based on the document's root.
An XPath location path is made up of one or more steps separated by a forward slash ("/"). Each step within the path consists of:
- Axis: Branch of the node tree relative to the current context node.
- NodeTest: Tests node for inclusion.
- Predicate: Optional filter of matched nodes.
Example: Locate all chapter titles in the book that contain the string 'XPath'
/book/child::chapter/child::title[contains(text(),'XPath')]
Notes:
XPath uses a path notation similar to URLs. Location paths are specified using a forward slash ("/") separated list of steps. XPath provides a simple method to traverse an XML tree structure, and to select a slice of information in any direction as defined by the Axis. Paths starting with a forward slash are absolute paths from the root downward through the document tree; paths not beginning with a slash are relative to the current (context) node of the node list.
Notes:
The XPath statement can be expressed with the full syntax, or can be abbreviated. Many axes have an abbreviated syntax. "child::" is the most common, and therefore has an empty abbreviation. All axes and abbreviations are discussed later.
An absolute path is addressed from the document's root.
/child::catalog/child::tools - The full syntactic expression that returns all tools element children of the catalog element that appears under the document's root; short form - /catalog/tools
A relative path is based on the current context of the addressing path.
child::tools/child::saw - The full expression of a path relative to the context node; short form - tools/saw.
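The relationship between the short and full forms can be illustrated with a toy expander that rewrites an abbreviated location path into its full syntax. This is a sketch, not an XPath parser: the function names are hypothetical, and paths whose predicates contain a slash are not handled.

```python
def expand_step(step):
    """Expand one abbreviated step into its full axis::node-test form."""
    if step == ".":
        return "self::node()"
    if step == "..":
        return "parent::node()"
    if step.startswith("@"):
        return "attribute::" + step[1:]
    if "::" in step:
        return step  # already written with an explicit axis
    return "child::" + step  # child:: is the default axis


def expand_path(path):
    """Expand an abbreviated location path; predicates are not handled."""
    # '//' is short for '/descendant-or-self::node()/'.
    path = path.replace("//", "/descendant-or-self::node()/")
    absolute = path.startswith("/")
    steps = [s for s in path.split("/") if s]
    return ("/" if absolute else "") + "/".join(expand_step(s) for s in steps)


print(expand_path("/catalog/tools"))
print(expand_path("tools/saw"))
print(expand_path("//title"))
print(expand_path("../@status"))
```

The first two calls reproduce the full forms quoted in the notes above.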
[Figure: tree of the sample document with numbered XPath expressions. Recoverable labels: root, section, title]
Notes:
Here are the results of running Studio's XPath Expression Wizard using the expressions shown against the #document root.
1. produced: <title>Sect.1.2 Title</title>
2. produced: <title>Ch. 1 Title</title> <title>Ch. 2 Title</title>
3. produced: <title>Ch. 1 Title</title> <title>Ch. 2 Title</title> <title>App.A.1 Title</title>
[Figure: tree of the sample paper document with numbered XPath expressions. Recoverable labels: root, paper, chapter, section, @status, title, appendix, "Sect.1.1 Title"]
Notes:
1. produces: <title>Sect.1.2 Title</title> <title>Sect.2.2 Title</title> <title>App.A.1.1 Title</title>
2. produces: <title>Sect.1.1 Title</title> <title>Sect.2.1 Title</title>
3. produces: <title>Sect.1.1 Title</title>
If you replace /title in 3 with /text() the answer is "Section 1.1".
How do we know we really moved to /paper/chapter[2]/section[1] ? Add text() as a node test to produce: "Section 2.1" when we execute.
Notes:
Remember we bring up the XPath Expression Wizard by selecting the appropriate XML instance and right-clicking to open the context-sensitive menu from which we choose Generate (near the bottom of the menu) -> XPath. . . If you're unsure about your location, position the wizard at the root and prepend the relative path to create an absolute path. Test your results both ways. See if the result is unchanged.
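Outside Studio, the same kind of check can be approximated with Python's standard ElementTree module, which supports positional predicates such as chapter[2]. The document below is a reduced, hypothetical version of the sample paper, so its contents are assumptions. ElementTree evaluates paths relative to an element and has no text() node test, so the element's text is read through the .text property instead.

```python
import xml.etree.ElementTree as ET

# Reduced, hypothetical version of the sample paper document.
PAPER = """
<paper>
  <title>Paper Title</title>
  <chapter>Chapter 1
    <title>Ch. 1 Title</title>
    <section>Section 1.1<title>Sect.1.1 Title</title></section>
    <section>Section 1.2<title>Sect.1.2 Title</title></section>
  </chapter>
  <chapter>Chapter 2
    <title>Ch. 2 Title</title>
    <section>Section 2.1<title>Sect.2.1 Title</title></section>
    <section status="Sect.2.2 Status">Section 2.2<title>Sect.2.2 Title</title></section>
  </chapter>
</paper>
"""

paper = ET.fromstring(PAPER)  # context node: the <paper> element

# Relative path from <paper>, equivalent to /paper/chapter[2]/section[1].
section = paper.find("chapter[2]/section[1]")

# .text holds the text node that precedes the first child element.
where = section.text.strip()

# Chapter titles only: title children of chapter children.
chapter_titles = [t.text for t in paper.findall("chapter/title")]

# Every title in the document, at any depth below <paper>.
all_titles = [t.text for t in paper.findall(".//title")]
print(where, chapter_titles, len(all_titles))
```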
[Figure 8-12. Example: Relative Addressing. Tree of the sample paper document (paper, chapter, section, @status, title, appendix) with four numbered XPath expressions.]
Notes:
1. produces:
<chapter> Chapter 2
  <title>Ch. 2 Title</title>
  <section> Section 2.1 <title>Sect.2.1 Title</title> </section>
  <section status="Sect.2.2 Status"> Section 2.2 <title>Sect.2.2 Title</title> </section>
</chapter>
2. produces:
<section> Section 2.1 <title>Sect.2.1 Title</title> </section>
Copyright IBM Corp. 2001, 2004 Unit 8. XPath - XML Path Language 8-13
Notes:
There are 13 axes defined in XPath that enable searching of different parts of the XML document from the current context node or the root. The commonly used axes, such as attribute, child, and descendant-or-self, have a shorthand syntax. If the shorthand syntax is used, the "::" separator that follows the axis name is omitted. child:: is the default axis if no axis is specified; all axes can be used in a relative or absolute path. Despite the singular form of axis names like ancestor or preceding-sibling, only parent and self always refer to a single node.
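[Table: XPath axes and their abbreviations. Surviving entries: ".." (parent) and "//" (descendant-or-self)]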
[Figure 8-15. XPath - Partitioning the Document. Tree of the sample paper document with its nodes labeled Ancestor, Preceding, Self, Following, and Descendant.]
Notes:
Self = Context node. For the node labeled Self, which is the current context node, the labels on the other nodes indicate their axis relationship to Self. The four axes shown (ancestor, descendant, preceding, and following), together with self, contain all the nodes in the document apart from attribute and namespace nodes, and they do not overlap.
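The partitioning claim can be checked on a small tree. The sketch below uses Python's standard ElementTree module and considers element nodes only; the document is a hypothetical, simplified paper, and the axes are computed by hand from document order because ElementTree does not implement these axes itself.

```python
import xml.etree.ElementTree as ET

DOC = ET.fromstring(
    "<paper><title/><chapter><title/><section><title/></section>"
    "<section><title/></section></chapter><appendix><title/></appendix></paper>"
)

order = list(DOC.iter())                    # element nodes in document order
parent = {c: p for p in order for c in p}   # child -> parent lookup


def partition(node):
    """Split the other element nodes into four non-overlapping axis sets."""
    ancestors = []
    current = node
    while current in parent:
        current = parent[current]
        ancestors.append(current)
    descendants = list(node.iter())[1:]     # iter() yields the node itself first
    at = order.index(node)
    preceding = [n for n in order[:at] if n not in ancestors]
    following = [n for n in order[at + 1:] if n not in descendants]
    return ancestors, descendants, preceding, following


# Context node (Self): the first <section> of the <chapter>.
self_node = DOC.find("chapter/section")
anc, desc, prec, foll = partition(self_node)
print([n.tag for n in anc], [n.tag for n in desc])
print([n.tag for n in prec], [n.tag for n in foll])
```

Adding up the four sets plus Self gives the total number of element nodes, which is the partition property described above.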
[Figure 8-16. Example: Addressing with Axes. Tree of the sample paper document with two numbered XPath expressions.]
Notes:
The first table lists the types of axes and the corresponding type of node returned. This list only indicates the principal node type. For example, an axis of child::* will return nodes of type element. The returned elements may in turn have nodes of type attribute attached to them.
The second table lists the node tests and the resulting node (or node list). A node test follows the axis in the address step and determines whether a node is included in or excluded from the search. The most common form of node test is the QName, or actual element name. The wildcard ("*") node test selects all nodes of the given type. For example, attribute::* selects all attributes.
Notes:
These samples are shown without a representative tree. The meaning of the samples can usually be conveyed from the XPath expression itself.
XPath - Predicates (1 of 2)
- All comparisons or function calls are within the predicate, enclosed within [ ].
- Predicates test a set of nodes and return one of:
  - A new set of nodes
  - A string
  - A boolean
  - A number
- Each node in the list of nodes is tested to see if the predicate is true.
- If the predicate is true, the node is included in the resulting list of nodes.
- If a predicate results in no matching nodes, an empty result set is returned.
Notes:
Predicates filter a list of nodes. Predicate expressions can be function calls, numbers, literals or location paths.
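A few predicate forms can be tried with Python's standard ElementTree module, which supports a limited set of them (function calls such as contains() are not available there). The fragment below is hypothetical and modeled on the sample paper document; the helper name titles is also an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the sample paper document.
CHAPTER = ET.fromstring("""
<chapter>
  <title>Ch. 2 Title</title>
  <section><title>Sect.2.1 Title</title></section>
  <section status="Sect.2.2 Status"><title>Sect.2.2 Title</title></section>
</chapter>
""")


def titles(sections):
    """Return the title text of each section in the list."""
    return [s.find("title").text for s in sections]


# Predicate with a location path: keep sections that have a status attribute.
with_status = titles(CHAPTER.findall("section[@status]"))

# Predicate with a number: keep the section at position 1.
first = titles(CHAPTER.findall("section[1]"))

# Predicate comparing a child's text with a literal.
by_title = titles(CHAPTER.findall("section[title='Sect.2.1 Title']"))

# No node passes this test, so the result is empty.
none = CHAPTER.findall("section[@draft]")
print(with_status, first, by_title, none)
```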
XPath - Predicates (2 of 2)
Predicate expression types
- Function call
- Number
- Literal
- Location path
Extra logic operators can be used inside a predicate.
- Allows boolean and and or chaining of tests.
Math operators allowed in predicates for numbers.
- Equality = operator
- Less than < and greater than > operators
- Modulus test using the mod operator
Notes:
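A predicate expression can contain logical operators:
- and - Both conditions must be true for the test to be true.
- or - Either condition may be true for the test to be true.
Note that the pipe character ("|") is not a logical operator. It forms the union of two node-sets, which can give a similar effect when the alternatives are location paths.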
[Table: XPath core functions and their return types. Surviving entry: id(object)]
Notes:
The table lists the XPath predicate functions that are part of the core function library. The return type is shown in the second column. A few functions have optional arguments. If omitted, the current context node is treated as the argument. /child::chapter[position()=1] returns the first chapter element that is under the document root. /chapter[1] is the abbreviated form of the expression above.
[Table: XPath string functions and their return types. Surviving entries: substring-before(string, string), substring(string, number), substring(string, number, number); return types shown: boolean, string]
Notes:
string(object) - Only the first node of the argument node-set is converted to a string. Numbers (integer or floating point) are converted to their string representation. Booleans are converted to the strings "true" and "false". All other node types are converted depending on the type of node. For example, the string value of an element is all the characters of the element and its descendants concatenated together.
substring-after(string, string) - For example, substring-after("XML Development","lop") will return the string "ment". If there is more than one occurrence of the substring, all characters after the first occurrence will be included in the returned string.
The substring-before function works in a similar manner, except that it returns the substring before the tested string.
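The behavior of the two substring functions can be mimicked in a few lines of Python. This is an emulation for illustration, not a call into an XPath engine, and the Python function names are hypothetical.

```python
def substring_after(text, marker):
    """Mimic XPath substring-after(): text following the first occurrence."""
    index = text.find(marker)
    return "" if index < 0 else text[index + len(marker):]


def substring_before(text, marker):
    """Mimic XPath substring-before(): text preceding the first occurrence."""
    index = text.find(marker)
    return "" if index < 0 else text[:index]


print(substring_after("XML Development", "lop"))
print(substring_before("XML Development", "lop"))
print(substring_after("a-b-c", "-"))  # only the first occurrence counts
```

When the marker does not occur at all, both functions return the empty string.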
Notes:
Almost any object type can be passed into string functions. The processor will attempt to convert non-string objects to their string representation.
Function - Description
not(boolean) - Returns true if argument is false and false otherwise
true() - Returns true
false() - Returns false
Notes:
Boolean function return values:
- A number is true if it is non-zero and not Not-a-Number (NaN).
- A node-set is true if it is non-empty.
- A string is true if its length is non-zero.
NaN (Not a Number) is a special value of the number type that does not represent any number. The number type also includes positive and negative infinity, and positive and negative zero.
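These conversion rules can be written down as a small Python function. It is an emulation for illustration only: a node-set is modeled as a plain list, and the function names are hypothetical.

```python
import math


def xpath_boolean(value):
    """Mimic XPath boolean() for booleans, numbers, strings, and node-sets."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # True only if non-zero and not NaN; both zeros count as false.
        return value != 0 and not math.isnan(value)
    # Strings and node-sets (lists here) are true when non-empty.
    return len(value) > 0


def xpath_not(value):
    """Mimic XPath not(): true if the argument converts to false."""
    return not xpath_boolean(value)


print(xpath_boolean(float("nan")), xpath_boolean(-0.0), xpath_boolean(42))
print(xpath_boolean(""), xpath_boolean("false"), xpath_boolean([]))
```

Note that the string "false" converts to true because it is a non-empty string.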
Reference Information
Reference: http://www.w3.org/TR/xpath
Description: W3C XPath specification
Reference: http://www.xml.com/pub/a/2000/12/20/xpathaxes.html
Reference: http://www.xml.com/pub/a/2001/01/03/xpathaxes.html
Reference: http://www.xml.com/pub/a/2000/10/04/transforming/trxml5.html
Description: Good articles on XPath Axis and node from O'Reilly's www.xml.com site.
Reference: http://www.zvon.org/xxl/XPathTutorial/General/examples.html
Reference: http://www.zvon.org:9001/saxon/cgi-bin/XLab/XML/xlabIndex.html?stylesheetFile=XSLT/xlabIndex.xslt
Reference: http://www.cranesoftwrights.com/training/#ptux
Description: XSLT and XPath training materials, Ken Holman, Crane Softwrights Ltd.
Reference: XSLT Programmer's Reference 2nd Edition, Michael Kay, WROX Press
Description: XSLT books cover XPath. Very good XSLT reference book
Checkpoint Questions (1 of 3)
1. Which of the following are part of the XPath step syntax?
   A. Predicate
   B. AxisName
   C. Ancestor
   D. Ceiling
   E. NodeTest
2. The axis shorthand notation of // indicates what?
   A. Ancestor
   B. Parent
   C. Ancestor-or-self
   D. Descendant-or-self
Checkpoint Questions (2 of 3)
3. Which XPath statement will return the number of questions on a test?
   A. count(/test/question)
   B. /test/question/count()
   C. /test[count(question)]
   D. None of the above
4. The predicate function starts-with("XML is Great", "XML") will return:
   A. XML
   B. True
   C. Is Great
   D. False
   E. XML is Great
Checkpoint Questions (3 of 3)
5. The following XPath statement will result in --
Unit Summary
Having completed this unit, you have learned to:
- Describe the reasons for using XPath
- Define the components and constructs that make up the XML Path Language
- Write simple XPath expressions
- Identify abbreviated XPath expressions
- Describe how to partition the XPath document
- Define the current status of XPath in industry
Unit Objectives
After completing this unit, you should be able to:
- Describe the XSL model and its concepts
- Describe and apply XSL Transformations
- Use and apply XSL templates in XSLT
- Create simple XSL stylesheets
- Describe some best practices for applying XSLT
- Describe XSLT tools
"A typical enterprise will devote 35-40% of its programming budget to develop and maintain 'extract and update' programs whose purpose is solely to transfer information between different database's of legacy systems." --Gartner Group
Notes:
One of XSLT's best applications is to translate information from one XML vocabulary to another. As such it is a powerful tool for performing the 'extract and update' operations referred to in this quote.
[Figure: labels recovered from the foil: DB, Printing, Publishing]
Notes:
Without XSLT, in order to satisfy different clients, the Web server needs separate servlets / JSPs to format data for each kind of client. With XSLT:
- There is no need to develop and manage separate applications (for example, JSPs/servlets) for each target device. One JSP/servlet is enough, with one XSL stylesheet for each client.
- There is normally no need to rewrite or update the JSP/servlet each time the presentation is changed, which makes it easier to update the style of your web site.
Different companies do not always use the same DTDs/Schemas for their XML documents, so when exchanging XML documents there is a need to transform from one tree to another.
(Optional)
- Transformation language
- A language for addressing parts of an XML document
- An XML vocabulary for specifying formatting semantics
Notes:
Extensible Stylesheet Language (XSL) comprises two parts: XSL Transformations (XSLT), whose specification is a Recommendation of the W3C as of Nov. 16, 1999, and Format Objects (XSL-FO), which is part of the actual XSL specification. The XSL specification is in Candidate Recommendation, the latest, as of this writing, being Aug. 2001. For more information on XSL (and format objects), see http://www.w3.org/TR/xsl. For more information on XSLT, see http://www.w3.org/TR/xslt.
XSL formatting objects and properties allow a wide range of print, display, or aural presentations. It is not the aim of this unit to cover FO in depth. First, the Format Object specification is still under development; and secondly, since the data is processed by XSLT, some formatting can be done in this stage (in the case of HTML, for instance), or by the XML application itself.
XSL: Extensible Stylesheet Language
Consists of two modules:
Copyright IBM Corp. 2001, 2004 Unit 9. eXtensible Stylesheet Language: Transformations (XSLT) 9-5
- XSL Transformation
- XSL Format Objects
Is compatible with Namespaces and XPath.
XSL Transformations (XSLT)
Operates on an abstract model that views an XML document as a tree. It is not required that a tree be created.
Provides a means to access the document tree in order to:
- Access nodes by name or content
- Search for specific content or nodes
- Manipulate content or nodes
Serves as a transformation filter before formatting is applied.
XSL Format Objects (XSL-FO)
Formats the objects received from XSLT in a result tree.
Notes:
XML syntax was chosen for many reasons. Among the most important were:
- Reuse of the XML parser minimizes footprint.
- Familiarity and ease of understanding.
- Reuse of the lexical apparatus of XML for handling white space, Unicode, namespaces, and so forth.
- Providing visual development tools.
A function is said to have side effects if it makes changes to its environment; an example is updating a global variable. The functions in XSLT have no side effects and can be processed in any order. In reality, how you code your stylesheet will impact what parts can be processed independently. Processing parts of the stylesheet in any order does not impact the order of the output. The order of the output depends on the order in the XML file, which is what we want. We will talk about programming without variables later.
Declarative languages are different from procedural languages. In procedural languages you specify the order for chunks of processing, in declarative languages you specify what processing you want done and the processor determines the order. Other examples of declarative languages are SQL and LISP.
Notes:
Closure is also a characteristic of SQL. There are many similarities between XSLT and SQL.
XSLT Features
- Multiple input sources.
- Ability to select document fragments using XPath expressions.
- Named and/or pattern-based templates.
- Parameterized templates.
- Intermediate transformation state may be managed using variables.
- Stylesheets may be combined using include or import.
- Built-in support for output sorting and numbering.
- Both XML and non-XML output is supported.
- XSL processor extensions supported without side effects on core function.
Notes:
Main features of XSLT
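[Figure: the XSL stylesheet drives a transformation that produces the result tree, followed by an optional formatting step. Recoverable labels: XSL Stylesheet, transformation, Transform, Result Tree, formatting (optional)]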
Notes:
An XSL Transformation takes as input an abstract tree model of the source XML document, known as the source tree, and processes it to produce a result tree. The XSL stylesheet defines the rules for transformation, based on the XML elements and attributes in the source tree. The stylesheet also may contain formatting information called format objects (or FOs) and applies those objects against the transformation. A single stylesheet can apply to multiple XML documents, provided the elements and structure are consistent with those specified by the stylesheet.
Note that XSL does not require result trees to use XSL-FO and thus can be used for general XML transformations. For example, XSL can be used to transform XML into well-formed HTML, that is, XML that uses the element types and attributes defined by HTML.
Note also that the source XML document can invoke multiple XSL stylesheets. For example, the XML source could be processed by XSL to render HTML, an altered form of XML, voice markup, and output rendered for print, all from the same source document invoking multiple stylesheets. These could occur as separate parallel processes (each invocation running an XSL processor in a separate memory space) or sequentially (each invocation running after the previous one completes). The advantage of parallel processing is that an XSLT error in one stylesheet will not prevent the others from running, whereas in sequential processing, any downstream process will be terminated as well. The disadvantage of parallel processing is one of system memory usage.
[Figure: flowchart from the source tree through a "match pattern" test (yes/no) to the result tree]
Notes:
XSLT uses the ideas of pattern matching and templates. A stylesheet includes templates, each of which carries a pattern that associates it with one or more elements or attributes in the XML document. The templates contain the rules for both the transformation and, optionally, the formatting that is applied to the matching nodes. A template can also contain further pattern matching and instructions to apply further templates.
The Process
The XSL processor scans the source tree (the tree model of the source XML document). If a matching node (element or attribute) is found, the processor locates the appropriate template and then applies the rules contained within. This results in the creation of result tree nodes. If a template rule indicates that more templates should be applied, then the process begins again by finding new matching nodes. The whole procedure ends when there are no nodes left to process.
Copyright IBM Corp. 2001, 2004 Unit 9. eXtensible Stylesheet Language: Transformations (XSLT) 9-13
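Matching
Patterns match against nodes. A pattern such as books/book/title matches title elements whose parent is book and whose grandparent is books. When the template is instantiated, the matched node (title) becomes the current node. By contrast, evaluating a path such as books/book/title in a select expression (for example, in xsl:value-of) does not change the current node, even though you have pathed to a node that exists further down; the path is simply evaluated relative to it.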
Anatomy of a Stylesheet
XML declaration: identifies the stylesheet as an XML document.
xsl:stylesheet: must enclose the entire stylesheet; must include namespace and version.
Top-level elements: document-level elements, for example, import, include, output, strip-space, key, param, variable, ...
Templates: one or more.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- top-level elements -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
    ...
  </xsl:template>
</xsl:stylesheet>
Notes:
This visual gives an overview of the main elements that make up an XSL stylesheet and the order in which they appear.
Notes:
<xsl:comment>commentTextHere</xsl:comment>
Inserts a comment into the result tree. Example:
<xsl:template match="books/book//ordno[@instock='no']">
  <xsl:comment>Reorder now</xsl:comment>
</xsl:template>
The following comment is inserted in the result tree:
<!--Reorder now-->
Note that this is not the same as testing for the presence of a comment within the source nodes. That is done with the comment() node test in a pattern or predicate.
<xsl:text>Inserted Text</xsl:text>
Inserts text into the result tree verbatim.
<xsl:stylesheet> Element
xsl:stylesheet is the root element of an XML stylesheet. Requires the XSL namespace. Current recommendation:
<?xml version="1.0" encoding="UTF-8"?>
Notes:
Note that the XML declaration occurs first, as the XSL stylesheet is itself an XML document. The namespace portion of the specification has changed recently, and may yet change again. It is best to verify the namespace at http://www.w3.org/tr/xsl or at http://www.w3.org/tr/xslt. Backward compatibility with earlier Working Drafts, as far as can be determined, is being maintained by the W3C. Microsoft Internet Explorer 5 and 5.5 use the http://www.w3.org/TR/WD-xsl namespace; the Xalan XSL processor uses http://www.w3.org/1999/XSL/Transform. Netscape 6 does not support XSL; it supports Cascading Style Sheets (CSS). The xsl:transform element is a synonym for xsl:stylesheet, so the root element can also be written <xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">. Technically the two forms are equivalent, but not all XSL processors (especially older ones) recognize xsl:transform.
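As a minimal sketch of the synonym, the stylesheet below uses xsl:transform as its root element instead of xsl:stylesheet. The single template is only a placeholder for illustration; a processor that implements XSLT 1.0 should treat the two root elements identically.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- xsl:transform is interchangeable with xsl:stylesheet; the namespace and
     version attributes are required in both forms. -->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Placeholder template: start at the root and process its children. -->
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:transform>
```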
Notes:
For detailed usage of these and other elements in XSL documents, see http://www.w3.org/TR/xslt. Some of the other elements are discussed later in this unit, during the detailed discussion of transformations. The order in which child elements appear is not important, with two exceptions: the import element must appear first if it is used, and the others precede the first xsl:template element.
Import versus Include
Import. The import element must appear before any other child elements in the XSL document. If a rules conflict results, then imported rules are of lesser importance than those in the document doing the importing.
Include. The include statement is replaced by the actual contents of the included file, with any import statements in that file moved up ahead of it. Included rules are of equal importance to those in the calling document.
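The precedence rule for imports can be sketched as follows. The file name common.xsl is hypothetical, and it is assumed (for illustration only) to contain its own match="title" rule.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- xsl:import must come before every other top-level element.
       common.xsl is a hypothetical file assumed to define match="title" too. -->
  <xsl:import href="common.xsl"/>

  <!-- This rule belongs to the importing stylesheet, so it has higher import
       precedence than the imported match="title" rule and is the one used. -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>
</xsl:stylesheet>
```

Had common.xsl been brought in with xsl:include instead, both title rules would carry equal importance. XSLT 1.0 treats that as an error, which a processor may either report or recover from by choosing the rule that occurs last in the stylesheet.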
<xsl:template> Element
A stylesheet has one or more templates. When <xsl:template> is missing, and literal result elements are found, a match="/" template is assumed (match the root).
<xsl:template match="match expression"> <!-- literal result text, XSLT elements --> </xsl:template>
Specifies: A match expression that defines when this rule will be used (the test against the nodes in the XML tree) - this is an XPath expression. Literal result text is written to the output tree, XSLT elements are executed.
Notes:
First, an explanation of the xsl:template element. It is, as its name implies, a template: a container for a set of rules that apply actions against the source tree to yield a result tree. It takes the form:
<xsl:template match="nodeToMatch" [name="templateName"]>
  <!-- template actions insert here -->
</xsl:template>
If the name attribute is used, the match attribute can be omitted (see discussion below). The name attribute is optional, but if it is not used, the match attribute becomes required.
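The three legal combinations of match and name can be sketched as below. The template names footer and showAuthor are hypothetical; the element names come from the Books.xml example used in this unit.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- match only: selected by pattern matching during xsl:apply-templates. -->
  <xsl:template match="title">
    <h1><xsl:value-of select="."/></h1>
  </xsl:template>

  <!-- name only: never matched; runs only when invoked with
       xsl:call-template name="footer". -->
  <xsl:template name="footer">
    <p>End of list</p>
  </xsl:template>

  <!-- both: can be matched against author nodes or called as "showAuthor". -->
  <xsl:template match="author" name="showAuthor">
    <p><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>
```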
<xsl:apply-templates> Element
Means select all of the children of the current node in the XML source tree. For each one, find the matching template rule in the stylesheet and process that rule. Rules that can be matched are: None - you are not required to have a template rule for each child node. The template match rules you define. Default rules built in to the XSL processor. Applies rules recursively.
Notes:
In addition to xsl:apply-templates invoking further template processing, templates can be invoked by name, using the following:
<xsl:call-template name="templateName">
  [<xsl:with-param ...>]
</xsl:call-template>
When no parameters are passed, this can be shortened to <xsl:call-template name="templateName"/>.
The difference between xsl:apply-templates and xsl:call-template is that with xsl:call-template the current node and current node list remain the same, and the named template's rules act against them. xsl:apply-templates invokes other templates by matching, which may or may not change the current node and node list.
Slide Example
In the slide example, xsl:apply-templates runs after the initial match and the insertion of the value of the title element; the processor then goes on to apply the other existing templates.
Using an OR statement.
<xsl:template match="book/author | book/pub">
Each alternative of an OR match is treated as a separate rule with its own default priority.
Testing (using predicates) for a condition.
<xsl:template match="book[@bktype='paperback']">
Any XPath predicate test may be used.
Any test can use the not() function.
<xsl:template match="book[not(@bktype='paperback')]">
Notes:
You can skip nodes in the path by using the double slash. Using the example /list/book/title, if you wanted to find any title element below list, you could use the path list//title. This would find any title, no matter how many levels below list it was. In these examples, we use the <xsl:template match="nodeToSearch"> element.
Template Matching Rules
Template rules that have greater importance are chosen over those with lesser importance. This applies where a stylesheet has been imported into another; in this case, the imported templates have lesser importance than the ones native to the importing stylesheet. Templates can also be assigned a priority using the priority attribute.
Other Matches
comment() - To match comment children of the current node.
processing-instruction() - To match processing instructions of the current node.
id() - To match the element whose attribute of type ID has the given value.
text() - To match any text node.
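A few of these patterns can be sketched together, assuming the list/book/title structure of Books.xml. Note that XSLT 1.0 spells the processing-instruction node test as processing-instruction(). The choice of h2 for the output is arbitrary.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Double slash: any title below list, at any depth. -->
  <xsl:template match="list//title">
    <h2><xsl:apply-templates/></h2>
  </xsl:template>

  <!-- Text nodes: write out their string value. -->
  <xsl:template match="text()">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions in the source: produce nothing. -->
  <xsl:template match="comment() | processing-instruction()"/>
</xsl:stylesheet>
```

The last two templates behave the same way as the processor's built-in rules for those node types; they are written out here only to show the node tests in use.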
This assumes that there are no elements nested in the content of <author> and <price>.
Notes:
<xsl:value-of> Element
<xsl:value-of select="patternToMatch"/>
Used to extract a specific value from the source tree. Inserts into result tree the verbatim contents of a string, an element or attribute from patternToMatch. Example:
Result
<td>Large Stories</td>
Notes:
In this example, the contents of the title child of the book element are extracted into a td element in the result tree. The xsl:value-of element produces an output text node; the td element markup is supplied explicitly. <xsl:value-of select="."/> is a common alternative to <xsl:apply-templates/> when there are no child elements.
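A template along the following lines would produce the result described. It is a sketch assuming the Books.xml structure (book elements with a title child), not necessarily the stylesheet shown on the original slide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="book">
    <!-- td is a literal result element; xsl:value-of supplies its text
         from the title child of the current book. -->
    <td><xsl:value-of select="title"/></td>
  </xsl:template>
</xsl:stylesheet>
```

For the book whose title is Large Stories, this template writes <td>Large Stories</td> to the result tree.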
Control Elements
<xsl:apply-templates/>
<xsl:call-template name="templateName"/> Calls a template named "templateName".
<xsl:if> Allows a conditional test or tests.
<xsl:choose> Allows a choice of one or more tests and permits a default condition:
  <xsl:when> the tested condition
  <xsl:otherwise> the default condition
<xsl:for-each> Used to iterate over the result of an XPath expression.
Notes:
These elements are used to control the flow of the processing.
[Figure: tree view of Books.xml, showing element nodes such as <title> and <price> with the text values "John Smith", "New Cars", and "$8.00"]
Notes:
Complete listing of the files used in the example that transforms Books.xml to HTML using Books.xsl as the stylesheet. HTML written in the stylesheet must be XHTML compliant (well-formed XML), because the stylesheet is itself an XML document and the result must be a valid XML tree structure. If the stylesheet contains malformed HTML (for example, <br> with no closing tag), the XSLT processor will throw an error.
This is a tree structure with <body> as the child of <html>; <h1> and <table> are the children of <body>, and so forth.
Notes:
XML to HTML (1 of 5)
Books.xml
<?xml version="1.0" encoding="UTF-8"?>
<list>
  <book ID="888">
    <author>John Smith</author>
    <title>New Cars</title>
    <price>$8.00</price>
  </book>
  ...

The processor looks for <xsl:template match="/">, which matches the root of the document (the parent of our root element, list). Found! It copies the non-XSLT elements in that template to the output tree, so we get the first part of our HTML.

Books.xsl
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <head><title>Book List</title></head>
      <body>
        <table border="1" cols="3" width="100%">
          <tbody>
            <xsl:apply-templates />
          </tbody>
        </table>
      </body>
    </html>
  ...
(remaining templates omitted for clarity)

HTML Output
<html>
  <head><title>Book List</title></head>
  <body>
    <table border="1" cols="3" width="100%">
      <tbody>
Notes:
The first pattern match is at the root (in our case, the parent of <list>). In this case it would not matter whether the pattern is match="/" or match="list": either way, the plain HTML code is transferred to the output tree.
XML to HTML (2 of 5)
Books.xml
...
<book ID="888">
  <author>John Smith</author>
  <title>New Cars</title>
  <price>$8.00</price>
</book>
...

While processing match="/", we come to <xsl:apply-templates/>. The processor looks for templates for the children of list (that is, book), finds <xsl:template match="book">, and processes that template. <xsl:value-of select="@ID"/> writes the value of the attribute ID to the output tree.

Books.xsl
<xsl:template match="/">
  ...
  <xsl:apply-templates />
  ...
</xsl:template>
<xsl:template match="book">
  <tr>
    <td><xsl:value-of select="@ID" /></td>
    <xsl:apply-templates select="author|price"/>
  </tr>
</xsl:template>

HTML Output
<html>
  <head><title>Book List</title></head>
  <body>
    <table border="1" cols="3" width="100%">
      <tbody>
        <tr>
          <td>888</td>
Notes:
<xsl:apply-templates> tells the processor to find and process the template that matches each child of the current node; in this case, that is the template with match="book".
XML to HTML (3 of 5)
Books.xml
...
<book ID="888">
  <author>John Smith</author>
  <title>New Cars</title>
  <price>$8.00</price>
</book>
...

While processing match="book", we come to <xsl:apply-templates select="author|price"/>. The processor looks for templates for the author and price children of book, finds <xsl:template match="author|price">, and processes that template. <xsl:value-of select="."/> writes the value of the element node to the output tree.

Books.xsl
<xsl:template match="book">
  <tr>
    <td><xsl:value-of select="@ID" /></td>
    <xsl:apply-templates select="author|price"/>
  </tr>
</xsl:template>
<xsl:template match="author|price">
  <td><xsl:value-of select="." /></td>
</xsl:template>

HTML Output
<html>
  <head><title>Book List</title></head>
  <body>
    <table border="1" cols="3" width="100%">
      <tbody>
        <tr>
          <td>888</td>
          <td>John Smith</td>
          <td>$8.00</td>
        </tr>
Notes:
<xsl:template match="author | price">
The | is "or" from XPath. The processor calls this template once for each author node that is a child of book, and once for each price node that is a child of book. Later we will look at other options for generating the same result (in XSLT we have many processing options).
XML to HTML (4 of 5)
Books.xml
...
  <book ID="999">
    <author>Dan Big</author>
    <title>Large Stories</title>
    <price>$7.00</price>
  </book>
</list>

The processor now looks for and finds another book node to process. Output for that book node is added to the output tree.

Books.xsl
<xsl:template match="book">
  <tr>
    <td><xsl:value-of select="@ID" /></td>
    <xsl:apply-templates select="author|price"/>
  </tr>
</xsl:template>
<xsl:template match="author|price">
  <td><xsl:value-of select="." /></td>
</xsl:template>
(Other templates omitted for clarity)

HTML Output
<html>
..
<tr>
  <td>888</td>
  <td>John Smith</td>
  <td>$8.00</td>
</tr>
<tr>
  <td>999</td>
  <td>Dan Big</td>
  <td>$7.00</td>
</tr>
Notes:
After processing the first <book> node, processing of the next book element takes place.
XML to HTML (5 of 5)
Books.xml
<?xml version="1.0" encoding="UTF-8"?>
<list>
  <book ID="888">
    <author>John Smith</author>
    <title>New Cars</title>
    <price>$8.00</price>
  </book>
  ...

The processor now finishes the processing of the root node and adds the end tags for tbody, table, body, and html.

Books.xsl
<xsl:template match="/">
  <html>
    <head><title>Book List</title></head>
    <body>
      <table border="1" cols="3" width="100%">
        <tbody>
          <xsl:apply-templates />
        </tbody>
      </table>
    </body>
  </html>
</xsl:template>
(Other templates omitted for clarity)

HTML Output
<html>
..
<tr>
  <td>888</td>
  <td>John Smith</td>
  <td>$8.00</td>
</tr>
<tr>
  <td>999</td>
  <td>Dan Big</td>
  <td>$7.00</td>
</tr>
</tbody>
</table>
</body>
</html>

Figure 9-26. XML to HTML (5 of 5)
Notes:
After the last <book> node has been processed, control returns to the root template, which writes the remaining end tags to the output tree.
Calling <xsl:apply-templates/>
For greater control over which nodes are processed, explicitly call the templates for specific nodes. You are not looking for a specific template, but for the template for the specific node. In our prior example, we decided that only the price and author of the book should be output, not the title.
<xsl:template match="book">
  <tr>
    <td><xsl:value-of select="@ID" /></td>
    <xsl:apply-templates select="author|price" />
  </tr>
</xsl:template>
<xsl:template match="author|price">
  <td><xsl:value-of select="." /></td>
</xsl:template>
Notes:
Use XPath statements to 'select' the nodes whose templates you are looking for.
Named Templates
Templates can be named and invoked by name. All templates must have a name or a match attribute. When calling a template, the current node remains unchanged.
Defining a named template (the name must be a valid XML name):
<xsl:template name="bookTitle">
  <h1><xsl:value-of select="."/></h1>
</xsl:template>
Notes:
The difference between xsl:apply-templates and xsl:call-template is that with xsl:call-template the current node and current node list remain the same, and the named template's rules act against them. xsl:apply-templates invokes other templates by matching, which may or may not change the current node and node list.
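Calling a named template, optionally with a parameter, might look like the sketch below. The template name bookTitle comes from the slide; the prefix parameter and its value are hypothetical additions for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="book">
    <!-- The current node is still this book element inside the called template. -->
    <xsl:call-template name="bookTitle">
      <xsl:with-param name="prefix" select="'Title: '"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Named template; prefix is a hypothetical parameter that defaults to
       the empty string when the caller does not pass it. -->
  <xsl:template name="bookTitle">
    <xsl:param name="prefix" select="''"/>
    <!-- "title" is evaluated relative to the caller's current node (book). -->
    <h1><xsl:value-of select="$prefix"/><xsl:value-of select="title"/></h1>
  </xsl:template>
</xsl:stylesheet>
```

Because the current node does not change, the path title inside bookTitle resolves against the book that was current at the point of the call.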
<xsl:for-each> Element
<xsl:for-each select="nodeSetExpression">
Used to iterate over the result of the select expression. Selected node becomes the current node.
Books.xml
<list>
  <book ID="666">
    <author>Jim Blue</author>
    <title>Blue Flowers</title>
  </book>
  <book ID="888">
    <author>John Smith</author>
    <title>New Cars</title>
  </book>
  <book ID="999">
    <author>Dan Big</author>
    <title>Large Stories</title>
  </book>
</list>

Books.xsl
[Figure: stylesheet listing]

Books.html
<p>Blue Flowers</p>
<p>New Cars</p>
<p>Large Stories</p>
Notes:
Using for-each instead of apply-templates is often called a pull model of processing, because you are explicitly choosing when to process the nodes. It is best used when the data is regular and predictable. The for-each mechanism can be nested.
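One possible stylesheet for the pull model is sketched below. It assumes the list/book/title structure of the slide's Books.xml and is not necessarily the Books.xsl used in the original example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <!-- Pull each book in document order; inside the loop, the selected
         book is the current node, so "title" is relative to it. -->
    <xsl:for-each select="list/book">
      <p><xsl:value-of select="title"/></p>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each iteration writes one p element containing the title of the current book.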
<xsl:if> Element
<xsl:if test="patternToMatch">
Used to conditionally process the matched expression. Can also be used with the logical not statement.
Books.xml
<list>
  <book ID="666">
    <author>Jim Blue</author>
    <author>Mike Yellow</author>
    <author>Dan Farm</author>
    <title>Blue Flowers</title>
  </book>
</list>

Books.xsl
<xsl:template match="list/book">
  <xsl:value-of select="title"/> by
  <xsl:for-each select="author">
    <xsl:value-of select="." />
    <xsl:if test="position()!=last()">, </xsl:if>
    <xsl:if test="position()=last()-1"> and </xsl:if>
  </xsl:for-each>
</xsl:template>
Notes:
The xsl:if conditional can be used to test for a certain situation within a template. It can be used in conjunction with other actions, and more than one xsl:if action can appear within a template. In the slide example, the first test writes a comma after every author except the last, and the second writes "and" after the next-to-last author. A condition test can also contain a logical or.
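A test that combines two conditions with or could be written as in this sketch. The currency attribute on price and its values GBP and JPY are hypothetical; they are not part of the Books.xml shown in this unit.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="price">
    <xsl:value-of select="."/>
    <!-- Hypothetical attribute: the note is written only when the price is
         marked as British Pounds or as Yen. -->
    <xsl:if test="@currency='GBP' or @currency='JPY'"> (foreign currency)</xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

There is no else branch in xsl:if; when a default case is needed, xsl:choose with xsl:otherwise is the usual alternative.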
<xsl:choose> Element
<xsl:choose>
  <xsl:when test="testCondition">
    <!-- ... other actions ... -->
  </xsl:when>
  <xsl:otherwise>
    <!-- ... alternative actions ... -->
  </xsl:otherwise>
</xsl:choose>
Multiple when tests can be implemented. The otherwise element is optional and must be the last child element of <xsl:choose> when present (used as a default if the other tests fail).
Notes:
<xsl:choose> Example
Books.xml
<list>
  <book ID="666">
    <chapter>First Chapter</chapter>
    <chapter>Second Chapter</chapter>
    <appendix>XSLT reference</appendix>
  </book>
</list>

Books.xsl
<xsl:stylesheet xmlns:xsl='...
  <xsl:template match="//book">
    <xsl:for-each select="*">
      <p>
        <xsl:choose>
          <xsl:when test='name()="chapter"'>Chapter: </xsl:when>
          <xsl:when test='name()="appendix"'>Appendix: </xsl:when>
          <xsl:otherwise>Index: </xsl:otherwise>
        </xsl:choose>
        <xsl:value-of select="." />
      </p>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

Books.html
<p>Chapter: First Chapter</p>
<p>Chapter: Second Chapter</p>
<p>Appendix: XSLT reference</p>
Notes:
Notes:
A very common use of XSLT is to translate and transform from one XML vocabulary to another XML vocabulary. XSLT provides some built-in elements to help with these types of transformations. Remember: we are always building an output tree. Now we will look at ways to add nodes directly onto that output tree.
<xsl:copy [use-attribute-sets]> ... </xsl:copy>
Copies the current node from the source tree to the result tree. Only the node itself is copied, not its attributes or children; those are produced by the content of xsl:copy.
Example
<xsl:template match="list/book">
  <xsl:copy>
    <xsl:apply-templates select="title"/>
  </xsl:copy>
</xsl:template>
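Because xsl:copy copies only the current node, a widely used idiom pairs it with a recursive apply-templates to copy a whole document while changing selected parts. The sketch below assumes the Books.xml structure and, purely for illustration, drops the price elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Identity rule: copy the current attribute or node, then recurse into
       its attributes and children. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override: an empty template for price produces no output, so price
       elements are left out of the copy. -->
  <xsl:template match="price"/>
</xsl:stylesheet>
```

The price rule wins over the identity rule for price elements because a pattern naming an element has a higher default priority than node().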
<xsl:element> Element
Creates an element in the result tree. Content (inside xsl:element) can be:
<xsl:attribute> (create attribute)
<xsl:element> (create child element)
Text
<xsl:element name="element-name">
  <!-- content: attributes, child elements, text -->
</xsl:element>
Notes:
Attributes are always inside an element. Elements may be inside other elements (child elements).
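One reason to use xsl:element rather than a literal result element is that the element name can be computed at run time. The sketch below assumes the Books.xml structure; the output names item and the book- prefix are hypothetical choices.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="book">
    <xsl:element name="item">
      <xsl:for-each select="*">
        <!-- The name attribute is an attribute value template, so the part in
             braces is evaluated: author becomes book-author, and so on. -->
        <xsl:element name="book-{name()}">
          <xsl:value-of select="."/>
        </xsl:element>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```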
<xsl:attribute>
Creates an attribute in the result tree. Content is the text value for the attribute. All attributes must precede the first child element.
<xsl:attribute name="attribute-name">
  <!-- content: text value -->
</xsl:attribute>
Example: create an attribute named "id".
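Creating an attribute named id might look like the following sketch. It assumes the Books.xml structure (a book element with an ID attribute and a title child); the output element name item is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="book">
    <xsl:element name="item">
      <!-- The attribute is added before any child content of item. -->
      <xsl:attribute name="id"><xsl:value-of select="@ID"/></xsl:attribute>
      <xsl:value-of select="title"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

For the book with ID 888, this writes <item id="888">New Cars</item> to the result tree.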
Notes:
[Figure: output.xml]
Notes:
In this example, we want to transform input.xml into output.xml. What changed:
Restructured: it was a company list by div, dept, and emp; it is now a company list by employee.
Data from most attributes is now in sub-elements.
Element and attribute names have changed.
[Figure: output.xml]

transform.xsl
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <xsl:element name="company">
      <xsl:for-each select="//emp">
        <xsl:element name="employee">
          <xsl:attribute name="id">
            <xsl:value-of select="@no"/>
          </xsl:attribute>
          <name><xsl:value-of select="@name"/></name>
          <xsl:element name="division">
            <xsl:value-of select="../../@no"/>
          </xsl:element>
          ...
Notes:
In this example, we want to transform the input.xml into the output.xml.
Notes:
This number formatting is different from the function format-number() and <xsl:decimal-format>.
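The distinction can be sketched as follows, assuming the Books.xml structure (list/book/price, with prices written with a leading $). xsl:number formats a sequence number, while format-number() formats a numeric value according to a picture string; the picture #,##0.00 is an arbitrary choice for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="list/book">
      <!-- xsl:number: formats the position in the list as a sequence number. -->
      <xsl:number value="position()" format="1. "/>
      <!-- format-number(): formats a value; translate() strips the leading $
           so that number() can parse the rest. -->
      <xsl:value-of select="format-number(number(translate(price, '$', '')), '#,##0.00')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```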
Notes:
The xsl:number element inserts a formatted number into the result tree; it allows number formatting, establishment of boundaries, and other parameters. Numbering formats are covered further in the next slide. Several optional attributes were not shown in the element sample; they are:
letter-value ... disambiguates between numbering sequences that use letters. In many languages there are two commonly used numbering sequences that use letters (in English, a, b, c, and so forth, and i, ii, iii, and so forth). One numbering sequence assigns numeric values to letters in alphabetic sequence, and the other assigns numeric values to each letter in some other manner traditional in that language. In English, these correspond to the numbering sequences specified by the format tokens a and i. In some languages, the first member of each sequence is the same, so the format token alone would be ambiguous. A letter-value of "alphabetic" specifies the alphabetic sequence; a value of "traditional" specifies the other sequence. If the letter-value attribute is not specified, then the XML application must resolve the ambiguity.
grouping-separator ... specifies a character used between groups of digits.
grouping-size ... specifies the number of digits in each group.
lang ... specifies the language whose alphabet is used; it takes the same values as xml:lang (codes that comply with the ISO language standard). English, for example, is en; to specify British English, en-GB is used.
from ... sets a limit to the level of ancestry that is searched; its value specifies an element name from which to start the count.
The level attribute specifies what levels of the source tree to count, climbing the hierarchy to search for patterns to match against. The attribute takes one of these three values:
single ... counts matches of the count attribute among siblings, producing a single number.
multiple ... counts matches of the count attribute at each level of the current node's ancestry, producing one number per level (but will not travel deeper than the current node).
any ... counts every match of the count attribute anywhere in the document up to the current node, regardless of level (but will not travel deeper than the current node).
There are other numbering possibilities available with the <xsl:number/> element; see http://www.w3.org/TR/xslt, section 7.7, for more information.
Format token   Description
1              Use standard numbers (1, 2, 3, 4, etc.)
A              Use standard capital letters (A, B, C, etc.)
a              Use standard lowercase letters (a, b, c, etc.)
i              Use lowercase Roman numerals (i, ii, iii, iv, etc.)
I              Use capital Roman numerals (I, II, III, IV, etc.)
&#x30A2;       Use katakana numbering
&#x30A4;       Use katakana numbering in iroha order
&#x0E51;       Use Thai digits for numbering
&#x05D0;       Use traditional Hebrew; letter-value value is "traditional"
&#x10D0;       Use Georgian; letter-value value is "traditional"
&#x03B1;       Use Classical Greek; letter-value value is "traditional"
&#x0430;       Use Old Slavic; letter-value value is "traditional"
Notes:
The entity values shown are hexadecimal Unicode code points, as standardized in ISO/IEC 10646. Other language-specific schemes may be supported as well; in addition, Unicode allows user-defined (private use) assignments. How these are supported in language-dependent numbering schemes is not clear; in such cases, it would probably be best to make it an XML-application issue and not an XSLT-processor issue.
<xsl:number> Example
Books.xml
<list>
  <book ID="666">
    <chapter><title>Mission Statement</title></chapter>
    <chapter>
      <title>Organization</title>
      <section><title>SCM</title></section>
      <section><title>CRM</title></section>
    </chapter>
    <chapter>
      <title>Departments</title>
      <section><title>Executive</title></section>
      <section><title>Financial</title>
        <clause><title>Accounts Payable</title></clause>
        <clause><title>Accounts Receivable</title></clause>
      </section>
    </chapter>
  </book>
</list>

Books.xsl
<xsl:stylesheet version='1.0' xmlns:xsl='http:...
  <xsl:output method="text" />
  <xsl:template match="/">
    <xsl:for-each select="list/book//title">
      <xsl:number level="multiple" format="1.A.a. " count="chapter | section | clause"/>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

Result
1. Mission Statement
2. Organization
2.A. SCM
2.B. CRM
3. Departments
3.A. Executive
3.B. Financial
3.B.a. Accounts Payable
3.B.b. Accounts Receivable
Notes:
In this example, a single xsl:number instruction numbers the titles of the chapter, section, and clause elements. Chapters are numbered 1, 2, 3; sections A, B, C; and clauses a, b, c.
<xsl:sort> Element
<xsl:sort select="strExp" lang="nmtoken" data-type= {"text"|"number"|qname} order={"ascending"|"descending"} case-order={"upper-first"|"lower-first"} />
A means to sort a collection of nodes in ascending or descending order. Provides alphabetic or numeric sorting. Used within <xsl:apply-templates> or <xsl:for-each>. Sort before numbering!
Notes:
Sorting is specified by adding xsl:sort elements as children of an xsl:apply-templates or xsl:for-each element. The first xsl:sort child specifies the primary sort key, the second xsl:sort child specifies the secondary sort key, and so on. When an xsl:apply-templates or xsl:for-each element has one or more xsl:sort children, then instead of processing the selected nodes in document order, it sorts the nodes according to the specified sort keys and then processes them in sorted order. When used in xsl:for-each, xsl:sort elements must occur first. When a template is instantiated by xsl:apply-templates or xsl:for-each, the current node list consists of the complete list of nodes being processed, in their sorted order.
The select attribute value is an expression: for each node to be processed, the expression is evaluated with that node as the current node and with the complete list of nodes being processed, in unsorted order, as the current node list. The resulting object is converted to a string (as if by a call to the string function); this string is used as the sort key for that node. The default value of the select attribute is "." (self), which causes the string-value of the current node to be used as the sort key.
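Primary and secondary sort keys can be sketched as below, assuming the Books.xml structure (list/book with author and title children). The choice of keys and of descending order for the second key is arbitrary.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <!-- The xsl:sort children come first inside xsl:for-each.
         Primary key: author. Secondary key: title, in descending order,
         used only to order books whose authors compare equal. -->
    <xsl:for-each select="list/book">
      <xsl:sort select="author"/>
      <xsl:sort select="title" order="descending"/>
      <p><xsl:value-of select="author"/>: <xsl:value-of select="title"/></p>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```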
<xsl:sort> Attributes
order (establishes the sorting order)
  Value: ascending | descending
lang (language of the sort keys, per xml:lang)
  Default: taken from the system environment (is of type nmtoken)
data-type (assigns data types to node strings)
  Value: text | number | QName
case-order (sets precedence for upper or lower case)
  Value: upper-first | lower-first
  Default: language dependent (see notes)
Notes:
data-type: The value "text" sorts the keys lexicographically, in the culturally correct manner for the language given by lang; "number" converts the sort keys to numbers and sorts them by numeric value; a qualified name (see Namespaces) may be used to identify another data type. As XML Schema is adopted, other data types will be added (per W3C Note).
case-order: upper-first gives a language precedence of A a B b, and so forth (in English), while lower-first would be a A b B, and so forth. The default value is language dependent.
A W3C note in section 10, Sorting, states "It is possible for two conforming XSLT processors not to sort exactly the same. Some XSLT processors may not support some languages. Furthermore, there may be variations possible in the sorting of any particular language that are not specified by the attributes on xsl:sort, for example, whether Hiragana or Katakana is sorted first in Japanese. Future versions of XSLT may provide additional attributes to provide control over these variations. Implementations may also use implementation-specific namespaced attributes on xsl:sort for this. It is recommended that implementors consult Unicode TR10 for information on internationalized sorting (see http://www.unicode.org/unicode/reports/tr10/index.html for details)."
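The difference between the text and number data types is easiest to see on keys that look numeric. The sketch below is a Python illustration only; the function name sort_keys is hypothetical, and plain code-point comparison is used for "text", which is simpler than the language-aware ordering an XSLT processor is expected to apply.

```python
def sort_keys(values, data_type="text"):
    """Sort string keys roughly the way data-type="text" or "number" would.

    "text" uses plain code-point order here (an assumption of this sketch);
    "number" converts every key to a number before comparing.
    """
    if data_type == "number":
        return sorted(values, key=float)
    return sorted(values)


if __name__ == "__main__":
    keys = ["10", "9", "100"]
    print(sort_keys(keys))            # compared character by character
    print(sort_keys(keys, "number"))  # compared by numeric value
```

As text, "100" sorts before "9" because the comparison stops at the first differing character; as numbers, 9 comes first.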
Sort Example
words.xml
    <List>
      <word id="Czech"/>
      <word id="czech"/>
      <word id="cook"/>
      <word id="Took"/>
      <word id="took"/>
      <word id="TooK"/>
    </List>
sort.xsl
    <xsl:template match="/">
      <table>
        <tbody>
          <xsl:for-each select="//word">
            <xsl:sort select="@id" case-order="lower-first"/>
            <tr><td><xsl:value-of select="@id"/></td></tr>
          </xsl:for-each>
        </tbody>
      </table>
    </xsl:template>
HTML Output
    <table>
      <tbody>
        <tr><td>cook</td></tr>
        <tr><td>czech</td></tr>
        <tr><td>Czech</td></tr>
        <tr><td>took</td></tr>
        <tr><td>Took</td></tr>
        <tr><td>TooK</td></tr>
      </tbody>
    </table>
Notes:
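Sort example in which case-order decides the order of words that differ only in case: with lower-first, czech sorts before Czech, and took sorts before Took and TooK.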
XPath/XSLT Functions
Category                    XPath                          XSLT
type conversion             boolean(), string(), ...       format-number()
arithmetic                  round(), ceiling(), ...        (none)
string manipulation         concat(), substring(), ...     (none)
aggregation                 count(), sum()                 (none)
get node information        local-name(), name(), lang()   generate-id(), unparsed-entity-uri()
Boolean                     not(), false(), true()         (none)
get context information     last(), position()             current()
find nodes                  id()                           document(), key()
get processor information   (none)                         element-available(), function-available(), system-property()
Notes:
Not all XPath functions are listed. Refer to XPath LO.
Functions that can be used when transforming:
document - finds an external document by resolving a URI reference
key - finds the nodes with a given value for a named key; used in conjunction with <xsl:key>
format-number - converts numbers into formatted strings for display
current - returns the single current node
unparsed-entity-uri - gives access to declarations of unparsed entities in the DTD of the source document
generate-id - generates a unique id that identifies the node (the result might differ from one processor to another)
system-property - returns information about the processing environment
element-available - checks if a particular XSLT instruction or element is available
function-available - checks if a function is available; might be used to test whether a certain extension function is available
Other Elements
Attribute value templates
Variable
Parameter
Not Covered
Key
<xsl:apply-imports>: Applies imported stylesheet templates against the current node and its children
Notes:
<xsl:apply-imports> will generate an error condition if <xsl:import>, with its href attribute, has not been declared at the beginning of the stylesheet.
Notes:
An attribute value template is basically a simplified use of <xsl:attribute>.
<xsl:stylesheet version='1.0' ...
  <xsl:template match="/">
    <xsl:apply-templates select="applet" />
  </xsl:template>
  <xsl:template match="applet">
    <applet code="{code/@class}" codebase="{codebase}/java" />
  </xsl:template>
</xsl:stylesheet>
Notes:
In the above XSL template example, if the input XML is:
    <applet> <code class="javaApplet"/> </applet>
the result of the template processing would be:
    <applet code="javaApplet" codebase="/src/code/java"/>
Notice that the class attribute's value was inserted directly into the output stream, as was the codebase value, without further instructions being required in the template. The codebase value shown assumes that the expression {codebase} evaluates to /src/code; the input above does not show where that value comes from.
XSLT Processors
Xalan - www.apache.org
    Was supplied to Apache by Lotus (LotusXSL)
MSXSL - Microsoft
    Internet Explorer 5.x, 6
    Command line
XT (by James Clark, now fading away)
SAXON (from Michael Kay, author of XSLT Programmer's Reference)
For a more extensive list: http://www.w3.org/Style/XSL/
Notes:
List of main XSL processors.
Xalan
Named after a rare Persian musical instrument.
http://www.apache.org
Java version requires JDK/JRE 1.1.8, 1.2.2, or 1.3:
    Comes with Xerces as XML parser
    Can use other XML parsers
Now included in JDK/JRE 1.4.
C++ version available.
XSLTC compiler generates a translet that generally provides better performance.
Notes:
For more information on Xalan, see http://www.apache.org/. Xalan includes XSLTC, a stylesheet compiler that was donated by Sun Microsystems. Normally, XSLT stylesheets are interpreted each time they are used. XSLTC compiles the stylesheet into a set of Java classes, which can speed up stylesheet processing fairly dramatically. XSLTC is not quite XSLT 1.0 compliant, but steady progress towards compliance is being made.
ibm.com/developerWorks/speakers/colan
    XSL by Example presentation, companion files
    Other presentations on XML and Web Services
Notes:
XSL References
Reference                                        Description
http://www.w3.org/Style/XSL/                     W3C Specifications (XSL, XSLT, XPath)
http://www.cranesoftwrights.com/training/#ptux   Practical Transformation Using XSLT and XPath training materials, Ken Holman, Crane Softwrights Ltd
                                                 (Mailing list for general XSL questions)
                                                 Various XSL-specific references
                                                 Chapter 17 of XML Bible, Elliotte Rusty Harold, IDG Books
                                                 XSLT Programmer's Reference, Michael Kay, Wrox Press
                                                 XSLT in a Nutshell, Doug Tidwell, O'Reilly
                                                 XSL Companion, Neil Bradley, Addison-Wesley
Notes:
Checkpoint Questions
1. How can XML documents be transformed?
   A. XPath
   B. XSLT
   C. Notepad
   D. Xatran
2. Is an XSL stylesheet an XML document?
   A. Yes
   B. No
   C. Depends on the header
   D. Only if it is applied to an XML document
3. What template would you use for extracting a specific value from the source tree?
   A. <xsl:choose...
   B. <xsl:copy ...
   C. <xsl:value-of select=...
   D. <xsl:text>
Notes:
Unit Summary
In this unit we learned:
    The XSL/XSLT model and concepts
    Transformation
    XSL templates and pattern matching
    XSL elements and their attributes
    How to create simple XSL style sheets
    Some XSL Best Practices
    XSL Tools
Notes:
Unit Objectives
After completing this unit, you should be able to:
    List the characteristics of an XML document that help determine the right type of database
    Define and describe content management databases
    Compare relational database structures to XML document structures
    List the limitations of relational data tables with structured data
    Define and describe what Object-Oriented databases provide
    Describe the status of XML-based queries
Notes:
Currently, work on database support for XML has lagged at the W3C. Most relational database management systems (RDBMS) use some form of filtering or mapping to deal with XML data.
Considerations
Start with what you are using the database for.
    What type of application are you supporting?
    Is XML being used as a transport between the database and the application?
    Are you using legacy data?
    Are you more interested in the data or in the document structure?
    Are you storing Web pages or Web pages' content?
    Is your data used by other, perhaps non-XML, applications?
    Are you updating the DB from XML?
Notes:
These are questions to think about as you start evaluating which database to choose and how to use it to support your business goals.
Notes:
Examples of data-centric applications are many e-commerce applications and almost all B2B applications. Examples of document-centric applications are publishing applications. An example of a mixed application would be a store selling books, where the information about the shopping cart is very data-oriented, but the information about the books or reviews is very content-oriented.
Types of Databases
Remember: An XML document is a hierarchical, ordered, and untyped document.
Relational Database (RDB) structures are not hierarchical.
    Much of the world's current data exists in RDBs.
Object-Oriented Databases (OODB) are slow to catch on, but show promise of storing XML data objects.
    Existing OODBs may have complex relationships.
Native XML database or content-management systems are designed specifically to store XML.
    Oriented towards the document-oriented XML systems.
Existing database systems must use some type of attachment or filter to deal with XML data.
    Many RDB vendors are building this capability into their products.
Notes:
Databases and XML
Relational Databases - Deal with structures in rows and columns. While in a simple database model it would be easy to map XML structures to match, a problem occurs when a field in the database is related to another row/column structure. While there have been many approaches by database vendors, the incorporation of XML (structured) data is unique to each RDBMS vendor.
Object-Oriented Databases - Deal with object relationships rather than the typical row/column approach of RDBs. While the concept of such databases has piqued the interest of database engineers, it has been slow to catch on in actual usage. OODBs are usually chosen when there are complex relationships in the data which would be difficult to support in an RDB. These complex relationships are also likely to be difficult to map to XML's hierarchical structure.
Native XML database/content management systems - Make storing and retrieving XML very easy, but will not easily support non-XML-oriented applications.
Use the XML parser to create XML DOM from the SQL result set.
Use the XSLT processor to extract the required element from the XML source tree.
XML Apps: XML editing application to update the element contents of the result DOM.
Insert updated element object into source DOM.
Use XSLT to write DOM to SQL.
Parse XML docs with SAX.
Notes:
This example describes how a Java-based application may access an XML document stored in a database table, and then update an element's contents. The main disadvantage of this approach is that XML documents must be extracted then manipulated outside the database by the application (note that this can hinder performance greatly), and then be written back. Note that the above example does not include document validation. In effect, middleware solutions such as XML Extender enable databases to become an XML repository, where many of the above problems are overcome.
Notes:
Many of these challenges do not apply to the case where you are storing XML documents in a single column in an RDB. The challenges that still apply are:
    Character encoding
    Validation
Binary Data
Null Data: In RDBs, null data exist and are different from 0 (zero): a null value means that the data simply does not exist.
Storing Markup:
    <description> <b>Confusing example:</b> <foo/> </description>
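The null data point can be made concrete with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for an RDB; the table layout, the bonus column, and the choice to omit the element for a null value are assumptions of this example (XML Schema's xsi:nil attribute is another convention).

```python
import sqlite3
import xml.etree.ElementTree as ET


def bonus_rows_as_xml():
    """Export rows to XML, keeping NULL distinct from 0."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: one employee has a bonus of 0, the other has no value.
    conn.execute("CREATE TABLE employee (emp_nbr TEXT PRIMARY KEY, bonus INTEGER)")
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?)",
        [("250243", 0), ("432453", None)],
    )
    root = ET.Element("employees")
    query = "SELECT emp_nbr, bonus FROM employee ORDER BY emp_nbr"
    for emp_nbr, bonus in conn.execute(query):
        emp = ET.SubElement(root, "employee", {"ID": emp_nbr})
        # NULL arrives as None: leave the element out instead of writing 0.
        if bonus is not None:
            ET.SubElement(emp, "bonus").text = str(bonus)
    conn.close()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(bonus_rows_as_xml())
```

Writing 0 for the missing value would silently change the meaning of the data when the document is loaded back into a database.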
Department
    dept-nbr   department-name
    X333       XML developers
    Z568       Human Resources
    ...

Employee
    ID       last    first   dept-nbr
    250243   Smith   John    X333
    ...      ...     ...     X333

OR

XML_Table
    Key    XML_Doc
    X333   <department> <dept-nbr>X333</dept-nbr> <department-name>XML developers</department-name> <employee> <last>Smith</last> ...
    Z568   <department> <dept-nbr>Z568 ....
Notes:
This example shows the two major options for storing an XML document in a relational database. In the first option, the document is decomposed: two tables are created to store the information, with the parent element becoming the table name and the child elements mapped to columns. Alternatively, all the information for both the department and the employee could be stored in a single table, but this results in many columns with null values. The second option is to store the XML document as a CLOB, without decomposing it into relational tables, and to provide an ID-based lookup.
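The two options can be sketched side by side. The example below uses Python's built-in sqlite3 module as a stand-in for an RDB and a TEXT column as a stand-in for a CLOB; the column names, the ID child element, and all function names are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Sample document modelled on the slide; the <ID> child element is an assumption.
DOC = (
    "<department><dept-nbr>X333</dept-nbr>"
    "<department-name>XML developers</department-name>"
    "<employee><ID>250243</ID><last>Smith</last><first>John</first></employee>"
    "</department>"
)


def store_both_ways(xml_text):
    """Store one department document decomposed and as a whole document."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE department (dept_nbr TEXT PRIMARY KEY, department_name TEXT)")
    conn.execute(
        "CREATE TABLE employee (id TEXT PRIMARY KEY, last_name TEXT,"
        " first_name TEXT, dept_nbr TEXT)"
    )
    conn.execute("CREATE TABLE xml_table (doc_key TEXT PRIMARY KEY, xml_doc TEXT)")

    dept = ET.fromstring(xml_text)
    dept_nbr = dept.findtext("dept-nbr")

    # Option 1: decompose. Parent elements become tables, children become columns.
    conn.execute(
        "INSERT INTO department VALUES (?, ?)",
        (dept_nbr, dept.findtext("department-name")),
    )
    for emp in dept.findall("employee"):
        conn.execute(
            "INSERT INTO employee VALUES (?, ?, ?, ?)",
            (emp.findtext("ID"), emp.findtext("last"), emp.findtext("first"), dept_nbr),
        )

    # Option 2: keep the document whole and look it up by key.
    conn.execute("INSERT INTO xml_table VALUES (?, ?)", (dept_nbr, xml_text))
    return conn


def last_names_decomposed(conn, dept_nbr):
    """The database answers the question directly with SQL."""
    rows = conn.execute(
        "SELECT last_name FROM employee WHERE dept_nbr = ? ORDER BY last_name",
        (dept_nbr,),
    )
    return [row[0] for row in rows]


def last_names_from_clob(conn, dept_nbr):
    """The application must fetch and parse the whole document first."""
    row = conn.execute(
        "SELECT xml_doc FROM xml_table WHERE doc_key = ?", (dept_nbr,)
    ).fetchone()
    return [e.findtext("last") for e in ET.fromstring(row[0]).findall("employee")]
```

Both lookups return the same names, but only the decomposed form lets the database filter on element content without parsing XML in the application.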
Department
    dept-nbr   department-name
    X333       XML developers
    Z568       Human Resources
    ...

Employee
    ID       last    first   dept-nbr
    250243   Smith   John    X333
    ...      ...     ...     X333

OR

    <employee ID="432453" deptnbr="X333">
      <last>Adams</last>
      <first>Tom</first>
      <phone>544-4444</phone>
      <e-mail>[email protected]</e-mail>
    </employee>
    </department>

XML_Table
    Key    XML_Doc
    X333   <department deptnbr="X333"> <department-name>XML developers</department-name> <employee ID="250243" deptnbr="X333">...
    Z568   <department deptnbr="Z568"> <department-name>...
Notes:
This example shows the same content as the previous example, but now uses attributes to store some of the information.
Notes:
RDB with no XML support: when XML documents are stored in a relational database that has no XML support, there is always the challenge of how to validate the documents. Retrieval and updates might be resource-consuming because the database lacks XML-aware functions.
Object-Oriented Databases
Object-Oriented Database (OODB) features:
    Persistence of objects.
    Extend semantics of O-O programming languages.
    Unification of data model and database structure.
    Requires less code.
    Ease of code base maintenance.
Relational Database (RDB) comparison:
    Data structures must be flattened to fit joined tables.
    Structures maintained in memory.
    No built-in object management.
OODB real-world applications:
    Risk analysis systems, telecom systems, WWW document structures, design and manufacturing systems, hospital patient record systems with complex data interrelationships.
Notes:
For more information and links to OODB resources, see http://www.objenv.com/cetus/oo_data_bases.html.
Notes:
There is no need to compose or decompose the XML document into database columns. Because the data is not decomposed into columns, searches for specific views of the data have to go through each document.
Model-based storage
Store a DOM representation of the XML document in an existing or custom data store.
May use an RDB underneath.
Round trip at the level of the underlying model (can maintain order).
Notes:
Structure of the document remains unchanged.
Database
Notes:
One of the main issues when working with XML documents and databases is character encoding. The document may be in one encoding and the database in a different one. Many databases do not support Unicode and require special setup for non-ASCII characters. There is no general way to solve this problem. You must be aware of it and address it on a case-by-case basis.
DBCS - Double-byte character set (for example, Chinese)
SBCS - Single-byte character set
Unicode - A character coding system designed to support the interchange, processing, and display of the written texts of the diverse languages of the modern world. Unicode characters are normally encoded using 16-bit integral unsigned numbers.
UCS - Universal Multiple-Octet Coded Character Set
UTF-8 - UCS Transformation Format, 8-bit form
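A short sketch shows why a mismatch between the document encoding and the database encoding matters. The sample word and the function names are chosen for illustration only; the encoding names are the standard Python codec names.

```python
# "Résumé", written with escapes so the source file itself stays ASCII.
TEXT = "R\u00e9sum\u00e9"


def byte_lengths(text):
    """Number of bytes the same characters need in three encodings."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "latin-1")}


def misread_utf8_as_latin1(text):
    """Simulate a store that assumes a single-byte encoding for UTF-8 bytes."""
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    print(byte_lengths(TEXT))
    print(misread_utf8_as_latin1(TEXT))
```

The six characters occupy 8 bytes in UTF-8, 12 in UTF-16, and 6 in Latin-1, and reading the UTF-8 bytes with the wrong encoding turns each accented letter into two unrelated characters.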
Notes:
DB2 and XML Extender (DXX): DB2 XML Extender provides a range of functionality for managing XML documents using traditional and nontraditional data. The areas of functionality that XML Extender provides include facilities for storage, fast searching, validation, and composition/decomposition of XML documents.
Notes:
XML documents can be used to describe database schemas, including columns, foreign keys, constraints, and so forth. XML documents may also be used to exchange data between databases from different vendors, because XML is a vendor-independent way to store data. Exchanging data between systems or vendors can be very useful. This could be accomplished, for example, by extracting the schema information and then the data, into one or more XML files. The same procedure can be used to export data from one database and then load it into another.
Probably not the optimal schema for the other system, but a good starting point, or good enough for your use.
Likely to ease mapping between the DB and XML documents.
Design time, not run time, activity.
Notes:
Working through the metadata of one system may make it easier to work with the other system.
CREATE TABLE employee (
    emp_nbr  Char(10) NOT NULL PRIMARY KEY,
    dept_nbr Char(6),
    type     Varchar(40),
    last     Varchar(40),
    first    Varchar(40)
);
Notes:
This example shows how a database table can be represented by an XML document. In this example we show how database columns and keys can be described using XML. Note that this could be accomplished in different ways.
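One possible shape for such a description is sketched below. The XML vocabulary (table, column, and the name, type, nullable, and primaryKey attributes) is invented for this example and is not the layout used on the visual; only the column definitions come from the CREATE TABLE statement above.

```python
import xml.etree.ElementTree as ET

# (name, SQL type, nullable, primary key), copied from the CREATE TABLE statement.
COLUMNS = [
    ("emp_nbr", "Char(10)", False, True),
    ("dept_nbr", "Char(6)", True, False),
    ("type", "Varchar(40)", True, False),
    ("last", "Varchar(40)", True, False),
    ("first", "Varchar(40)", True, False),
]


def describe_table(name, columns):
    """Build an XML description of a table (hypothetical vocabulary)."""
    table = ET.Element("table", {"name": name})
    for col_name, col_type, nullable, primary_key in columns:
        attrs = {
            "name": col_name,
            "type": col_type,
            "nullable": str(nullable).lower(),
        }
        if primary_key:
            attrs["primaryKey"] = "true"
        ET.SubElement(table, "column", attrs)
    return table


if __name__ == "__main__":
    print(ET.tostring(describe_table("employee", COLUMNS), encoding="unicode"))
```

Attributes are used here for the column properties; child elements would work equally well, which is one of the "different ways" the notes refer to.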
Notes:
As of June 7, 2001, the XQuery 1.0 and XPath 2.0 Data Model is a W3C Working Draft; the Algebra and Language specifics have yet to be addressed by the W3C working group. For current information on the Data Model and updates on the other specifications, see http://www.w3.org/XML/Query. The following areas are addressed in the XQuery Requirements specification, a W3C Working Draft (June 2001). For more information on the XQuery Requirements, see http://www.w3.org/TR/xmlquery-req.
Notes:
Human-readable documents: Perform queries on structured documents and collections of documents, such as technical manuals, to retrieve individual documents, to generate tables of contents, to search for information in structures found within a document, or to generate new documents as the result of a query.
Data-oriented documents: Perform queries on the XML representation of database data, object data, or other traditional data sources to extract data from these sources, to transform data into new XML representations, or to integrate data from multiple heterogeneous data sources. The XML representation of data sources may be either physical or virtual; that is, data may be physically encoded in XML, or an XML representation of the data may be produced.
Mixed model documents: Perform both document-oriented and data-oriented queries on documents with embedded data, such as catalogs, patient health records, employment records, or business analysis documents.
Administrative data: Perform queries on configuration files, user profiles, or administrative logs represented in XML.
Stream filtering: Perform queries on streams of XML data to process the data in a manner analogous to UNIX filters. This might be used to process logs of e-mail messages, network packets, stock market data, newswire feeds, EDI, or weather data to filter and route messages represented in XML, to extract data from XML streams, or to transform data in XML streams.
DOM queries: Perform queries on DOM structures to return sets of nodes that meet the specified criteria.
Native XML repositories: Perform queries on collections of documents managed by native XML repositories or web servers.
Catalog search: Perform queries to search catalogs that describe document servers, document types, XML schemas, or documents. Such catalogs may be combined to support search among multiple servers. A document-retrieval system could use queries to allow the user to select server catalogs, represented in XML, by the information provided by the servers, by access cost, or by authorization. Once a server is selected, a retrieval system could query the kinds of documents found on the server and allow the user to query those documents.
Multiple syntactic environments: Queries may be used in many environments. For example, a query might be embedded in a URL, an XML page, or a JSP or ASP page; represented by a string in a program written in a general-purpose programming language; provided as an argument on the command-line or standard input; or supported by a protocol, such as DASL or Z39.50.
More Information
Reference: http://www-106.ibm.com/developerworks/xml/library/x-matters8/index.html
Description: Putting XML in context with hierarchical, relational, and object-oriented models by David Mertz

Reference: http://www.rpbourret.com/xml/XMLAndDatabases.htm#intro
Description: XML and Databases by Ronald Bourret

Reference: http://www.rpbourret.com/xml/XMLDatabaseProds.htm#xmlservers
Description: XML Database products by Ronald Bourret

Reference: http://www-106.ibm.com/developerworks/library/x-struct/
Description: XML Structures for Existing Databases by Kevin Williams and others

Reference: http://www.xml.com/pub/a/2001/05/09/dtdtodbs.html
Description: Mapping DTDs to Databases by Ronald Bourret
Notes:
Checkpoint Questions (1 of 2)
1. How can an XML document be stored in an RDB? (Select all that apply.)
   A. In a table column (CLOB)
   B. SGML
   C. Decomposed into different columns/tables
   D. Into a DTD file
   E. Compressed into an integer column
2. While RDBs are row-based, XML documents are:
   A. Record based
   B. Hierarchical
   C. Obsolete
   D. Rectangular
Notes:
Checkpoint Questions (2 of 2)
3. I should use an RDB to store my XML if: (Select all that apply.)
   A. I have lots of proprietary file formats
   B. I need to retrieve a large number of documents based on a specific element
   C. I need to exchange data with a business partner
   D. I need to represent my data in Esperanto
Notes:
Unit Summary
In this unit we learned:
    How to compare relational database structures to XML document structures.
    The limitations of relational data tables with structured data.
    What Object-Oriented databases provide.
    The status of XML-based queries.
Notes:
Notes:
(1) To retain compatibility between XML Schema and XML 1.0 DTDs, the simple types ID, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, NMTOKEN, NMTOKENS should only be used in attributes.
(2) A value of this type can be represented by more than one lexical format; for example, 100 and 1.0E2 are both valid float formats representing "one hundred". However, rules have been established for this type that define a canonical lexical format; see XML Schema Part 2.
(3) Newline, tab, and carriage-return characters in a normalizedString type are converted to space characters before schema processing.
(4) As normalizedString, and adjacent space characters are collapsed to a single space character, and leading and trailing spaces are removed.
(5) The "g" prefix signals time periods in the Gregorian calendar.
Notes:
Always refer to the current release of the Specification and associated Primer(s), if any, for normative use of schema components.
Pattern Facet
<schema xmlns='http://www.w3.org/2001/XMLSchema'>
  <simpleType name='DNAType'>
    <restriction base='string'>
      <pattern value='(A|G|T|C)+'/>
    </restriction>
  </simpleType>
  <element name='DNA' type='DNAType'/>
</schema>

Valid XML:
  <DNA>AGCTATATACGGTAACGTA</DNA>
Invalid XML:
  <DNA>AGTBCTGAEC</DNA>
  <DNA>10100011011</DNA>
Notes:
The pattern facet allows creation of a restriction of the string simple type by the specification of a regular expression. A regular expression specifies a set of strings using a pattern. Only strings that match the regular expression are valid instances of that data type. The value attribute of the <pattern> facet tag holds the regular expression. The regular expression syntax used by XML Schema is based on Perl regular expressions, but is not identical; the syntax has been extended to cope with Unicode characters and expressions on Unicode strings.
In this example we're defining the new type 'DNAType' as a restriction of the string type. We're using the pattern facet as the constraint here. The value attribute of the <pattern> element is the regular expression (A|G|T|C)+. The meaning of this regular expression is "one or more occurrences (denoted by the + at the end) of A or (denoted by the vertical bar '|') G or T or C".
There is a detailed explanation of the Schema regular expression language in Part 2 of the XML Schema specification at http://www.w3.org/TR/xmlschema-2.
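The same regular expression can be tried out with Python's re module. This is a sketch rather than a schema validator: a schema pattern must match the whole value, so the example uses fullmatch to get the same effect, and the function name is_valid_dna is made up for illustration.

```python
import re

# Same expression as the value attribute of the <pattern> facet.
DNA_PATTERN = re.compile(r"(A|G|T|C)+")


def is_valid_dna(value):
    """True if the whole string consists of one or more of A, G, T, C."""
    # fullmatch anchors the expression at both ends, as a schema pattern does.
    return DNA_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for sample in ("AGCTATATACGGTAACGTA", "AGTBCTGAEC", "10100011011"):
        print(sample, is_valid_dna(sample))
```

Using search or match instead of fullmatch would wrongly accept AGTBCTGAEC, because its first three letters satisfy the expression on their own.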
Notes:
All the simple types we've seen so far are 'scalar' types. That is, they are not decomposable into any smaller units. The next set of simple types we will look at are the aggregate data types, which can be broken down into smaller units. The first of these types is the list type. The list type allows us to define a simple type that contains a list of values. These values must be drawn from a single simple type, and the lexical space of that type must not allow whitespace, because the values in the list are separated at whitespace boundaries (that is, wherever there is whitespace).
Built-in list types
XML Schema contains a number of built-in list types. All of these list types correspond to lists that were found in XML 1.0 DTDs. The full list of the built-in list types is NMTOKENS, IDREFS, ENTITIES.
Notes:
Here is a schema that shows how to declare a list type. We're going to use the 'DNAType' simpleType from the earlier visual. This schema declares a new element called 'DNASamples', which is a list of type 'DNAType'. The elements of a list are separated by whitespace. Note that this simpleType definition is not a restriction of any other base type. The invalid samples are invalid because they use something other than whitespace to separate list elements.
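The whitespace rule can be sketched in a few lines of Python. The sample values below are invented, since the visual's own samples are not reproduced here, and is_valid_dna_list is a hypothetical helper, not part of any schema processor.

```python
import re

# Item type: the DNAType pattern from the earlier visual.
DNA_PATTERN = re.compile(r"(A|G|T|C)+")


def is_valid_dna_list(value):
    """Split on whitespace, then check every item against the item type."""
    items = value.split()  # any run of spaces, tabs, or newlines separates items
    # An empty list is accepted here because no length facet is modelled.
    return all(DNA_PATTERN.fullmatch(item) for item in items)


if __name__ == "__main__":
    print(is_valid_dna_list("AGCT GGTA CCA"))  # whitespace-separated items
    print(is_valid_dna_list("AGCT,GGTA,CCA"))  # comma is not a separator
```

With a comma as separator the whole string is treated as a single item, and that item fails the DNAType pattern.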
Notes:
The union simple type doesn't correspond to any type in XML 1.0 DTDs. The union type allows you to create a new simple type whose instances must match the rules for one of the member simple types specified in the union. The union type in this example creates an interval with a "hole". We first define a new restriction of the integer type, which contains all the integers greater than five. Notice the use of the minExclusive facet to exclude the integer 5 from the value space of the new type. We then define another restriction of integer, which consists of all the integers less than zero. Note again the use of maxExclusive to exclude 0 from the value space of the new type. We can now create a union type which includes both 'biggerThanFive' and 'lessThanZero', leaving a hole in the integers consisting of 0, 1, 2, 3, 4, and 5. Union type definitions are not confined to restrictions of the same base type. We could also have included a restriction of string in the union type, if we so desired.
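The "hole" is easy to check with a small sketch. The function names mirror the type names 'biggerThanFive' and 'lessThanZero' from the notes, but the code itself is only an illustration in Python, with a simplified integer syntax (optional sign followed by ASCII digits).

```python
import re

# Simplified lexical form of an integer: optional sign, then digits.
INTEGER = re.compile(r"[+-]?[0-9]+")


def parse_integer(text):
    """Return the integer value, or None if the text is not an integer."""
    return int(text) if INTEGER.fullmatch(text) else None


def bigger_than_five(text):
    """Member type 1: integers with minExclusive 5."""
    number = parse_integer(text)
    return number is not None and number > 5


def less_than_zero(text):
    """Member type 2: integers with maxExclusive 0."""
    number = parse_integer(text)
    return number is not None and number < 0


def in_union(text):
    """A value is valid if it is valid for at least one member type."""
    return any(member(text) for member in (bigger_than_five, less_than_zero))


if __name__ == "__main__":
    for sample in ("-1", "0", "3", "5", "6"):
        print(sample, in_union(sample))
```

The values 0 through 5 fail both member checks, which is exactly the hole described above.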
Notes:
This visual shows valid and invalid values for the resulting union type.
Attribute Groups
Definition and use of an attribute group
<schema xmlns='http://www.w3.org/2001/XMLSchema'>
  <attributeGroup name='range'>
    <attribute name='minimum' type='integer' use='required'/>
    <attribute name='maximum' type='integer' use='required'/>
  </attributeGroup>
  <element name='gauge'>
    <complexType>
      <sequence>
        <element name='label' type='string'/>
      </sequence>
      <attributeGroup ref='range'/>
    </complexType>
  </element>
</schema>

Valid XML:
  <gauge minimum='0' maximum='90'>
    <label>Speed</label>
  </gauge>

Invalid XML:
  <gauge minimum='0'>
    <label>Pressure</label>
  </gauge>
Figure B-7. Attribute Groups
Copyright IBM Corporation 2004
Notes:
The attribute group component is relatively straightforward. The start and end tags for attributeGroup bracket the set of attribute declarations that make up the group. The start tag provides a name attribute for naming the attribute group. To actually use the attribute group, a complex type definition includes an attributeGroup element with a ref attribute that names the attribute group to be used.
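To see the effect of the two required attributes, the gauge schema can be exercised with the javax.xml.validation API of a current JDK. This is a sketch under assumptions rather than part of the course tooling: the class name AttributeGroupCheck and its method names are invented for the example, and the schema is written with an xsd: prefix so that ref='range' resolves to the attribute group declared in the same schema.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class AttributeGroupCheck {
    // The gauge schema, with the 'range' attribute group referenced from the complex type.
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + " <xsd:attributeGroup name='range'>"
        + "  <xsd:attribute name='minimum' type='xsd:integer' use='required'/>"
        + "  <xsd:attribute name='maximum' type='xsd:integer' use='required'/>"
        + " </xsd:attributeGroup>"
        + " <xsd:element name='gauge'>"
        + "  <xsd:complexType>"
        + "   <xsd:sequence><xsd:element name='label' type='xsd:string'/></xsd:sequence>"
        + "   <xsd:attributeGroup ref='range'/>"
        + "  </xsd:complexType>"
        + " </xsd:element>"
        + "</xsd:schema>";

    // Compiles the schema text above.
    static Schema compileSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    // Returns true when the given instance document is valid against the gauge schema.
    static boolean isValid(String instanceXml) {
        try {
            compileSchema().newValidator()
                    .validate(new StreamSource(new StringReader(instanceXml)));
            return true;
        } catch (SAXException e) {
            return false; // a required attribute is missing, or an unknown one is present
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String both = "<gauge minimum='0' maximum='90'><label>Speed</label></gauge>";
        String onlyMinimum = "<gauge minimum='0'><label>Pressure</label></gauge>";
        System.out.println("both attributes: " + isValid(both));
        System.out.println("minimum only:    " + isValid(onlyMinimum));
    }
}
```

Because both attributes are declared with use='required', leaving out maximum, or misspelling either attribute name, makes the instance invalid.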
Annotation
Can be applied to elements, attributes, groups, attribute groups, simple types, complex types, wildcards.
<schema xmlns='http://www.w3.org/2001/XMLSchema'
        xmlns:ibmls='http://www.ibm.com/LS/types'>
  <element name='widget' type='ibmls:widgetType'>
    <annotation>
      <documentation>
        This element is directly serializable into a Java class.
      </documentation>
      <appinfo>
        <java-serializer>
          com.ibm.ils.WD03.WidgetSerializer
        </java-serializer>
      </appinfo>
    </annotation>
  </element>
</schema>
Notes:
XML Schema provides the <annotation> element for adding information about the schema components in an XML Schema document. The <annotation> element can have two kinds of child elements. The <documentation> element allows the author of a schema document to add human-readable documentation to the component. The <appinfo> element allows annotations that are directed at computer programs that may process the schema. This may be a schema validator, or it may be another program. In the example on this visual, we are annotating an element declaration and providing some human-readable documentation. We also provide some information to a program that can take the schema information and use it to serialize and deserialize the element as a Java class. Here our <appinfo> provides the name of a Java class that can do the serialization. This <appinfo> information can be processed by an application that takes the schema file and looks for the <appinfo> element associated with a particular component. The <annotation> element can be applied to elements, attributes, groups, attribute groups, simple types, complex types, and wildcards, and is typically the first element to appear in the definition of one of these components.
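As an illustration of an application that looks for the <appinfo> associated with a component, the sketch below reads the schema as an ordinary XML document through the JAXP DOM API and pulls out the text of the <java-serializer> entry. The class name AppInfoReader and the method serializerFor are hypothetical, and the code assumes a current JDK (getTextContent is a DOM Level 3 method). It only parses the schema document and does not compile it, so the ibmls:widgetType reference does not need to be resolvable.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AppInfoReader {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    // The schema from the visual, kept in its default-namespace form.
    static final String SAMPLE_SCHEMA =
          "<schema xmlns='http://www.w3.org/2001/XMLSchema'"
        + " xmlns:ibmls='http://www.ibm.com/LS/types'>"
        + " <element name='widget' type='ibmls:widgetType'>"
        + "  <annotation>"
        + "   <documentation>This element is directly serializable into a Java class.</documentation>"
        + "   <appinfo><java-serializer> com.ibm.ils.WD03.WidgetSerializer </java-serializer></appinfo>"
        + "  </annotation>"
        + " </element>"
        + "</schema>";

    // Returns the trimmed <java-serializer> text for the named element declaration,
    // or null when the declaration or the appinfo entry is not present.
    static String serializerFor(String schemaText, String elementName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(schemaText)));
            NodeList declarations = doc.getElementsByTagNameNS(XSD_NS, "element");
            for (int i = 0; i < declarations.getLength(); i++) {
                Element declaration = (Element) declarations.item(i);
                if (!elementName.equals(declaration.getAttribute("name"))) {
                    continue;
                }
                NodeList appinfos = declaration.getElementsByTagNameNS(XSD_NS, "appinfo");
                for (int j = 0; j < appinfos.getLength(); j++) {
                    // "*" accepts the entry whatever namespace it was written in.
                    NodeList entries = ((Element) appinfos.item(j))
                            .getElementsByTagNameNS("*", "java-serializer");
                    if (entries.getLength() > 0) {
                        return entries.item(0).getTextContent().trim();
                    }
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("could not read schema", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializerFor(SAMPLE_SCHEMA, "widget"));
    }
}
```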
Copyright IBM Corp. 2001, 2004 Appendix B. Additional Information for XML Schema
Terms (1 of 6)
abstract - must be implemented.
anySimpleType - A conceptual datatype; the simple version of the ur-type definition from XML Schema Part 1: Structures. anySimpleType can be considered as the base type of all primitive types. The value space of anySimpleType can be considered to be the union of the value spaces of all primitive datatypes.
anyType - see anySimpleType above and ur-type definition.
assessment - Used to refer to the overall process of local validation, schema-validity assessment and infoset augmentation.
attributeFormDefault
atomic datatypes - Datatypes having values that are regarded as being indivisible.
base type - Every datatype that is derived by restriction is defined in terms of an existing datatype, referred to as its base type. "base types" can be either primitive or derived.
base type definition - A type definition used as the basis for an extension or restriction is known as the base type definition of that definition.
blockDefault
complexContent - contains only elements
complexType element - These may allow elements; they may carry attributes.
constraining facet - An optional property that can be applied to a datatype to constrain its value space. AKA non-fundamental facet.
Notes:
See the specifications for additional information. This material is provided as an aid only. AKA means "also known as".
Terms (2 of 6)
datatype - A 3-tuple, consisting of a) a set of distinct values, called its value space, b) a set of lexical representations, called its lexical space, and c) a set of facets that characterize properties of the value space, individual values or lexical items.
declarations - Enable elements and attributes to appear in document instances; both simple and complex.
definitions - Create new types; both simple and complex.
derived datatypes - Those that are defined in terms of other datatypes.
derived by list - A list datatype can be derived from another datatype (its itemType) by creating a value space that consists of a finite-length sequence of values of its itemType.
derived by restriction - A datatype is said to be derived by restriction from another datatype when values for zero or more constraining facets are specified that serve to constrain its value space and/or its lexical space to a subset of those of its base type.
derived by union - One datatype can be derived from one or more datatypes by unioning their value spaces and, consequently, their lexical spaces.
elementFormDefault
element substitution groups
extension - A complex type definition which allows element or attribute content in addition to that allowed by another specified type definition is said to be an extension.
Terms (3 of 6)
facet - A single defining aspect of a value space. Generally speaking, each facet characterizes a value space along independent axes or dimensions. Facets are of two types: fundamental facets that define the datatype and non-fundamental or constraining facets that constrain the permitted values of a datatype.
finalDefault
fundamental facet - An abstract property which serves to semantically characterize the values in a value space.
global element - An element that is a subelement of the schema/document/main/root element only; that is, one of the elements whose scope is immediately below that of schema, itself.
infoset - See the infoset specification at: http://www.w3.org/TR/2001/WD-xml-infoset-20010316.
itemType - The atomic datatype that participates in the definition of a list datatype is known as the itemType of that list datatype.
lexical space - the set of valid literals for a datatype.
list datatypes - Datatypes having values each of which consists of a finite-length (possibly empty) sequence of values of an atomic datatype. A list datatype can be derived from another datatype (its itemType) by creating a value space that consists of a finite-length sequence of values of its itemType.
NCName - Represents XML "non-colonized" (no :) Names.
Notes:
See the specifications for additional information. This material is provided as an aid only.
Terms (4 of 6)
NMTOKEN - NMTOKEN represents the NMTOKEN attribute type from XML 1.0 (Second Edition). The value space of NMTOKEN is the set of tokens that match the Nmtoken production in XML 1.0 (Second Edition). The lexical space of NMTOKEN is the set of strings that match the Nmtoken production in XML 1.0 (Second Edition). The base type of NMTOKEN is token.
non-fundamental facet - see constraining facet.
normalized value (of an element or attribute information item) - an initial value whose white space, if any, has been normalized according to the value of the whiteSpace facet of the simple type definition used in its validation:
  preserve - No normalization is done, the value is the normalized value;
  replace - All occurrences of #x9 (tab), #xA (line feed) and #xD (carriage return) are replaced with #x20 (space);
  collapse - Subsequent to the replacements specified above under replace, contiguous sequences of #x20s are collapsed to a single #x20, and initial and/or final #x20s are deleted.
particle - element declaration, wildcard or model group. There are three varieties of model group:
  Sequence (the element information items match the particles in sequential order);
  Conjunction (the element information items match the particles, in any order);
  Disjunction (the element information items match one of the particles).
primitive datatypes - Those that are not defined in terms of other datatypes; they exist ab initio.
QName - Represents XML qualified names. The value space of QName is the set of tuples {namespace name, local part}, where namespace name is an anyURI and local part is an NCName. The lexical space of QName is the set of strings that match the QName production of [Namespaces in XML].
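The three whiteSpace settings described under "normalized value" can be written out as a small function. This is a sketch only, not how any particular schema processor implements the facet; the class name WhiteSpaceFacet and its Mode enum are invented for the example.

```java
final class WhiteSpaceFacet {
    // The three values of the whiteSpace facet.
    enum Mode { PRESERVE, REPLACE, COLLAPSE }

    // Applies the whiteSpace rules to an initial value and returns the normalized value.
    static String normalize(String initialValue, Mode mode) {
        if (mode == Mode.PRESERVE) {
            return initialValue;                       // no normalization at all
        }
        // replace: tab (#x9), line feed (#xA) and carriage return (#xD) become spaces
        String replaced = initialValue.replaceAll("[\\t\\n\\r]", " ");
        if (mode == Mode.REPLACE) {
            return replaced;
        }
        // collapse: after the replacements, squeeze runs of spaces to one space
        // and delete a leading and/or trailing space
        return replaced.replaceAll(" {2,}", " ").replaceAll("^ | $", "");
    }

    public static void main(String[] args) {
        String raw = "  a\tb \r\n";
        System.out.println("[" + normalize(raw, Mode.REPLACE) + "]");
        System.out.println("[" + normalize(raw, Mode.COLLAPSE) + "]");
    }
}
```

String.trim() is deliberately avoided in the collapse step, because it would also strip characters other than #x20.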
Terms (5 of 6)
restriction - A type definition whose declarations or facets are in a one-to-one relation with those of another specified type definition, with each in turn restricting the possibilities of the one it corresponds to, is said to be a restriction. The specific restrictions might include narrowed ranges or reduced alternatives. Members of a type, A, whose definition is a restriction of the definition of another type, B, are always members of type B as well.
schema component - this is the generic term for the building blocks that comprise the abstract data model of the schema.
schemata -
simple types - do not allow elements; may not carry attributes; e.g., a built-in type.
simpleType element - The XML representation for a Simple Type Definition schema component is a <simpleType> element information item.
TOKEN - Represents tokenized strings. The value space of token is the set of strings that do not contain the line feed (#xA) nor tab (#x9) characters, that have no leading or trailing spaces (#x20) and that have no internal sequences of two or more spaces. The lexical space of token is the set of strings that do not contain the line feed (#xA) nor tab (#x9) characters, that have no leading or trailing spaces (#x20) and that have no internal sequences of two or more spaces. The base type of token is normalizedString.
Type Definition Hierarchy - Except for a distinguished ur-type definition, every type definition is, by construction, either a restriction or an extension of some other type definition. The graph of these relationships forms a tree known as the Type Definition Hierarchy.
union datatypes - Datatypes whose value spaces and lexical spaces are the union of the value spaces and lexical spaces of one or more other datatypes.
Notes:
See the specifications for additional information. This material is provided as an aid only.
Terms (6 of 6)
ur-type definition - A distinguished ur-type definition is present in each XML Schema, serving as the root of the type definition hierarchy for that schema. The ur-type definition, whose name is anyType, has the unique characteristic that it can function as a complex or a simple type definition, according to context. Specifically, restrictions of the ur-type definition can themselves be either simple or complex type definitions.
validation - the word valid and its derivatives are used to refer to determining local schema-validity, that is whether an element or attribute information item satisfies the constraints embodied in the relevant components of an XML Schema.
value space - The set of values for a given datatype. Each value in the value space of a datatype is denoted by one or more literals in its lexical space.
XML Schema - A set of schema components.
Notes:
See the specifications for additional information. This material is provided as an aid only.
Unit Objectives
Discuss installation and migration
Describe new features
  JDK 1.4.1
  J2EE improvements
  Web Services changes
List improvements for WebSphere Studio v5.1.1
Notes:
For this presentation we will look at the main themes of WebSphere Studio v5.1.1. Along with these themes, there are also a number of new features which are spread throughout the tool. We will describe these new features briefly.
Install
If WebSphere Studio v5.1 is installed, v5.1.1 will automatically be installed over the existing installation.
The install will remove the v5.1 installation before installing v5.1.1.
Corresponding third-party plug-ins for v5.1.1 will need to be installed.
Upgrade to Remote Agent Controller v5.1.1 recommended.
Notes:
If WebSphere Studio Application Developer Version 5.1 is detected by the installation program, it is automatically upgraded to Version 5.1.1. The install removes the 5.1 installation before installing 5.1.1; this is transparent to the user during install. Version 5.1.1 uses the same registry keys as 5.1 and replaces 5.1 in Add/Remove Programs, so 5.1 no longer exists on your system after installing 5.1.1.
Notes:
One of the operating systems listed on this page must be installed before you install WebSphere Studio Application Developer v5.1.1. You will need a Web browser to view the readme files, the Installation Guide, and the Migration Guide. For information about supported database servers, Web application servers, and other software products, see the readme file located in the root of both the installation CD and the product installation directory.
Migration
Migration Guide is included
With Studio Application Developer Version 5.1.1, you can migrate code from 5.1 automatically
Migrate code manually
VisualAge for Java
WebSphere Studio "Classic"
WebSphere Studio Application Developer Version 4.0.x
WebSphere Studio Application Developer Version 5 Beta, Early Availability, or General Availability
WebSphere Studio Application Developer Version 5.0.1
WebSphere Studio Application Developer Version 5.1
Notes:
If you install Application Developer in the default location the migration guide can be located here: C:\Program Files\IBM\WebSphere Studio\Application Developer\v5.1.1\migrate.html. WebSphere Studio Application Developer Version 5.1 can coexist with WebSphere Studio Application Developer Version 5.0.x or earlier. For coexistence, you can install into a different directory. WebSphere Studio Application Developer can coexist with other WebSphere Studio products. For instructions on safely migrating your existing projects from a previous version of WebSphere Studio Application Developer to Version 5.1.1, refer to the Migration Guide. As a precaution, it is recommended that you make a backup copy of your old workspaces prior to migrating to WebSphere Studio Application Developer Version 5.1.1.
Notes:
New to v5.1.1 is the corresponding WebSphere v5.1 Test Environment. While the version numbers may not match, WebSphere Studio v5.1.1 is the primary development tool for WebSphere Application Server v5.1 and includes specific wizards, editors, and tools to build applications which can utilize the latest support in WebSphere Application Server v5.1. The optional Universal Test Environment provides added flexibility at install time for users who wish to deploy to particular servers. One of the largest enhancements in WebSphere Studio v5.1.1 is the support for JavaServer Faces (JSF). JSF is currently in beta, and is expected to be final in 1Q2004. An added enhancement to the Struts tools is that the Struts tags are now part of the Page Designer palette. For the Application Templates Wizard, new code generators targeting new platforms are now available. They generate the following applications: a standard Struts 1.1 application; an Edit mode of a portal application to be tested and deployed on WPS; and an XHTML-MP based application to target pervasive devices.
A filter can be specified in the database connection dialog, allowing users who have a database on an iSeries machine to connect to it. The Page Designer has two new functions: support for WBMP file rendering, and a Page Template File Creation Wizard. The Web Site Designer has two new functions: you can now use the Object Palette to insert Web Site Designer objects, and you can now specify a Servlet URL. For the new JDBC error log, when a catalog import is launched from the DB Servers pane in the Data Perspective, an error dialog may be displayed upon completion. Though Application Developer 5.1.1 runs on JDK 1.3.1, it has support for JDK 1.4.1.
WebSphere Studio Application Developer 5.1.1 has inherited updates from WebSphere Studio Workbench, which consist mostly of bug fixes. WebSphere Studio Workbench 2.1.2 is a maintenance release to fix serious defects present in releases 2.1.0 and 2.1.1. These changes only affect some plug-ins and features. Modified plug-ins have version id 2.1.2; plug-ins unchanged since the 2.1 release still have version id 2.1.0; plug-ins unchanged since the 2.1.1 release still have version id 2.1.1. Note, however, that all features now have version id 2.1.2 (even if none of their plug-ins changed). Maintenance release 2.1.2 includes all fixes made in 2.1.1. For a list of bug fixes go to: http://www.eclipse.org/eclipse/development/readme_eclipse_2_1_2.html#DefectsFixed
The last main theme of WebSphere Studio v5.1 does not involve new wizards or tools, but performance improvements. Performance has been improved greatly in many use cases and scenarios. Installation and uninstall have been improved, as have startup and shutdown times. Many of the wizards, tools, and editors have been improved and will be noticeably faster to the end user.
Notes:
The 5.1.1 release supports the following WebSphere Application Server Universal Test Environments (WTEs):
WAS v4 WTE - v4.0.7 (JDK 1.3.1)
WAS v5.0 WTE - v5.0.2 (JDK 1.3.1)
WAS v5.1 WTE - v5.1 (JDK 1.4.1)
WAS Express WTE - 5.0.2
WAS Express WTE - 5.1
In 5.1.1 all five WTEs (versions 4.0.7, 5.0.2, and 5.1) are optional, therefore the JDKs and relevant WAS jars are available at compile time. The user decides which WTEs to install on their machine, and which versions, if any, to test only via remote deploy. The reason for making all the WTEs optional is that we expect that soon after the WebSphere Studio Application Developer 5.1.1 release, most customers will be building/deploying WAS v5 apps, with only a few building JDK 1.4.1/WAS 5.1 apps. However, through the
lifetime of WebSphere 5.1 and WebSphere Studio Application Developer release (at least three years), the mix will change. The implication of the WAS 5.0.2 and WAS 5.1 WTEs being optional is that we need to provide the JDK and WAS jars needed at compile time in the Application Developer 5.1.1 install. Each runtime will have a stub directory which will contain the runtime jars needed to compile against, as well as the server configuration used for remote support. If a test environment is available, we'll use that for the server target (build path) and configuration instead, so these directories are really just used as a backup. The new version of WAS WTE (v5.1) includes J9 support for hot code replace and full-speed debug.
Notes:
JavaServer Faces (JSF) is an emerging standard (JSR 127) that provides a GUI framework for developing J2EE applications. This new technology is the basis for our RAD experience in Application Developer. Application Developer contains a set of JSF components to improve usability and to enable a low-end developer such as an HTML coder, JavaScripter, or Lotus Notes developer to build dynamic JSPs with minimal coding and a reduced level of Java language skills. JSF provides many built-in functions such as input validation, switching to an alternative markup renderer, maintaining client session state, error handling, and event handling. JSF and WDO tooling is in beta state in 5.1.1 because JSR 127 is not yet finalized. For that reason we will not support customer apps built with 5.1.1 JSF or WDO components in later versions. The components are labeled beta in the UI on the palette.
Notes:
With the simple, well-defined programming model that JavaServer Faces technology provides, developers of varying skill levels can quickly and easily build Web applications by: assembling reusable UI components in a page, connecting these components to an application data source, and wiring client-generated events to server-side event handlers. With the power of JavaServer Faces technology, these web applications handle all of the complexity of managing the user interface on the server, allowing the application developer to focus on their application code.
Notes:
EJB Client JAR support.
Server targeting for J2EE projects, which allows a project to get the JDK JARs and all public JARs from the server on its build path.
EJB snippet support has been added to aid in the generation of EJB client access code. Snippets are useful because they provide code with variable replacements; each variable has a description and a default value. Initially there will be two EJB snippets: "Call an EJB create method" and "Call an EJB find method".
EJB Reference wizard support has been improved to allow for the creation of cross-EAR EJB references. This enhancement also relates to the EJB Client JAR creation mechanism.
The EJB bottom-up mapping scenario has been improved to allow the bottom-up tooling to run when enterprise beans already exist.
Minor Improvements/Enhancements
New Web Site Designer functions
  You can now use the Object Palette to insert Web Site Designer objects
  You can now specify a Servlet URL
Struts tags in the Page Designer palette
  For Struts tools, the Struts tags are now part of the Page Designer palette.
New JDBC error log
  When a catalog import is launched from the DB Servers pane in the Data Perspective, the following error dialog may be displayed upon completion
Notes:
The Web Site Designer has two new functions:
  You can now use the Object Palette to insert Web Site Designer objects.
  You can now specify a Servlet URL.
Struts tags in the Page Designer palette: for Struts tools, the Struts tags are now part of the Page Designer palette.
New JDBC error log: when a catalog import is launched from the DB Servers pane in the Data Perspective, the following error dialog may be displayed upon completion:
  Problems encountered while importing from catalog. Reason: One or more problems were reported while accessing the database.
Web Services
New Seoul Discovery Dialog
WSDK v5.1 - Command line tools have a refresh from WSAD 5.1
  WSDL2Client command enables you to build a client program from WSDL (all you need is the location of the WSDL file)
  The WSDL can be local or on the Web
WS-I
  In 5.1 you could set the preference for the whole workspace
  In 5.1.1 you can set the preference at the project level
  Default is warning
Update in Deployment Descriptor Model
  See Editor
Notes:
The main enhancement for the Web Service tooling is the Seoul Discovery Dialog, which can be used on a JavaServer Page. The WSDL2Client tool generates fully deployable Web service clients from one or more WSDL documents and optionally deploys them to the application server. To use this tool you need a WSDL file whose fully qualified path does not contain a space; otherwise the compile script will not run properly. The WS-I Basic Profile is an outline of requirements to which WSDL and Web service protocol (SOAP/HTTP) traffic must comply in order to claim WS-I conformance. In 5.1 you could set this preference only for the whole workspace. In 5.1.1 you can set it at the project level.
Notes:
Here's a look at what is new in the language that underlies WebSphere 5.1. The Logging APIs introduced in SDK 1.4.1 are not integrated into WebSphere 5.1, but are available for customer code to use, and are expected to be supported in a future release of WebSphere. Full details can be found in the Java 2 SDK 1.4.1 feature list at the web site listed on the slide: http://java.sun.com/j2se/1.4.1/docs/relnotes/features.html.
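For customer code that wants to try the Logging APIs, a minimal sketch of java.util.logging follows. The logger name com.example.orders, the class name LoggingSketch, and the CollectingHandler helper are assumptions made for the example; the syntax targets a current JDK rather than 1.4.1, although the java.util.logging classes used here were introduced with the 1.4 platform.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LoggingSketch {
    // A hypothetical handler that keeps formatted records in memory instead of printing them.
    static final class CollectingHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                lines.add(record.getLevel() + ": " + record.getMessage());
            }
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    // Logs three messages at different levels and returns the ones that were recorded.
    static List<String> logSample() {
        Logger logger = Logger.getLogger("com.example.orders"); // hypothetical logger name
        logger.setUseParentHandlers(false);  // keep the sample off the console
        logger.setLevel(Level.INFO);         // records below INFO are discarded
        CollectingHandler handler = new CollectingHandler();
        logger.addHandler(handler);
        try {
            logger.info("order received");
            logger.fine("details omitted");  // FINE is below INFO, so this is dropped
            logger.warning("stock is low");
        } finally {
            logger.removeHandler(handler);
        }
        return handler.lines;
    }

    public static void main(String[] args) {
        for (String line : logSample()) {
            System.out.println(line);
        }
    }
}
```

The logger's level acts as a filter before any handler is called, which is why only the INFO and WARNING messages reach the handler.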
Notes:
JDK 1.4.1 changed some APIs and added methods that conflict with existing WebSphere APIs: the new methods have the same method name and input signature as the WebSphere ones, but a different return type. Therefore some of the WebSphere APIs have been modified.
Notes:
The Java API for XML Processing (JAXP) has been added to the Java 2 Platform. It provides basic support for processing XML documents through a standardized set of Java Platform APIs. JDK 1.4.1 includes the JAXP API and parser/transformer implementations. The versions of JAXP, SAX, and DOM shipped with the IBM JDK are: Xerces 4.2.2, JAXP 1.2, SAX 2.01, DOM 2. The versions shipped with non-IBM JDKs are: Crimson, JAXP 1.1, SAX 2.0, DOM 2. Note: If you move from WebSphere v4.x to WebSphere v5.1 and your code uses the Apache Xerces APIs directly, its behavior may change, even if the same code worked when going from WebSphere v4.x to WebSphere v5.0.x.
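Code that goes through the JAXP factories, rather than naming a parser implementation such as Xerces or Crimson directly, is less exposed to the change described in the note. A minimal sketch follows; the class name JaxpSketch, the method labelOf, and the sample document are assumptions made for the example, and getTextContent requires a newer DOM level than the DOM 2 shipped with JDK 1.4.1, so the code as written targets a current JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class JaxpSketch {
    // Parses the document with whatever parser the JAXP factory selects and
    // returns the text of the first <label> element, or null if there is none.
    static String labelOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Node label = doc.getElementsByTagName("label").item(0);
            return label == null ? null : label.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<gauge minimum='0' maximum='90'><label>Speed</label></gauge>";
        System.out.println(labelOf(xml));
    }
}
```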
Notes:
There are two new security features. The Java GSS-API can be used for securely exchanging messages between communicating applications using the Kerberos V5 mechanism. The Java Certification Path API includes new classes and methods in the java.security.cert package that allow you to build and validate certification paths (also known as certificate chains). Due to import control restrictions, the JCE jurisdiction policy files shipped with the Java 2 SDK, v 1.4 allow strong but limited cryptography to be used. An unlimited version of these files, indicating no restrictions on cryptographic strengths, is available. The JSSE implementation provided in this release includes strong cipher suites. However, due to U.S. export control restrictions, this release does not allow alternate pluggable SSL/TLS implementations to be used. For more information, please see the JSSE Reference Guide.
With the integration of JAAS into the Java 2 SDK, the java.security.Policy API handles Principal-based queries, and the default policy implementation supports Principal-based grant entries. Access control can now be based not just on what code is running, but also on who is running it. Also, support for dynamic policies has been added. In Java 2 SDK releases prior to version 1.4, classes were statically bound with permissions by querying the security policy during class loading. The lifetime of this binding was scoped by the lifetime of the class loader. In version 1.4 this binding is deferred until needed by a security check, and the lifetime of the binding is scoped by the lifetime of the security policy. Finally, the graphical Policy Tool utility has been enhanced to enable specifying a Principal field indicating what user is to be granted specified access control permissions.
Notes:
There are three new tools to support Kerberos tickets. These tools help users obtain, list, and manage Kerberos tickets.
kinit - a tool for obtaining Kerberos v5 tickets.
klist - a command-line tool to list entries in the credential cache and key tab. Equivalent functionality is available on the Solaris operating environment via the klist tool.
ktab - a command-line tool to help the user manage entries in the key table.
Notes:
The JDBC 3.0 API, comprised of packages java.sql and javax.sql, provides universal data access from the Java programming language. Using the JDBC 3.0 API, you can access virtually any data source, from relational databases to spreadsheets and flat files. JDBC technology also provides a common base on which tools and alternative interfaces can be built. New features include the ability to set savepoints in a transaction, to keep result sets open after a transaction is committed, to reuse prepared statements, to get metadata about the parameters to a prepared statement, to retrieve keys that are automatically generated, and to have multiple result sets open at one time. There are two new JDBC data types, BOOLEAN and DATALINK, with the DATALINK type making it possible to manage data outside of a data source. This release also establishes the relationship between the JDBC Service Provider Interface and the Connector architecture.
Notes:
Enhancements since version 1.4.0:
DNS Service Provider
  Support for controlling timeouts when submitting UDP queries.
  Support for automatic discovery of DNS service.
LDAP Service Provider
  Support for connection pooling.
  Support for automatic discovery of LDAP service via DNS.
  Support for use of multiple URLs for configuration.
JVM Enhancements
Signal-chaining facility
Error-reporting mechanism
New command-line option for performing additional Java Native Interface (JNI) checks
New facility for logging garbage-collection events
Notes:
The Java virtual machines in this release include several enhancements.
Signal-chaining facility. Signal-chaining enables the Java Platform to better interoperate with native code that installs its own signal handlers.
  Support for pre-installed signal handlers when the HotSpot VM is created.
  Support for signal handler installation after the HotSpot VM is created, inside JNI code or from another native thread.
64-bit support on Solaris-SPARC platform edition.
Error-reporting mechanism. The information provided by the new error-reporting mechanism will allow developers to more easily and efficiently debug their applications. If an error message indicates a problem in the JVM code itself, it will allow a developer to submit a more accurate and helpful bug report.
New command-line option for performing additional Java Native Interface (JNI) checks.
Copyright IBM Corp. 2001, 2004 Appendix C. What's New in WebSphere Studio V5.1.1
New facility for logging garbage-collection events.
Rapid memory allocation and garbage collection. The virtual machine provides rapid memory allocation for objects and has a fast, efficient, state-of-the-art garbage collector.
The Classic virtual machine is no longer shipped as part of the Java 2 SDK.
AWT
New focus architecture
New full-screen exclusive mode API supports high-performance graphics by suspending the windowing system
Headless support is now enabled by methods that indicate whether a display, keyboard, and mouse can be supported in a graphics environment
Mouse wheel support
Now 64-bit compliant
Notes:
Changes to the AWT package center on improving the robustness, behavior, and performance of programs that present a graphical user interface. A new focus architecture replaces the previous implementation and addresses many focus-related bugs caused by platform inconsistencies and by incompatibilities between AWT and Swing components. The new full-screen exclusive mode API supports high-performance graphics by suspending the windowing system so that drawing can be done directly to the screen, a benefit to games and other rendering-intensive applications. Headless support is now enabled by new graphics environment methods that indicate whether a display, keyboard, and mouse can be supported in a graphics environment. The ability to disable native frame decorations is now available for applications that need to take full control of how a frame looks; when enabled, this prevents the rendering of a native title bar, system menu, border, or other native operating-system-dependent screen components. The oft-requested support for the mouse wheel (a scroll wheel in place of the middle mouse button) is provided by new built-in Java support for scrolling via the mouse wheel. Also, a new mouse wheel listener class allows customization of mouse wheel behavior. The AWT package has been modified to be fully 64-bit compliant and now runs on Solaris machines with 64-bit and 32-bit addresses.
Swing
New Spinner
  Allows a component user to select a number or a value by cycling through a sequence of values using a tiny pair of up/down arrow buttons
New formatted text field
  Component allows formatting of dates, numbers, and strings
New drag and drop architecture
  Provides seamless drag and drop support between components
Progress bar
  Uses constant animation to show that a time-consuming operation is occurring
Scrollable tabs
  Now supported in tabbed pane component
Popup and popup factory
  Classes exposed and made public
New Focus architecture
  Fully integrated into Swing
Notes:
Many new features have been added to Swing. The new spinner component is a single-line input field that allows the user to select a number or a value by cycling through a sequence of values using a tiny pair of up/down arrow buttons. The new formatted text field component allows formatting of dates, numbers, and strings, such as a text field that accepts only decimal money values. The Windows look and feel implementation is updated to track features available in the 2000/98 versions of Windows. A new drag and drop architecture provides seamless drag and drop support between components as well as an easy way to implement drag and drop in your customized Swing components: writing a couple of methods that describe the particulars of your data model is all that is required. Swing's progress bar component has been enhanced to support an indeterminate state; rather than showing the degree of completeness, the indeterminate progress bar uses constant animation to show that a time-consuming operation is occurring. Due to great customer demand, the tabbed pane component has been enhanced to support scrollable tabs. With this feature enabled, if all the tabs will not fit within a single tab run, the tabbed pane component displays a single, scrollable run of tabs instead of wrapping the tabs onto multiple runs. The popup and popup factory classes, which were previously package private, have been exposed and made public so that programmers may customize or create their own pop-ups. The new focus architecture is fully integrated into Swing.
Notes:
The Java Logging APIs facilitate software servicing and maintenance at customer sites by producing log reports suitable for analysis by end users, system administrators, field service engineers, and software development teams. The Logging APIs capture information such as security failures, configuration errors, performance bottlenecks, and/or bugs in the application or platform.
Logger: The main entity on which applications make logging calls. A Logger object is used to log messages for a specific system or application component.
LogRecord: Used to pass logging requests between the logging framework and individual log handlers.
Handler: Exports LogRecord objects to a variety of destinations including memory, output streams, consoles, files, and sockets. A variety of Handler subclasses exist for this purpose. Additional Handlers may be developed by third parties and delivered on top of the core platform.
Level: Defines a set of standard logging levels that can be used to control logging output. Programs can be configured to output logging for some levels while ignoring output for others.
Filter: Provides fine-grained control over what gets logged, beyond the control provided by log levels. The logging APIs support a general-purpose filter mechanism that allows application code to attach arbitrary filters to control logging output.
Formatter: Provides support for formatting LogRecord objects. This package includes two formatters, SimpleFormatter and XMLFormatter, for formatting log records in plain text or XML respectively. As with Handlers, additional Formatters may be developed by third parties.
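The way these six classes cooperate can be sketched in a few lines. The following is only an illustration, not code from the course: the class name LoggingSketch, the logger name com.example.orders, the in-memory ListHandler, and all messages are hypothetical. It uses a lambda for the Filter, so it assumes Java 8 or later rather than the JDK 1.4.1 level discussed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Filter;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical sketch: every name and message below is illustrative.
final class LoggingSketch {

    /** A Handler that exports LogRecords to an in-memory list. */
    static final class ListHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            // isLoggable applies this handler's Level and its Filter.
            if (isLoggable(record)) {
                // formatMessage substitutes parameters such as {0}.
                lines.add(record.getLevel() + ": " + getFormatter().formatMessage(record));
            }
        }

        @Override public void flush() { }
        @Override public void close() { }
    }

    static List<String> run() {
        Logger logger = Logger.getLogger("com.example.orders"); // assumed component name
        logger.setUseParentHandlers(false); // keep records away from the console handler
        logger.setLevel(Level.INFO);        // Level: ignore anything below INFO

        ListHandler handler = new ListHandler();
        handler.setFormatter(new SimpleFormatter());
        // Filter: finer-grained control than levels alone.
        Filter noHeartbeats = record -> !record.getMessage().contains("heartbeat");
        handler.setFilter(noHeartbeats);
        logger.addHandler(handler);

        logger.fine("below the logger's level, so discarded");
        logger.info("heartbeat");                    // rejected by the filter
        logger.warning("configuration file missing");
        logger.log(Level.SEVERE, "login failed for {0}", "jsmith");

        logger.removeHandler(handler);
        return handler.lines;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The Logger discards the FINE record before any Handler sees it, while the INFO record reaches the Handler and is dropped by its Filter. That division of labor is the main design point of the API.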
Java Debugging
Full Speed Debugging Support
HotSwap allows a class to be updated while under the control of a debugger
Instance Filters
Support For Debugging Other Languages
VMDeathRequests
Notes:
Full Speed Debugging Support. In previous versions, when debugging was enabled, the program executed using only the interpreter. Now, the full performance advantage of the HotSpot VM is available to programs running with debugging enabled. The improved performance allows long-running programs to be debugged more easily. It also allows testing to proceed at full speed and the launch of a debugger to occur on an exception.
HotSwap has been added to allow a class to be updated while under the control of a debugger.
Instance Filters. EventRequests now have the capability of specifying an instance filter, which restricts the events generated by the request to those in which the currently executing instance is the object specified.
Support for Debugging Other Languages. The Java Platform Debugger Architecture has been extended so that non-Java programming language source, which is translated to Java programming language source, can be debugged in the future.
VMDeathRequests. A request can now be made to control target VM termination notification, allowing clean shutdown synchronization. Using class VMDeathRequest, a request can be made for notification when the target VM terminates. When an enabled VMDeathRequest is satisfied, an EventSet containing a VMDeathEvent will be placed on the EventQueue.
RMI
RMI
  Server-side Stack Traces Now Retained in Remote Exceptions
  Service Provider Interface for RMIClassLoader
  Dynamic Server Host Name
Serialization
  Support for deserialization of objects that are known to be unshared in the data-serialization stream
  Support for a class-defined readObjectNoData method
  Important bug fixes
Notes:
Server-side Stack Traces Now Retained in Remote Exceptions. The RMI runtime implementation now preserves the server-side stack trace information of an exception that is thrown from a remote call, in addition to filling in the client-side stack trace as it did in previous releases. Therefore, when such an exception becomes accessible to client code, its stack trace contains all of its original server-side trace data followed by the client-side trace.
Service Provider Interface for RMIClassLoader. Certain static methods of java.rmi.server.RMIClassLoader now delegate their behavior to an instance of a new service provider interface, java.rmi.server.RMIClassLoaderSpi. This service provider object can be configured to augment RMI's dynamic class loading behavior for a given application. By default, the service provider implements the standard behavior of all of the static methods in RMIClassLoader.
Javadoc 1.4.1
New Features
  Added package and class names as keywords in <META> tags. This should improve search results for search engines that look at <META> tags.
Bug Fixes
  About two dozen bug fixes
Running Javadoc
Error/Warning Messages
Command Line Options
Tags
Miscellaneous
Notes:
Fixed bug where it mistakenly documented .class files found on classpath (if they belonged to packages passed in on the command line). Fixed -use option, which was severely broken. Fixed -link option to handle absolute paths. Fixed -encoding option for reading source files. Fixed {@docRoot} which had been inserting an extra slash /. Fixed {@inheritDoc}, which did not work. Added interface constants to the Constant Field Values list.
Notes:
There are some enhancements to all the areas listed on this chart. For additional details on each of these features, you can refer to the Java 2 SDK, Standard Edition, version 1.4.1 information online at http://java.sun.com/j2se/1.4.1/docs/relnotes/features.html
Unit Summary
New Rapid Application Development Tools
Support of Java Server Faces
Build and manage Web site development rather than page development
Numerous additional improvements and enhancements present in WSAD v5.1.1 and JDK 1.4.1
Notes:
WebSphere Studio v5.1.1 offers a number of new features, with the focus around new Rapid Application Development with Java Server Faces. There are also new Web site tool enhancements and noticeable performance improvements. You will also find many smaller enhancements which have been added throughout the product.
Welcome to:
Introduction to XML
Notes:
In the example:
The id attribute is a unique ID but is not required.
The isbn attribute is required, but has no default value.
The booktype attribute is always present after parsing: when it is omitted, a validating parser supplies "Paperback" from the default value declared in the DTD.
The storeloc attribute is not required. If it is not present, it defaults to "5th Avenue"; here, however, the default was overridden by assigning the value "Times Square".
The year attribute gets its fixed value from the DTD.
No comment attribute was included; since this is an implied attribute, its use is optional. However, if it were used, it would change the start tag to <book isbn="1-56592-709-5" booktype storeloc="Times Square" year comment="A handy XML pocket reference.">
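A parser's handling of defaulted, fixed, and implied attributes can be observed through the DOM. The sketch below is a reconstruction under assumptions, since the DTD from the chart is not reproduced in these notes: the enumerated type for booktype, the value "Hardcover", and the fixed year "2001" are invented for illustration, as is the class name AttributeDefaultsSketch.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: the DTD is assumed; the #FIXED year value is invented.
final class AttributeDefaultsSketch {
    static final String XML =
        "<!DOCTYPE book [\n"
      + "  <!ELEMENT book EMPTY>\n"
      + "  <!ATTLIST book\n"
      + "    id       ID    #IMPLIED\n"
      + "    isbn     CDATA #REQUIRED\n"
      + "    booktype (Paperback|Hardcover) 'Paperback'\n"
      + "    storeloc CDATA '5th Avenue'\n"
      + "    year     CDATA #FIXED '2001'\n"
      + "    comment  CDATA #IMPLIED>\n"
      + "]>\n"
      + "<book isbn='1-56592-709-5' storeloc='Times Square'/>";

    static Element parseBook() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // validate against the DTD in the DOCTYPE
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // treat validity errors as failures
                }
            });
            return builder.parse(new InputSource(new StringReader(XML))).getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element book = parseBook();
        for (String name : new String[] {"isbn", "booktype", "storeloc", "year", "comment"}) {
            System.out.println(name + ": present=" + book.hasAttribute(name)
                    + ", value=\"" + book.getAttribute(name) + "\"");
        }
    }
}
```

An application can tell a defaulted attribute from one written in the document with getAttributeNode(name).getSpecified(), which returns false for values supplied from the DTD.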
Notes:
In order to understand attribute value types, you need to understand how attribute values are processed by an XML processor. When an XML processor processes a document, it performs two operations on the values of attributes:
1. Attribute value normalization
  a. Replace ENTITY references with their replacement text (more on ENTITYs later).
  b. Convert each space, tab, CR, and LF character to a space.
2. Whitespace crunching
  The processor trims leading and trailing whitespace and replaces multiple #x20's (spaces) by a single one. The XML 1.0 specification applies this second step only to attributes whose declared type is not CDATA.
*NMTOKEN (NaMeTOKEN) or Nmtoken are but two examples of a tokenized type. This is discussed in greater detail later.
The URL is: http://www.w3.org/TR/1998/REC-xml-19980210
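The two operations can be seen by parsing the same kind of messy value under two different declared types. This is a minimal sketch with a hypothetical item element and made-up label and codes attributes. It relies on the XML 1.0 rule that leading, trailing, and repeated spaces are removed only when the declared type is not CDATA.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch: element and attribute names are illustrative only.
final class AttributeNormalizationSketch {
    // \t and \n below are a real tab and a real line feed inside the attribute values.
    static final String XML =
        "<!DOCTYPE item [\n"
      + "  <!ELEMENT item EMPTY>\n"
      + "  <!ATTLIST item label CDATA    #REQUIRED\n"
      + "                 codes NMTOKENS #REQUIRED>\n"
      + "]>\n"
      + "<item label='  big\t\tbox  ' codes='  a1   b2\n c3 '/>";

    static String attribute(String name) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The document is valid, so this sketch installs no error handler.
            factory.setValidating(true);
            Element item = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)))
                    .getDocumentElement();
            return item.getAttribute(name);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Brackets make leading and trailing spaces visible.
        System.out.println("label=[" + attribute("label") + "]");
        System.out.println("codes=[" + attribute("codes") + "]");
    }
}
```

For the CDATA attribute each tab becomes a space but nothing is trimmed; for the NMTOKENS attribute the value is also trimmed and collapsed to single spaces between tokens.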
Notes:
The URL for ISO/IEC 10646 is http://www.w3.org/TR/199/REC-xml-19980210#ISO 10646
Look at these terms as part of the vocabulary of XML. The precision is necessary, in part, because of the international nature of XML. The 1998 XML 1.0 specification literally defines whitespace as S, where S ::= (#x20 | #x9 | #xD | #xA)+; the | represents "or" and the + means "at least one". These terms permeate the XML vocabulary. We'll have more to say about them throughout this course. The specification uses these categories to catalog attribute types. The categories and their contents are covered on subsequent charts.
Declaration:
<!ELEMENT box EMPTY>
<!ATTLIST box height CDATA #REQUIRED>
<!ATTLIST box width CDATA #REQUIRED>
or
<box height="#32" width="1#@%^ 2"/>
Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee serialNumber NMTOKEN #REQUIRED>
Notes:
This foil shows a declaration for a required attribute of type NMTOKEN. The valid example is valid because all the serialNumber values are proper name tokens; that they are identical is irrelevant. The first invalid example is invalid because # is not a valid name-token character, so #00001 is not a name token. The second invalid example is invalid because of the whitespace between the first and second characters.
Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee serialNumber NMTOKENS #REQUIRED>
Notes:
This foil shows a declaration for a required attribute of type NMTOKENS. The valid example is valid because all the serialNumber values are proper name tokens; that they are identical is irrelevant. The difference between this chart and the previous chart is that the plural nature of NMTOKENS now permits interspersing the name tokens with whitespace. The invalid example is invalid because # is not a valid name-token character.
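The NMTOKEN and NMTOKENS rules from the last two charts can be checked with a validating parse. In this sketch the helper isValid and the class name NameTokenSketch are hypothetical, and the sample values are stand-ins for the chart examples, which are not reproduced in these notes. The declared type is substituted into the same DTD used on the charts.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: isValid and the sample values are illustrative.
final class NameTokenSketch {

    /** Returns true when serialNumber is valid for the given declared type. */
    static boolean isValid(String attributeType, String serialNumber) {
        String xml =
            "<!DOCTYPE employee [\n"
          + "  <!ELEMENT employee (#PCDATA)>\n"
          + "  <!ATTLIST employee serialNumber " + attributeType + " #REQUIRED>\n"
          + "]>\n"
          + "<employee serialNumber='" + serialNumber + "'>Joe Smith</employee>";
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // surface validity errors as exceptions
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed or not valid
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("NMTOKEN", "00001"));        // digits alone form a name token
        System.out.println(isValid("NMTOKEN", "#00001"));       // # is not a name-token character
        System.out.println(isValid("NMTOKEN", "0 0001"));       // embedded whitespace
        System.out.println(isValid("NMTOKENS", "00001 00002")); // whitespace separates tokens
    }
}
```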
Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee serialNumber ID #REQUIRED>
The rules for XML names require that the first character not be a digit.
Valid XML fragment:
<employee serialNumber="e00001">Joe Smith</employee>
<employee serialNumber="e00002">Bill Smith</employee>
<employee serialNumber="_00002">John Smith</employee>
Notes:
This foil shows a declaration for a required attribute of type ID. According to the syntax rules for IDs, an ID must be an XML name, so a bare number cannot be an ID. That is why the serialNumber values begin with a letter or an underscore. The valid example is valid because all the serialNumber values are distinct, even though the second and third values have the same numerical part. Arguably the numerical overlap may be unintentional, perhaps resulting from the merger of two departments that each used a similar XML vocabulary. The invalid example is invalid because the first two employees have the same serialNumber value and the third employee's serial number starts with an illegal character.
Declaration:
<!ELEMENT employee (#PCDATA)>
<!ATTLIST employee serialNumber ID #REQUIRED>
<!ATTLIST employee manager1 IDREF #IMPLIED>
<!ATTLIST employee manager2 IDREF #IMPLIED>
Notes:
This foil shows a declaration for an implied attribute of type IDREF. According to the syntax rules for IDs, an ID must be an XML name, so a bare number cannot be an ID. That is why the serialNumber values begin with a letter. Aside from naming rules, managerN could have any value as long as some element in the document has an ID attribute with that value. Consequently, an employee could be self-managed! The uniqueness constraint applies to IDs, not to IDREFs, so the employee could be self-managed twice: both manager1 and manager2 could have the same value.
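The ID and IDREF constraints can be exercised the same way. The sketch below assumes a wrapper element named staff, which is not on the chart and is added here only so that one document can hold several employee elements; the class name IdRefSketch and the helper isValid are likewise hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: the staff wrapper element is an assumption.
final class IdRefSketch {

    /** Validates the given employee elements against the ID/IDREF declarations. */
    static boolean isValid(String employees) {
        String xml =
            "<!DOCTYPE staff [\n"
          + "  <!ELEMENT staff (employee*)>\n"
          + "  <!ELEMENT employee (#PCDATA)>\n"
          + "  <!ATTLIST employee serialNumber ID #REQUIRED>\n"
          + "  <!ATTLIST employee manager1 IDREF #IMPLIED>\n"
          + "  <!ATTLIST employee manager2 IDREF #IMPLIED>\n"
          + "]>\n"
          + "<staff>" + employees + "</staff>";
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // surface validity errors as exceptions
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed or not valid
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Self-managed twice: IDREF values need not be unique.
        System.out.println(isValid(
            "<employee serialNumber='e00001' manager1='e00001' manager2='e00001'>Joe Smith</employee>"));
        // Duplicate ID values violate the uniqueness constraint.
        System.out.println(isValid(
            "<employee serialNumber='e00001'>Joe Smith</employee>"
          + "<employee serialNumber='e00001'>Bill Smith</employee>"));
    }
}
```

An IDREF that matches no ID in the document is only detectable once the whole document has been read, so a parser reports that particular error at the end of the parse.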
Declaration:
<!ELEMENT shirt (#PCDATA)>
<!ATTLIST shirt size (small|medium|large) #REQUIRED>
Valid XML:
<shirt size="small">plaid polyester</shirt>
<shirt size="large">white poplin</shirt>
Invalid XML:
<shirt size="XXL">navy pullover</shirt>
Notes:
This foil shows an example of an attribute with an enumerated value. Valid documents may use only the values that appear in the declared list. The valid examples both take their size values from the list of small, medium, or large. The invalid example uses XXL, which is not in the list.
What Is an Entity?
"Reasonable people may disagree: many of us believe...that entities are simply slightly-constrained storage units with no other special significance..." David Megginson
An entity can be internal (to the XML instance before you) or external (to the XML instance before you)
  If it is external, that is a good reason to use the standalone attribute in the XML Declaration statement (standalone="no")
An entity can be a parsed or unparsed entity
parsed entities
  Reference textual content only
  Get replaced by the actual content during parsing
  Can be used anywhere in the document
unparsed entities
  Can reference any type of content
  Must be associated with a NOTATION
  Parser only passes entity and notation data to the application
  Can only be used as an attribute value, not valid as element content
Notes:
David Megginson's work in XML is well-known. Another common analogy likens an entity to a macro, on the one hand, and a constant, on the other hand. This arguably better captures the internal/external duality of entities. Go back to the ENTITYattribute.xml example and see what error you receive if you delete the NDATA declaration from the ENTITY statement. (This breaks the tie to the NOTATION statement.) The following charts further develop these concepts.
Usage: To use an ENTITY, place a reference to the entityName in the XML document. References to the entityName are replaced with the replacement text. ENTITYs can be declared internal to the file or external using a URI in place of the "replacementText".*
<!ENTITY entityName SYSTEM "URI">
Notes:
An XML document can be composed of many storage units. These storage units are called entities. The document is composed by combining all of the storage units. There are different names and rules for entities that are used in the document proper and for entities that appear in the DTD. The next group of charts examines the various combinations available in a DTD to introduce additional information.
Parsed Entities
Parsed entities
  Reference textual content only
  Get replaced by the actual content during parsing
  Can be used anywhere in the document
Notes:
Generally speaking, an entity is a reference to content. The entity reference is really just a placeholder for the content it refers to. When the parser replaces the entity name with the replacement text, it does the replacement, and then continues parsing as normal. This means that the replacement text is actually parsed AFTER the entity replacement is done. If your entity content contains XML markup, then it must be well-formed (and valid if doing a validating parse). If the replacement text contains the characters < or & as data, they must be escaped. As a best practice, >, ', and " should also be escaped. Caution: when escaping characters, always be careful not to lose the ; because of its small size relative to other characters.
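The "replace, then keep parsing" behavior can be observed directly. In this sketch the memo document, the entity names company and sig, and the class name ParsedEntitySketch are all hypothetical. One entity contains an escaped ampersand, and the other contains markup plus a reference to the first entity.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch: document, entity names, and text are illustrative.
final class ParsedEntitySketch {
    static final String XML =
        "<!DOCTYPE memo [\n"
      + "  <!ENTITY company 'Example Widgets &amp; Co.'>\n"
      + "  <!ENTITY sig '<signature>Sent by &company;</signature>'>\n"
      + "]>\n"
      + "<memo>Regards, &sig;</memo>";

    static Element parseMemo() {
        try {
            // A non-validating parse is enough: internal entities are still expanded.
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element memo = parseMemo();
        // The markup inside the replacement text was parsed into a real element.
        System.out.println(memo.getElementsByTagName("signature").getLength());
        System.out.println(memo.getTextContent());
    }
}
```

Because the replacement text is parsed after substitution, the signature element exists in the resulting tree even though it never appears literally in the document body.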
Built-in Entities
The five built-in character entities in XML:
entity    description      character
&lt;      "less than"      <
&gt;      "greater than"   >
&quot;    "quote"          "
&apos;    "apostrophe"     '
&amp;     "ampersand"      &
Examples:
&#32; = space
&#x20; = space
&#960; = π
&#x03c0; = π
Notes:
XML also contains five built-in entities used to refer to characters that are reserved by XML. Without the built-in entities, if you write a "<" (less than sign) in a document, the processor may be unable to determine whether you are trying to start a tag or not. There are places where you can use the actual character instead of the built-in entity, but we recommend always using the built-in entity, except in CDATA sections, to avoid problems. As a best practice, these entities should be defined in your DTD, and any use of the characters should be escaped. In practice, however, only the & always needs to be escaped. In attribute values, the " and ' characters also need to be escaped when they match the quote that delimits the value, and the < must be escaped as well. In element content, the < also needs to be escaped. This is a good place to mention character references as well. XML is based on Unicode, and there are more characters in the Unicode character set than there are on most keyboards in the world. In order to allow you to enter any Unicode character, XML provides a mechanism for character references. Character references look very similar to entity references, except that they begin with "&#" instead of "&". You can specify the Unicode character you want by providing the base 10 number of the character between the &# and the ;.
Copyright IBM Corp. 2001, 2004 Appendix D. Additional Information and Examples D-15
By using &#x instead of &#, you can provide a hexadecimal number instead of a decimal number. When the XML processor encounters the character reference, it inserts the proper Unicode character. Decimal 960 is the Greek letter pi, used as the math constant (3.141...); x03c0 is its hexadecimal equivalent.
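Character references and the five built-in entities are resolved by the parser before the application sees the text. A minimal sketch, using a made-up sample element and the hypothetical class name CharacterReferenceSketch:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch: the sample element and its content are illustrative.
final class CharacterReferenceSketch {
    // Decimal and hexadecimal references to pi, then the five built-in entities.
    static final String XML =
        "<sample>&#960; &#x3c0; &lt;tag&gt; &amp; &quot;q&quot; &apos;a&apos;</sample>";

    static String text() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)))
                    .getDocumentElement()
                    .getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = text();
        // Print the code point of the first character in hexadecimal.
        System.out.println(Integer.toHexString(text.codePointAt(0)));
        System.out.println(text);
    }
}
```

Both spellings of the reference produce the same single character in the parsed text, so an application cannot tell which form the author typed.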
Notes:
XML provides notations as a way of describing unparsed data - frequently this is data that is not XML, such as binary data. Notation names appear as the value of NOTATION valued attributes and as part of the specification of an unparsed entity. The declaration for a notation follows in the next chart. Along with the name of the notation, this is really the only support for processing notation data.
Definition: Unparsed entities
  Can reference any type of content
  Must be associated with a NOTATION
  Parser only passes entity and notation data to the application
  Can only be used as an attribute value, not valid as element content
  Always external to the document
Syntax:
<!ENTITY entityName SYSTEM "systemURI" NDATA notationName>
<!ENTITY entityName PUBLIC "publicURI" "systemURI" NDATA notationName>
Notes:
Another way to declare an unparsed entity is <!ENTITY entityName PUBLIC "publicURI" "systemURI" NDATA notationName>. In this case, the parser must understand the publicURI. When processing an unparsed entity or notation, the parser does not actually go out and retrieve the information. Rather, it passes the entity and/or notation information to the application. It is the application's responsibility to determine how to use that information to process the unparsed entity data. An unparsed entity is always external to the document. Unlike the other declaration structures, when using a notation that references a public URI, a system URI does not always need to be provided. This is because the notation information is not actually processed by the parser. It is meant only as a hint for the application to use in order to process the associated data. One way to look at this is that a notation provides type information, and an unparsed entity references an external file of that type.
An example follows.
Valid XML:
<consultant companyAtts="company location" resume="emp1.pdf" resumefrmt="pdf">Scott Karabin </consultant>
Notes:
Here is an example. In practice, each ENTITY declaration would most likely appear before its associated NOTATION. However, in a large, complex DTD with many references to unparsed entities (for example, a library-type application), it is advisable to put all the NOTATIONs in a separate section and use comments to mark the section.
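How an application receives entity and notation information from the parser can be sketched with the DOM. The declarations below are assumptions, since the DTD from the chart is not reproduced in these notes: the notation system identifier acroread.exe, the entity path resumes/emp1.pdf, and the class name UnparsedEntitySketch are invented, and only the resume attribute of the consultant element is modeled.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Entity;
import org.w3c.dom.Notation;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch: the declarations and identifiers are assumed.
final class UnparsedEntitySketch {
    static final String XML =
        "<!DOCTYPE consultant [\n"
      + "  <!NOTATION pdf SYSTEM 'acroread.exe'>\n"
      + "  <!ENTITY emp1.pdf SYSTEM 'resumes/emp1.pdf' NDATA pdf>\n"
      + "  <!ELEMENT consultant (#PCDATA)>\n"
      + "  <!ATTLIST consultant resume ENTITY #REQUIRED>\n"
      + "]>\n"
      + "<consultant resume='emp1.pdf'>Scott Karabin</consultant>";

    static Document parse() {
        try {
            // The parser records the declarations; it never fetches the PDF itself.
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Follows the ENTITY-typed attribute to the name of its notation. */
    static String notationNameFor(String attributeName) {
        Document doc = parse();
        String entityName = doc.getDocumentElement().getAttribute(attributeName);
        Entity entity = (Entity) doc.getDoctype().getEntities().getNamedItem(entityName);
        return entity.getNotationName();
    }

    public static void main(String[] args) {
        Document doc = parse();
        String notationName = notationNameFor("resume");
        Notation notation = (Notation) doc.getDoctype().getNotations().getNamedItem(notationName);
        System.out.println("notation: " + notationName);
        System.out.println("helper hint: " + notation.getSystemId());
    }
}
```

What the application does with the notation's identifier, such as launching a viewer, is entirely its own decision; the parser only reports the names and identifiers.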
Valid XML:
<person>
  <name>Kelly Brown</name>
  <picture filename="kbrown.jpg" picformat="jpeg"/>
</person>
Notes:
In the example, the notation-typed picformat attribute specifies "jpeg" as its value. An application can ask the XML parser for the helper application associated with the jpeg notation, and will get back the "photoshop.exe" value, which it can then use in processing the data. Again, the parser would perform no validation; it simply passes on the attribute's value. It is up to the application to handle the information it has received. The NOTATION-typed attribute does not provide the actual unparsed data; it provides information on how to process the unparsed data.
Notes:
For additional information as a schema component, see http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/ Section 2.2.1.2 Simple Type Definition. For detailed information on simple type definitions, see http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/ Simple Type Definitions (3.14) and [XML Schemas: Datatypes]. The latter also defines an extensive inventory of predefined simple types.
Notes:
Additional information is in http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/ Section 2.5.3 "Built-in versus user-derived datatypes."
Applicable Facets
Two categories of simple types from the XML perspective:
  Ordered
  Non-ordered
These facets apply to both:
  length, minLength, maxLength, pattern, enumeration, whiteSpace
The ordered simple types are:
  byte, unsignedByte, integer, positiveInteger, negativeInteger, nonNegativeInteger, nonPositiveInteger, int, unsignedInt, long, unsignedLong, short, unsignedShort
  decimal, float, double
  time, dateTime, duration, date, gMonth, gYear, gYearMonth, gDay, gMonthDay
These facets apply only to ordered simple types:
  maxInclusive, maxExclusive, minInclusive, minExclusive, totalDigits, fractionDigits
Notes:
You are advised to refer to the Specifications and Primer for examples of how to implement facets. Our examples are necessarily limited by time and space.
Mixed
Add mixed="true" attribute on complexType element and then declare as with Element Only content model
<xsd:complexType name="mixedType" mixed="true">
  <xsd:sequence>
    <xsd:element name="firstName" type="xsd:string"/>
    <xsd:element name="lastName" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>
Notes:
The Element Only content model can contain only elements as children. To declare this content model, use a model group or compositor as the child of the <complexType> element. For Mixed content (elements plus character data), add the mixed="true" attribute on complexType and provide a model group or compositor. The Mixed model in Schema is a little different from mixed content in a DTD: the order and number of child elements matter.
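The point that order and number still matter under mixed="true" can be checked with the javax.xml.validation API. That API arrived in Java 5, so it is newer than the JDK 1.4.1 level covered elsewhere in this appendix. In this sketch the global element greeting, the class name MixedContentSketch, and the instance text are hypothetical; only the mixedType definition comes from the chart.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical sketch: the greeting element and instance text are assumptions.
final class MixedContentSketch {
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>\n"
      + "  <xsd:element name='greeting' type='mixedType'/>\n"
      + "  <xsd:complexType name='mixedType' mixed='true'>\n"
      + "    <xsd:sequence>\n"
      + "      <xsd:element name='firstName' type='xsd:string'/>\n"
      + "      <xsd:element name='lastName' type='xsd:string'/>\n"
      + "    </xsd:sequence>\n"
      + "  </xsd:complexType>\n"
      + "</xsd:schema>";

    /** Returns true when the instance is valid against the schema above. */
    static boolean isValid(String instance) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            // With no error handler set, validate throws on the first validity error.
            schema.newValidator().validate(new StreamSource(new StringReader(instance)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Character data may appear between the children...
        System.out.println(isValid(
            "<greeting>Dear <firstName>Kelly</firstName> <lastName>Brown</lastName>, welcome.</greeting>"));
        // ...but the children must still follow the declared sequence.
        System.out.println(isValid(
            "<greeting>Dear <lastName>Brown</lastName> <firstName>Kelly</firstName></greeting>"));
    }
}
```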
Notes:
For additional information as a schema component, see http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/ Section 2.2.1.3 Complex Type Definition. For detailed information on complex type definitions, see http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/ Complex Type Definitions (3.4).
If you use this form, you will use a name= attribute. For example, in the po2.xsd file: <xsd:complexType name="Address">. In the file we added abstract="false", but from the above you can see that this is the default; it could be omitted.
Notes:
Any time you see language like "this is the specification. . ." it means that whatever immediately follows is directly from the W3C XML Schema specification. We used bolding to draw attention to the two-part nature of this specification: The top part, which ends with the > at the end of the 8th line, would include the attributes associated with this complexType. The remainder carries the content, which can include subelements, attributes, and the other choices shown. When a complexType is declared "abstract", it cannot be used in an instance document. If you change the value of abstract in <xsd:complexType name="Address" abstract="false"> to "true", you will receive an error, but not until you try to validate the associated instance document. And then the complaint will be about billTo. Even though there is no element in the instance by the name of Address, its definition is used to define the contents of the billTo element.
Reverting the value to "false" and saving the xsd file does NOT clear the error in the .xml file: the error will not be cleared until you rerun the validator on the instance file (or make some insignificant change like adding, then removing, a blank space and then "save" the result).
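The same behavior can be reproduced outside Studio with the javax.xml.validation API (Java 5 or later). The sketch below is not the course's po2.xsd: it is a cut-down, hypothetical schema in which only the names Address and billTo come from the notes, and the street and city children, the sample instance, and the class name AbstractTypeSketch are invented.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical sketch: a simplified stand-in for po2.xsd, not the real file.
final class AbstractTypeSketch {
    static final String INSTANCE =
        "<billTo><street>1 Main St</street><city>Armonk</city></billTo>";

    static String schemaText(boolean isAbstract) {
        return "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>\n"
             + "  <xsd:element name='billTo' type='Address'/>\n"
             + "  <xsd:complexType name='Address' abstract='" + isAbstract + "'>\n"
             + "    <xsd:sequence>\n"
             + "      <xsd:element name='street' type='xsd:string'/>\n"
             + "      <xsd:element name='city' type='xsd:string'/>\n"
             + "    </xsd:sequence>\n"
             + "  </xsd:complexType>\n"
             + "</xsd:schema>";
    }

    /** Compiles the schema; returns null when the schema itself is rejected. */
    static Schema compile(boolean isAbstract) {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(schemaText(isAbstract))));
        } catch (SAXException e) {
            return null;
        }
    }

    /** Validates the instance document against the compiled schema. */
    static boolean instanceValidates(boolean isAbstract) {
        try {
            compile(isAbstract).newValidator()
                    .validate(new StreamSource(new StringReader(INSTANCE)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("schema compiles when abstract: " + (compile(true) != null));
        System.out.println("instance valid, abstract=false: " + instanceValidates(false));
        System.out.println("instance valid, abstract=true: " + instanceValidates(true));
    }
}
```

The schema is accepted either way, which matches the observation that the error appears only when the instance document is validated.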
Notes:
Any reference to "Studio does this or that" indicates that whatever follows came from either the Help plug-in or the help built-in to the source editor(s). Recall that an NCName is a "non-colonized" name. That is, no :. A complexType that has no name= is known as an anonymous complex type definition.
Declaration Components
There are three kinds of declaration component:
  Element
  Attribute
  Notation
Each is described in a section that follows. Also included is a discussion of element substitution groups. This is a feature provided in conjunction with element declarations.
Recall from the chart that categorized schema components:
  Element and attribute declarations are considered primary schema components
  Notation is a secondary schema component
Notes:
This is covered in Section 2.2.2 in Part 1 of the Specification.
Notes:
Part 1 of the Specification, Section 2.2.2.1, Element Declaration. For detailed information on element declarations, see Element Declarations (3.3).
Notes:
For detailed information on element declarations, see Element Declarations (3.3) in the specification.
How these information items are employed depends on where they occur in the schema . . .:
Copyright IBM Corporation 2004
Global Element (1 of 2)
An element that has been declared directly under schema (that is, under the root element) and not as part of a complex type definition.
This is how Studio sees it.
Notice that Studio uses two views to help us
Notes:
Once we have defined a global element we can refer to it (ref=) as often as necessary.
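A minimal sketch of a global element that is declared once and referenced with ref= from two places. The element names comment, order, invoice, item, and total are hypothetical and chosen only for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Global element: a direct child of xsd:schema -->
  <xsd:element name="comment" type="xsd:string"/>

  <xsd:element name="order">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Local element: declared inside a complex type definition -->
        <xsd:element name="item" type="xsd:string" maxOccurs="unbounded"/>
        <!-- Reference to the global element declared above -->
        <xsd:element ref="comment" minOccurs="0"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <xsd:element name="invoice">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="total" type="xsd:decimal"/>
        <!-- The same global element, referenced a second time -->
        <xsd:element ref="comment" minOccurs="0"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

Note that occurrence constraints such as minOccurs are written on the reference, not on the global declaration itself.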
Global Element (2 of 2)
Additional information (and hints) can be obtained by using a combination of the hover help and opening a pick list.
Hover help: here it defines the term associated with the highlighted item.
Drop-down pick list: identifies the legal choices.
Notes:
Refer to http://www.w3.org/TR/xmlschema-1/#cIdentity-constraint_Definitions for additional details.
3.3 Particles
As described in Model Groups, particles contribute to the definition of content models. Example. XML representations which all involve particles, illustrating some of the possibilities for controlling occurrence:
<xsd:element ref="egg" minOccurs="12" maxOccurs="12"/>
<xsd:group ref="omelette" minOccurs="0"/>
<xsd:any maxOccurs="unbounded"/>
3.4 Wildcards
In order to exploit the full potential for extensibility offered by XML plus namespaces, more provision is needed than DTDs allow for targeted flexibility in content models and attribute declarations. A wildcard provides for validation of attribute and element information items dependent on their namespace name, but independently of their local name. Example. XML representations of the four basic types of wildcard, plus one attribute wildcard:
<xsd:any processContents="skip"/>
<xsd:any namespace="##other" processContents="lax"/>
<xsd:any namespace="http://www.w3.org/1999/XSL/Transform"/>
<xsd:any namespace="##targetNamespace"/>
<xsd:anyAttribute namespace="http://www.w3.org/XML/1998/namespace"/>
Notes:
We also need to tell the schema processor that the default form for element names is that they be qualified. We do this using the elementFormDefault attribute.
By setting the attributeFormDefault to unqualified we are explicitly stating we recognize that attributes do not need to be namespace-qualified.
Notes:
We also need to tell the schema processor what the default form for attribute names is. We do this using the attributeFormDefault attribute. If we do nothing, attributes will not need to be qualified. If we set it to unqualified, we are indicating that we have thought about it and made this decision deliberately. You can test this in the lab exercise for this lecture.
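The two defaults can be seen side by side in a minimal schema sketch. The target namespace http://www.example.com/po and the names purchaseOrder, comment, and orderDate are assumptions made for illustration only.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:po="http://www.example.com/po"
            targetNamespace="http://www.example.com/po"
            elementFormDefault="qualified"
            attributeFormDefault="unqualified">

  <xsd:element name="purchaseOrder">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Local element: must be namespace-qualified in the instance
             because elementFormDefault is qualified -->
        <xsd:element name="comment" type="xsd:string"/>
      </xsd:sequence>
      <!-- Local attribute: written without a prefix in the instance
           because attributeFormDefault is unqualified -->
      <xsd:attribute name="orderDate" type="xsd:date"/>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
<!-- A matching instance under these assumptions:
     <po:purchaseOrder xmlns:po="http://www.example.com/po" orderDate="2004-01-15">
       <po:comment>Rush order</po:comment>
     </po:purchaseOrder>
-->
```

Both attributes default to unqualified when omitted, so writing attributeFormDefault="unqualified" changes nothing for the processor; it only documents the decision.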
Redefinition of imported types
- You want to use some, but not all, of a set of imported types.
Nil Values
- When you need to represent the absence of something as a value itself.
- When migrating from a database with null values.
Uniqueness / Identity Constraints
- You need to specify that a set of things is unique.
- You need to make sure that a set of things is keyed to a second set of unique items.
Notes:
Here is a list of the major topics that we didn't have time to cover in this course.
Type derivation: XML Schema supports programming language-like inheritance.
Redefinition of imported types: When importing types from another namespace, it is also possible to redefine or extend via inheritance the types being imported.
Nil Values: As we alluded to when we talked about the XML Schema Instance namespace, XML Schema supports a notion of nil (null) values.
Uniqueness / Identity Constraints: XML Schema allows the specification of uniqueness and identity constraints using a subset of the XPath specification.
In addition, we were unable to cover all the details of the features that we presented today.
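Although these topics are outside the scope of the course, two of them (nil values and uniqueness constraints) are compact enough to sketch. The element and attribute names parts, part, shipDate, and number are hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="parts">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="part" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <!-- nillable: the instance may carry xsi:nil="true" on an
                   empty shipDate instead of supplying a date -->
              <xsd:element name="shipDate" type="xsd:date" nillable="true"/>
            </xsd:sequence>
            <xsd:attribute name="number" type="xsd:string" use="required"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>

    <!-- Uniqueness constraint: no two part children of parts may share
         the same number attribute. The selector and field use the
         restricted XPath subset mentioned in the notes. -->
    <xsd:unique name="uniquePartNumber">
      <xsd:selector xpath="part"/>
      <xsd:field xpath="@number"/>
    </xsd:unique>
  </xsd:element>

</xsd:schema>
```

In an instance document, the xsi prefix used for xsi:nil must be bound to http://www.w3.org/2001/XMLSchema-instance.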
Status
XMLSchema V1.0 - W3C Recommendation as of 5/2/2001
Three parts to specification:
- Part 0: Primer
- Part 1: Structures
- Part 2: Datatypes
XMLSchema V1.1 - Work is underway to assemble requirements for this revision, which is intended to remain mostly compatible with 1.0, fix bugs, and make minor improvements. Visit http://www.w3.org/XML/Schema for more information.
Notes:
XML Schema was accepted as a W3C Recommendation on 5/2/2001. This means that the specification should be considered stable, and that implementors will begin producing schema processors compliant with the recommendation. This is the version that you should deploy or consider deploying. All previous working drafts, candidate recommendations, and proposed recommendations are superseded by the recommendation. The XML Schema recommendation consists of three parts:
- Part 0: Primer (a tutorial introduction to the features of XML Schema)
- Part 1: Structures (covers everything except the specific simple types provided by XML Schema)
- Part 2: Datatypes (covers each of the simple types supported by the XML Schema recommendation)
Moreover, XML Schema will be a part of the foundation for the next generation of W3C technologies. XSLT 2.0, XPath 2.0, and XQuery are among the technologies that will use XML Schema as a foundation to build on.
Tooling (1 of 2)
Can use any text editor
- As long as the editor supports Unicode or the chosen encoding
IBM WebSphere Studio 4+ (all editions)
- Guided document editing for documents based on XMLSchemas (or DTDs)
- Generate an XMLSchema from a DTD
- Syntax-aware XMLSchema editor
- Includes Xerces-J 1.4.2
Apache Software Foundation
- Xerces-J 1.4+ (REC)
University of Edinburgh
- XSV (REC)
Oracle
- Oracle XML Parser (CR)
Notes:
As of today, there is limited support for XML Schema in XML parser implementations. Here is a list of the implementations available.
Apache Software Foundation Xerces-J 1.4+ (REC): This parser is written in Java and forms the basis for IBM's XML Parser for Java product. Aside from bugs, Xerces 1.4 supports the recommendation syntax for XML Schema.
University of Edinburgh XSV (REC): This parser is written in Python and supports the Proposed Recommendation syntax for XML Schema.
Oracle XML Parser (PR): This parser is written in Java and supports the proposed recommendation syntax for XML Schema.
Microsoft MSXML 4.0 (PR): This parser is written in C/C++ and supports the proposed recommendation syntax for XML Schema.
Copyright IBM Corp. 2001, 2004. Appendix D. Additional Information and Examples
Tooling (2 of 2)
Microsoft
- MSXML 4.0 (REC)
Free IBM alphaWorks tools to help you:
- XML Schema Quality Checker
- Visual DTD: editing environment with syntax-directed help. Found in the package called "Visual XML Tools"
Notes:
Here are some other tools that may be helpful as you work with XML Schemas.
Notes:
Variables - used to declare a local or global variable in a stylesheet.
<xsl:variable name="country" select="germany"/> sets country to the <germany> child elements of the context node (a node-set).
<xsl:variable name="country" select="'germany'"/> sets country to the string 'germany'.
Parameters - used to declare a parameter of the stylesheet (a global parameter) or of a template.
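A minimal sketch showing a global parameter, a global variable, and a local variable together. The names currency and label, and the default value 'EUR', are hypothetical; only the country variable comes from the notes.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Global parameter: the caller of the stylesheet may override the default -->
  <xsl:param name="currency" select="'EUR'"/>

  <!-- Global variable holding a string (note the inner single quotes) -->
  <xsl:variable name="country" select="'germany'"/>

  <xsl:template match="/">
    <!-- Local variable: visible only inside this template -->
    <xsl:variable name="label" select="concat($country, ' (', $currency, ')')"/>
    <xsl:value-of select="$label"/>
  </xsl:template>

</xsl:stylesheet>
```

Run against any well-formed input without overriding the parameter, this sketch writes the text germany (EUR). Without the inner quotes, select="germany" would look for child elements named germany instead of producing a string.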
Parameters Example
Input.xml
<Named>
  <AAA repeat="3"/>
  <BBB repeat="2"/>
  <CCC repeat="5"/>
</Named>
Transformation.xsl
<xsl:template match="/Named/*">
  <p>
    <xsl:call-template name="while">
      <xsl:with-param name="test">
        <xsl:value-of select="@repeat"/>
      </xsl:with-param>
    </xsl:call-template>
  </p>
</xsl:template>
<xsl:template name="while">
  <xsl:param name="test"/>
  <xsl:value-of select="name()"/>
  <xsl:text> </xsl:text>
  <xsl:if test="not($test = 1)">
    <xsl:call-template name="while">
      <xsl:with-param name="test">
        <xsl:value-of select="$test - 1"/>
      </xsl:with-param>
    </xsl:call-template>
  </xsl:if>
</xsl:template>
Output.html
<p>AAA AAA AAA </p>
<p>BBB BBB </p>
<p>CCC CCC CCC CCC CCC </p>
Variable Example
Input.xml
<list>
  <book ID="666">
    <chapter>First Chapter</chapter>
    <chapter>Second Chapter</chapter>
    <chapter>Third Chapter</chapter>
  </book>
</list>
Transformation.xsl
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version="1.0">
  <xsl:variable name="totalChapters">
    <xsl:value-of select="//chapter[last()]" />
  </xsl:variable>
  <xsl:template match="/">
    <xsl:value-of select="$totalChapters" />
  </xsl:template>
</xsl:stylesheet>
Output.html
Third Chapter
No Reassigning Variables
Variables cannot be reassigned.
- We want pure, side-effect-free functions that do not depend on order of execution.
Conditional initialization: you want to assign a variable based on a condition.
<xsl:variable name="oddOrEven">
  <xsl:choose>
    <xsl:when test="current() mod 2 = 0">even</xsl:when>
    <xsl:otherwise>odd</xsl:otherwise>
  </xsl:choose>
</xsl:variable>
Often variables are used for doing two tasks at once.
- Divide the template into two templates.
- Example: calculate the min and max of a list of nodes.
No counters or counted for-loops.
- Write a recursive template instead.
Notes:
Functions/templates in XSLT are independent of each other and do not have any external dependencies. They can be called in any order against the same XML file and the same results will be obtained. Variable reassignment is not allowed, to avoid creating dependencies among functions. In a procedural programming language, you would calculate the min and max by creating two variables and looping through the nodes, reassigning the min and max based on tests against each node. In XSLT, you should use recursive templates instead of looping.
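The recursive approach can be sketched for the max half of the min/max example. This is one possible XSLT 1.0 formulation, not the course's own solution; the template name max, its parameters nodes and best, and the price elements in the input are all hypothetical.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:call-template name="max">
      <xsl:with-param name="nodes" select="//price"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Recursive maximum: each call inspects the first node and passes
       the rest of the list, plus the best value so far, to the next call.
       No variable is ever reassigned. -->
  <xsl:template name="max">
    <xsl:param name="nodes"/>
    <xsl:param name="best" select="$nodes[1]"/>
    <xsl:choose>
      <!-- Empty list: recursion ends and the best value is written out -->
      <xsl:when test="not($nodes)">
        <xsl:value-of select="$best"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="max">
          <xsl:with-param name="nodes" select="$nodes[position() &gt; 1]"/>
          <xsl:with-param name="best">
            <xsl:choose>
              <xsl:when test="number($nodes[1]) &gt; number($best)">
                <xsl:value-of select="$nodes[1]"/>
              </xsl:when>
              <xsl:otherwise>
                <xsl:value-of select="$best"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:with-param>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Following the advice to divide the work into two templates, a min template would be a copy of this one with the comparison reversed.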
Notes:
In this example, a common template has been developed for input fields. The template outputs the value passed in as a simple string if the isReadOnly parameter is true. If isReadOnly is false, an input field with the value shown is output. This example also has the controlName and the value being passed in as parameters. The values for size and class have been hardcoded, so this template is only good for cases where 20 is the correct size for the input field and the class is inputField. To make this template more flexible, you would add parameters for class and size.
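The slide that showed this template is not reproduced in the text, so the following is a reconstruction from the description above rather than the original code. The parameter names, the size of 20, and the class inputField come from the notes; the test against the string 'true' and the exact HTML produced are assumptions.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template name="controlInput">
    <xsl:param name="controlName"/>
    <xsl:param name="value"/>
    <xsl:param name="isReadOnly"/>
    <xsl:choose>
      <!-- Read-only: emit the value as plain text.
           Assumes the flag arrives as the string 'true'. -->
      <xsl:when test="$isReadOnly = 'true'">
        <xsl:value-of select="$value"/>
      </xsl:when>
      <!-- Editable: emit an input field; size and class are hardcoded -->
      <xsl:otherwise>
        <input type="text" name="{$controlName}" value="{$value}"
               size="20" class="inputField"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Making size and class parameters with defaults of 20 and 'inputField' would give the extra flexibility the notes suggest without breaking existing callers.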
Input.xml and Output.html (slide panels; contents not recoverable)
Transformation.xsl
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version="1.0">
  <xsl:include href="CommonTemplates.xsl" />
  <xsl:template match="/author">
    <xsl:call-template name="controlInput">
      <xsl:with-param name="controlName" select="position()"/>
      <xsl:with-param name="value" select="."/>
      <xsl:with-param name="isReadOnly" select="@readOnly"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
CommonTemplates.xsl
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version="1.0">
  <xsl:template name="controlInput">
    ... see previous page
  </xsl:template>
</xsl:stylesheet>
Notes:
The first author, Dr. Smith, is output as a text string since it is readOnly. The second author, Elton John, results in an input field since it is not readOnly. The common named template is kept in a separate file that can be included in other files, and the named templates can then be called by templates in those files. Three parameter values are needed by the controlInput template: controlName, value, and isReadOnly. These parameter values are taken from the current author node. The controlName is taken from the position number, because we want it to be unique for each author node. The value is the value of the author element. The value of isReadOnly is taken from the readOnly attribute. You can easily call the same controlInput template for a different element in the XML as long as you can determine values for controlName, value, and isReadOnly.
[Figure: Client and Server exchanging a query, an XML Stream, and an XSL Style Sheet]
Browser support:
- Netscape 6.x and IE 6.0 have W3C-compliant XSL transformation capability
- IE 5.0 and 5.5 provide support only for the last W3C working draft
- The W3C has a browser/editor (Amaya) that supports rendering of XML via CSS, or via a hierarchical view
Notes:
Since we push XML directly to the client, the raw XML is available for further processing on the client (for example, export to a spreadsheet or local database). To make IE 5.x compliant, you need to download the latest MSXML3. As long as your XML file refers to the appropriate XSLT stylesheet, it should render in IE5. If you are running a side-by-side installation of MSXML, IE will use the old XSLT processor. To unregister the old processor and tell IE to use the new one, type the following four commands at a command prompt: regsvr32 msxml3.dll xmlinst Note: In the final release, xmlinst.exe is a separate download from MSDN and does not come with the MSXML download.
More information about installing for replace mode is found in the online documentation that comes with the updated parsers. You may also wish to download the "Internet Explorer Tools for Validating XML and Viewing XSLT Output". IE5 does not validate XML documents and does not let you view the source of the XSL stylesheet. The following link points to an update that validates the XML document and enables viewing the XSL stylesheet: http://msdn.microsoft.com/code/default.asp?url=/code/sample.asp?url=/msdn-files/027/000/543/msdncompositedoc.xml Future versions of IE (V6) and Mozilla are supposed to have XSLT support.
[Figure: Client, Middle-tier Server, and SQL Translator, with flows labeled SQL, XSL Style Sheet, and HTML Stream. Middle-tier options: Xerces/Xalan (Apache), Xerces/Jigsaw (Apache/W3C)]
Notes:
XSLT Processor running on the Web Server.
Appendix F. Acronyms and Abbreviations
B
B2B - Business to Business
B2C - Business to Consumer
BO - Business Object
C
C & S - Calendar and Schedule
CA - Certification Authority
CAE - Client Application Enabler
CARP - Cache Array Routing Protocol
CB - Component Broker
CCF - Common Connector Framework
CDF - Channel Definition Format
CDK - Component Development Kit
CDSA - Common Data Security Architecture
CGI - Common Gateway Interface
CICS - Customer Information Control System
CIG - CICS Internet Gateway
CIO - Chief Information Officer
CMC - Communications Management Configuration
COBOL - Common Business Oriented Language
CORBA - Common Object Request Broker Architecture
CRLF - Carriage Return / Line Feed
CSR - Customer Service Representative
CSS - Cascading Style Sheet
D
DAO - Data Access Objects
DAP - Directory Access Protocol
DBCS - Double Byte Character Set
DB2 - Database 2
DBMS - Database Management System
DCE - Distributed Computing Environment
DCOM - Distributed Component Object Model
DDM - Device Descriptor Module
DECS - Domino Enterprise Connection Services
DHTML - Dynamic Hypertext Markup Language
DII - Dynamic Invocation Interface
DIT - Directory Information Tree
DN - Distinguished Name
DNA - Microsoft Distributed interNet Applications Architecture
DNS - Domain Name System
DO - Data Object
DPL - Distributed Program Link
DRDA - Distributed Relational Database Architecture
DRP - Distribution and Replication
DSI - Dynamic Skeleton Interface
E
E2E - End-to-End
ECI - External Call Interface
EJB - Enterprise JavaBean
EJS - Enterprise JavaBeans Server
eND - eNetwork Dispatcher
EPI - External Presentation Interface
ESP - Encapsulating Security Payload
EXCI - External CICS Interface
F
FAT - File Allocation Table
FIN - used in IP for socket termination
Framework - Application Framework for e-business
FTP - File Transfer Protocol
FW - Firewall
G
Gbyte - Gigabyte
GIF - Graphics Interchange Format
GIOP - General Inter-ORB Protocol
GSO - Global Sign-On
GUI - Graphical User Interface
GW - Gateway
GWAPI - Go Webserver Application Programming Interface
H
HACMP - High-Availability Cluster Multi-Processing
HP - Hewlett-Packard
HPFS - High Performance File System
HTML - Hypertext Mark-up Language
I
ICAPI - Internet Connection API
ICSS - Internet Connection Secure Server
IDE - Integrated Development Environment
IDL - Interface Definition Language
IE - Internet Explorer
IETF - Internet Engineering Task Force
IIOP - Internet Inter-ORB Protocol
IMAP - Internet Mail (or Message) Access Protocol
IMS - Information Management System
IP - Internet Protocol
ISAPI - Internet Server API
ISC - Intersystem Communication
ISP - Internet Service Provider
I/T - Information Technology
J
JAR - Java Archive
JDBC - trademark, often referred to as "Java Database Connectivity"
JDK - Java Developer's Kit
JFC - Java Foundation Class
JIT - Just In Time
JNDI - Java Naming and Directory Interface
JNI - Java Native Interface
JPEG - Joint Photographic Experts Group
JRE - Java Runtime Environment
JSP - Java Server Pages
JSQL - Java Structured Query Language
JVM - Java Virtual Machine
L
LDAP - Lightweight Directory Access Protocol
LEI - Lotus Enterprise Integrator
LS:DO - LotusScript Data Object
LSX - Lotus Script Extensions
LUM - Logical Unit Manager
LUW - Logical Unit of Work
M
MAPI - Messaging API
MATM - MQSeries Link Application Transaction Map
MB - Megabyte
MCF - Meta Content File
MFS - Message Format Services
MHz - Megahertz
MIME - Multipurpose Internet Mail Extensions
MOM - Message-Oriented Middleware
MPR - Message Processing Region
MQ - Message Queue
MQEI - Message Queue Enterprise Integrator
MQI - Message Queue Interface
MQIIH - MQ IMS Information Header
MS - Microsoft
MSMQ - Microsoft Message Queue
MTA - Message Transfer Agent
MVS - Multiple Virtual Storage
MW - Middleware
N
NC - Network Computer or Network Computing
NCF - Network Computing Framework
NCI - Network Communications Interface
NCSA - National Computer Security Association
NDS - Novell Directory Services
NIS - Network Information Services
NNTP - Network News Transfer Protocol
NOF - NetObjects Fusion
NSAPI - Netscape Server API
NSF - Notes database file extension
NSTP - Notification Service Transfer Protocol
NT - Windows NT (New Technology)
NTFS - NT File System
O
OCX - Open Connect Exchange (OLE Custom Control)
ODBC - Open Database Connectivity
OLTP - On-line Transaction Processing
OMG - Object Management Group
OO - Object-Oriented
ORB - Object Request Broker
OSC - Open System Center
OTMA - Open Transaction Manager Access
P
P&P - Policies and Procedures (security document)
PCMCIA - Personal Computer Memory Card International Association
PD - Problem Determination
PERL - Practical Extraction & Reporting Language
PICS - Platform for Internet Content Selection
PKI - Public Key Infrastructure
PKIX - Public Key Infrastructure Standard
POP - Post Office Protocol
R
RACF - Resource Access Control Facility
RAD - Rapid Application Development
RDB - Relational Database
RDBMS - Relational Database Management System
RDN - Relative Distinguished Name
RDO - Remote Data Objects
RFC - Remote Function Call
RMI - Remote Method Invocation
RMIC - Remote Method Invocation Compiler
RPC - Remote Procedure Call
RSA - Rivest-Shamir-Adleman algorithm
S
SASL - Simple Authentication and Security Layer
SET - Secure Electronic Transaction
SHTTP - Secure Hypertext Transfer Protocol
SMP - Symmetric Multiprocessors
SMTP - Simple Mail Transfer Protocol
SNA - Systems Network Architecture
SPI - Service Provider Interface
SQL - Structured Query Language
SSL - Secure Sockets Layer
SYN - used in IP for socket connection
T
TCP - Transmission Control Protocol
TCP/IP - Transmission Control Protocol / Internet Protocol
Telnet - U.S. Dept. of Defense virtual terminal protocol
TLS - Transport Layer Security
TME - Tivoli Management Environment
TP - Transaction Processor
TR - Token-Ring
T-RPC - Transactional Remote Procedure Call
TXSeries - Transaction Series
U
UA - User Agent
URI - Uniform Resource Identifier
V
VA - VisualAge
VAJ - VisualAge for Java
VB - Visual Basic
VIM - Vendor Independent Messaging
VM - Virtual Machine
VPN - Virtual Private Network
VTAM - Virtual Telecommunications Access Method
W
WAS - WebSphere Application Server
WDS - WebSphere Development Studio
WTE - Web Traffic Express
WYSIWYG - What You See Is What You Get
W3C - World Wide Web Consortium
X
XCF - Cross-system Coupling Facility
XML - Extensible Markup Language
Appendix G. Glossary
A
abstract class
A class that provides common information for subclasses and cannot itself be instantiated. Abstract classes typically declare at least one abstract method.

abstract method
A method with a signature, but no implementation. You provide the implementation of the method in the subclass of the abstract class that contains the abstract method.

Abstract Window Toolkit (AWT)
The Abstract Window Toolkit API provides a layer between the application and the host's windowing system. It enables programmers to port Java applications from one window system to another. The AWT provides access to basic interface components such as events, colors, fonts, and controls such as buttons, scroll bars, text fields, frames, windows, dialogs, panels, canvases, and check boxes.

Access Builder for SAP R/3
A VisualAge for Java ToolKit for developing Java beans, Java applications, or Java applets that access SAP business objects. The Access Builder for SAP R/3 consists of R/3 Access Classes, Business Object Repository Access Classes, Logon Java beans, and the Access Builder tool.

access level
In VisualAge Developer Domain, the level at which you connect to the Web site. We provide the following access levels: Registration; Subscription for Java; Subscription for Java, CD-ROM version; Enterprise Download Components.

actual parameter list
Parameters specified in a call to a method. See also formal parameter list.

AFC
See Application Foundation Classes.

API
See Application Programming Interface.

applet
A Java program designed to run within a Web browser. Contrast with application.
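The abstract class and abstract method entries above can be illustrated with a minimal sketch. The names `Shape`, `Circle`, `Square`, and `AbstractShapeDemo` are hypothetical and chosen only for illustration; they do not come from the course material.

```java
// Hypothetical example classes; not part of any IBM or Sun API.
abstract class Shape {
    // Abstract method: a signature with no implementation.
    abstract double area();

    // Concrete method: common behavior shared by every subclass.
    String describe() {
        return getClass().getSimpleName() + " with area " + area();
    }
}

class Circle extends Shape {
    private final double radius;

    Circle(double radius) { this.radius = radius; }

    // The subclass supplies the implementation of the abstract method.
    @Override
    double area() { return Math.PI * radius * radius; }
}

class Square extends Shape {
    private final double side;

    Square(double side) { this.side = side; }

    @Override
    double area() { return side * side; }
}

class AbstractShapeDemo {
    public static void main(String[] args) {
        // "new Shape()" would not compile: an abstract class cannot be instantiated.
        Shape[] shapes = { new Circle(1.0), new Square(2.0) };
        for (Shape s : shapes) {
            System.out.println(s.describe());
        }
    }
}
```

Each subclass must implement `area()` before it can be instantiated; code that works with `Shape` references does not need to know which subclass it holds.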
application
In Java programming, a self-contained, stand-alone Java program that includes a static main method. It does not require an applet viewer. Contrast with applet.

Application Foundation Classes (AFC)
Microsoft's version of the Java Foundation Classes (JFCs). AFCs deliver similar functions to JFCs but only work on Windows 32-bit platforms.

Application Programming Interface (API)
A software interface that enables applications to communicate with each other. An API is the set of programming language constructs or statements that can be coded in an application program to obtain the specific functions and services provided by an underlying operating system or service program.

ASCII
American Standard Code for Information Interchange. A standard assignment of 7-bit numeric codes to characters. See also Unicode.

AWT
See Abstract Window Toolkit.

base type
In Java, a type that establishes an interface to anything inherited from itself. See type, derived type.

bean
A definition or instance of a JavaBeans component. See JavaBeans.

BeanInfo
(1) A Java class that provides explicit information about the properties, events, and methods of a bean class. (2) In the VisualAge for Java Integrated Development Environment, a page in the class browser that provides bean information.

browser
(1) In VisualAge for Java, a window that provides information on program elements. There are browsers for projects, packages, classes, methods, and interfaces. (2) An Internet-based tool that lets users browse Web sites.
business object
(1) An object that represents a business function. Business objects contain attributes that define the state of the object, and methods that define the behavior of the object. A business object also has relationships with other business objects. Business objects can be used in combination to perform a desired task. Typical examples of business objects are Customer, Invoice, or Account. (2) In the Enterprise Access Builder, a class that implements the IBusinessObject interface.

bytecode
Machine-independent code generated by the Java compiler and executed by the Java interpreter.

casting
Explicitly converting an object or primitive's data type.

C++ Access Builder
A VisualAge for Java, Enterprise Edition tool that generates beans and C++ wrappers that let your Java programs access C++ DLLs.

CD subscription
See Subscription for Java, CD-ROM version.

CICS Client
A server program that processes CICS ECI calls, forwarding transaction requests to a CICS program running on a host.

CICS ECI
An API that provides C and C++ programs with procedural access to transactions.

CICS Gateway for Java
A server program that processes Java ECI calls and forwards CICS ECI calls to the CICS Client.

class
An encapsulated collection of data and methods to operate on the data. A class may be instantiated to produce an object that is an instance of the class.

class hierarchy
The relationships between classes that share a single inheritance. All Java classes inherit from the Object class.

class method
Methods that apply to the class as a whole rather than its instances (also called a static method).
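As a rough illustration of the class and class method entries above, the sketch below contrasts a static (class-level) method and field with their per-instance counterparts. `Counter` and `CounterDemo` are hypothetical names introduced only for this example.

```java
// Hypothetical example class; the names are illustrative assumptions.
class Counter {
    // Static field: one copy shared by the class as a whole.
    private static int created = 0;

    // Instance field: one copy per object.
    private int count = 0;

    Counter() { created++; }

    // Class (static) method: invoked on the class, not on an instance.
    static int totalCreated() { return created; }

    // Instance method: operates on one particular object.
    int increment() { return ++count; }
}

class CounterDemo {
    public static void main(String[] args) {
        Counter a = new Counter();
        Counter b = new Counter();
        a.increment();
        a.increment();
        b.increment();
        // The static method reports how many Counter objects exist so far,
        // independent of the per-object counts held by a and b.
        System.out.println(Counter.totalCreated());
    }
}
```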
class path
When running a program in VisualAge for Java, a list of directories and JAR files that contain resource files or Java classes that a program can load dynamically at run time. A program's class path is set in its Properties notebook.

CLASSPATH
In your deployment environment, the environment variable keyword that specifies the directories in which to look for class and resource files.

class variable
Variables that apply to the class as a whole rather than its instances (also called a static field).

client
A networked computer in which the IDE is connected to a repository on a team server.

codebase
An attribute of the <APPLET> tag that provides the relative pathname for the classes. Use this attribute when your class files reside in a different directory than your HTML files.

Common Connector Framework
In the Enterprise Access Builder, interface and class definitions that provide a consistent means of interacting with enterprise resources (for example, CICS and Encina transactions) from any Java execution environment.

Common Object Request Broker Architecture (CORBA)
A specification produced by the Object Management Group (OMG) that presents standards for various types of object request brokers (such as client-resident ORBs, server-based ORBs, system-based ORBs, and library-based ORBs). Implementation of CORBA standards enables object request brokers from different software vendors to interoperate.

Common RFC Interface for Java
A set of Java interfaces and classes that defines a middleware-independent layer to access R/3 systems from Java. If applications are built on top of this interface, they can leverage different middleware at run time without recoding. The generated beans are based on this interface and provide the same flexibility.

component model
An architecture and an API that allows developers to define reusable segments of code that can be combined to create a program. VisualAge for Java uses the JavaBeans component model.

composite bean
A bean that can contain both visual and nonvisual components. A composite bean is composed of embedded beans.
connection
In the VisualAge for Java Visual Composition Editor, a visual link between two components that represents the relationship between the components. Each connection has a source, a target, and other properties.

Console
In VisualAge for Java, the window that acts as the standard input (System.in) and standard output (System.out) device for programs running in the VisualAge for Java environment.

constructor
A method called to set up a new instance of a class.

container
A component that can hold other components. In Java, examples of containers include applets, frames, and dialogs. In the Visual Composition Editor, containers can be graphically represented and generated.

cookie
(1) A small file stored on an individual's computer; this file allows a site to tag the browser with a unique identification. When a person visits a site, the site's server requests a unique ID from the person's browser. If this browser does not have an ID, the server delivers one. On the Wintel platform, the cookie is delivered to a file called 'cookies.txt,' and on a Macintosh platform, it is delivered to 'MagicCookie.' Just as someone can track the origin of a phone call with Caller ID, companies can use cookies to track information about behavior. (2) Persistent data stored by the client in the Servlet Builder.

CORBA
Common Object Request Broker Architecture.

Core API
Part of the minimal set of APIs that form the standard Java Platform. Core APIs are available on the Java Platform regardless of the underlying operating system. The Core API grows with each release of the JDK; the current core API is based on JDK 1.1. Also called core classes.

Data Access Bean
In the VisualAge for Java Visual Composition Editor, a bean that accesses and manipulates the content of JDBC/ODBC-compliant relational databases.
Data Access Builder
A VisualAge for Java Enterprise tool that generates beans to access and manipulate the content of JDBC/ODBC-compliant relational databases.

debugger
A component that assists in analyzing and correcting coding errors.

declaration
Statement that creates an identifier and its attributes, but does not reserve storage or provide an implementation.

definition
Statement that reserves storage or provides an implementation.

deprecated
An obsolete component that may be deleted from a future version of a product.

derived type
In Java, a type that overrides the definitions of a base type to provide unique behavior. The derived type extends the base type.

dipping
A metaphor, introduced by BeanExtender on alphaWorks, for modifying a component by hooking a special kind of Java bean onto it. Dipping lets you add new behavior or modify the Java bean's existing behavior without having to mess around with the Java bean's code. A dip is a special kind of Java bean that can be hooked on to another Java bean; it is the new feature you want to add to the component. Software examples of dips include printing and security. Dippable Java beans can have one or more dips connected to them. Almost any Java bean or class can be made dippable by extending it, a process called morphing.

dip
A special kind of Java bean that can be hooked on to another Java bean; the new feature you want to add to the component. Software examples of dips include printing and security.

distributed processing
Processing that takes place across two or more linked systems.
DLL (dynamic link library)
A file containing executable code and data bound to a program at load time or run time, rather than during linking. The code and data in a dynamic link library can be shared by several applications simultaneously. The C++ Access Builder generates beans and C++ wrappers that let your Java programs access C++ DLLs. Enterprise Access Builders also generate platform-specific DLLs for the workstation and OS/390 platforms.

double precision
A floating-point number that contains 64 bits. See also single precision.

EAB
See Enterprise Access Builder.

e-business
Either (a) the transaction of business over an electronic medium such as the Internet or (b) a business that uses Internet technologies and network computing in its internal business processes (via intranets), its business relationships (via extranets), and the buying and selling of goods, services, and information (via electronic commerce).

e-commerce
The subset of e-business that involves the exchange of money for goods or services purchased over an electronic medium such as the Internet.

EmbeddedJava
An API and application environment for high-volume embedded devices, such as mobile phones, pagers, process control, instrumentation, office peripherals, network routers and network switches. EmbeddedJava applications run on real-time operating systems and are optimized for the constraints of small-memory footprints and diverse visual displays.

encapsulation
The grouping of both data and operations into neat, manageable units that can be developed, tested, and maintained independently of one another. Such grouping is a powerful technique for building better software. The object manages its own resources and limits their visibility.
Enterprise Access Builder (EAB)
Feature of VisualAge for Java, Enterprise Edition, that creates connectors to enterprise server products such as CICS, Encina, IMS TOC, and MQSeries.

Enterprise Download Components
An access level for VisualAge Developer Domain that includes download versions of the latest components for VisualAge for Java, Enterprise Edition. You receive this access level when you purchase VisualAge for Java, Enterprise Edition, Version 2.0.

Enterprise Edition
See VisualAge for Java, Enterprise Edition.

Enterprise Java
Includes Enterprise JavaBeans as well as open API specifications for: database connectivity, naming and directory services, CORBA/IIOP interoperability, pure Java distributed computing, messaging services, managing system and network resources, and transaction services.

Enterprise JavaBeans
A cross-platform component architecture for the development and deployment of multitier, distributed, scalable, object-oriented Java applications.

Enterprise ToolKit
A set of VisualAge for Java Enterprise tools that enable you to develop Java code that is targeted to specific platforms, such as AS/400, OS/390, OS/2, AIX, and Windows.

Entry Edition
See VisualAge for Java, Entry Edition.

event
An action by a user, program, or system that may trigger specific behavior. In the JDK, events notify the relevant listener classes to take appropriate action.

exception
An exception is an object that has caused some sort of new condition, such as an error. In Java, throwing an exception means passing that object to an interested party; a signal indicates what kind of condition has taken place. Catching an exception means receiving the sent object. Handling this exception usually means taking care of the problem after receiving the object, although it might mean doing nothing (which would be bad programming practice).

executable content
Code that runs from within an HTML file (such as an applet).

extends
A subclass or interface extends a class or interface if it adds fields or methods, or overrides its methods. See also derived type.
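The throw, catch, and handle steps described in the exception entry can be sketched as follows. The class `ExceptionDemo` and its methods `parsePercent` and `parsePercentOrDefault` are hypothetical helpers written for this illustration; only `Integer.parseInt`, `NumberFormatException`, and `IllegalArgumentException` are standard Java.

```java
// Hypothetical helper class illustrating throwing, catching, and handling.
class ExceptionDemo {
    // Throws an exception when the text is not an integer between 0 and 100.
    static int parsePercent(String text) {
        // Integer.parseInt throws NumberFormatException for non-numeric input.
        int value = Integer.parseInt(text.trim());
        if (value < 0 || value > 100) {
            // Throwing: pass an exception object to whoever is interested.
            throw new IllegalArgumentException("out of range: " + value);
        }
        return value;
    }

    static int parsePercentOrDefault(String text, int fallback) {
        try {
            return parsePercent(text);
        } catch (IllegalArgumentException e) {
            // Catching: NumberFormatException is a subclass of
            // IllegalArgumentException, so both failures arrive here.
            // Handling: recover by returning the caller's fallback value.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePercentOrDefault("42", 0));
        System.out.println(parsePercentOrDefault("abc", -1));
        System.out.println(parsePercentOrDefault("150", -1));
    }
}
```

Returning a fallback is only one way to handle the condition; logging or rethrowing a more specific exception are equally valid choices depending on the caller.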
factory
A bean that dynamically creates instances of beans.

field
A data object in a class; for example, a variable.

first tier
The client; the hardware and software with which the end user interacts.

framework
A set of object classes that provide a collection of related functions for a user or piece of software.

free-form surface
In the VisualAge for Java Visual Composition Editor, the large, open area where you can work with visual and nonvisual beans. You add, remove, and connect beans on the free-form surface.

File Transfer Protocol (FTP)
In the Internet suite of protocols, an application layer protocol that uses TCP and Telnet services to transfer bulk-data files between machines or hosts.

form data
A generated class representing the HTML form elements in a visual servlet.

FTP
See File Transfer Protocol.

formal parameter list
Parameters specified in a method's definition. See also actual parameter list.

garbage collection
Java's ability to clean up inaccessible unused memory areas ("garbage") on the fly. Garbage collection slows performance, but keeps the machine from running out of memory.

graphical user interface (GUI)
A type of computer interface consisting of a visual metaphor of a real-world scene, often of a desktop. Within that scene are icons, representing actual objects, that the user can access and manipulate with a pointing device.
hierarchy
The order of inheritance in object-oriented languages. Each class in the hierarchy inherits attributes and behavior from its superclass, except for the top-level Object class.

HotJava
A Java-enabled Web and intranet browser developed by Sun Microsystems, Inc. HotJava is written in Java. (Definition copyright 1996-1999 Sun Microsystems, Inc. All Rights Reserved. Used by permission.)

HTML (Hypertext Markup Language)
A file format, based on SGML, for hypertext documents on the Internet. Allows for the embedding of images, sounds, video streams, form fields and simple text formatting. References to other objects are embedded using URLs, enabling readers to jump directly to the referenced document.

HTTP (Hypertext Transfer Protocol)
The Internet protocol, based on TCP/IP, used to fetch hypertext objects from remote hosts.

IDE
See Integrated Development Environment.

identifier
The name of an item in a program.

IDL (Interface Definition Language)
In CORBA, a declarative language that is used to describe object interfaces, without regard to object implementation.

IDL Development Environment
In VisualAge for Java, an integrated IDL and Java development environment. The IDL Development Environment allows you to work with IDL source code in the multipane IDLs page and generate Java code using an IDL-to-Java compiler.

IDL group
A container used to hold IDL objects in the IDL Development Environment. It is similar to a file system directory.

IIOP (Internet Inter-ORB Protocol)
A communications standard for distributed objects that reside in Web or enterprise computing environments.
InfoBus
A technology for flexible, vendor-independent data exchange which is used by eSuite and can be used by other applications to exchange data with eSuite and other InfoBus-enabled applications. The 100% Pure Java release and the InfoBus specification are available for free download from http://java.sun.com/beans/infobus.

inheritance
The ability to create subclasses that automatically inherit properties and methods from their superclass. See also hierarchy.

Inspector
In VisualAge for Java, a window in which you can evaluate code fragments in the context of an object, look at the entire contents of an object and its class, or access and modify the fields of an object.

instance
The specific representation of a class, also called an object.

instance method
A method that applies and operates on objects (usually called simply a method). Contrast with class method.

instance variable
A variable that defines the attributes of an object. The class defines the instance variable's type and identifier, but the object sets and changes its values.

Integrated Development Environment (IDE)
In VisualAge for Java, the set of windows that provide the user with access to development tools. The primary windows are the Workbench, Log, Console, Debugger, and Repository Explorer.

interface
A list of methods that enables a class to implement the interface itself by using the implements keyword. The Interfaces page in the Workbench lists all interfaces in the workspace.

Internet Protocol (IP)
In the Internet suite of protocols, a connectionless protocol that routes data through a network or interconnected networks. IP acts as an intermediary between the higher protocol layers and the physical network. However, this protocol does not provide error recovery and flow control and does not guarantee the reliability of the physical network.

A tool that edits and generates CORBA-compliant Java modules. See Common Object Request Broker Architecture (CORBA).
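To make the interface entry above concrete, here is a minimal sketch of a class using the `implements` keyword. The names `Priced`, `Book`, `Bundle`, and `InterfaceDemo` are hypothetical and exist only for this illustration.

```java
// Hypothetical interface: a list of methods with no implementation.
interface Priced {
    double price();
}

// A class commits to the interface with the implements keyword.
class Book implements Priced {
    private final double price;

    Book(double price) { this.price = price; }

    @Override
    public double price() { return price; }
}

// A second, unrelated implementation of the same interface.
class Bundle implements Priced {
    private final Priced[] items;

    Bundle(Priced... items) { this.items = items; }

    @Override
    public double price() {
        double sum = 0.0;
        for (Priced item : items) {
            sum += item.price();
        }
        return sum;
    }
}

class InterfaceDemo {
    public static void main(String[] args) {
        // Callers depend only on the interface, not on the concrete classes.
        Priced cart = new Bundle(new Book(10.0), new Book(5.5));
        System.out.println(cart.price());
    }
}
```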
interpreter
A tool that translates and executes code line-by-line.

introspection
For a JavaBean to be reusable in development environments, there needs to be a way to query what the bean can do in terms of the methods it supports and the types of event it raises and listens for. Introspection allows a builder tool to analyze how a bean works.

IP
See Internet Protocol.

JAE
See Java Application Environment.

JAR file format
JAR (Java Archive) is a platform-independent file format that aggregates many files into one. Multiple Java applets and their requisite components (.class files, images, sounds and other resource files) can be bundled in a JAR file and subsequently downloaded to a browser in a single HTTP transaction.

Java
An object-oriented programming language for portable, interpretive code that supports interaction among remote objects. Java was developed and specified by Sun Microsystems, Incorporated. The Java environment consists of the JavaOS, the Virtual Machines for various platforms, the object-oriented Java programming language, and several class libraries.

Java Application Environment (JAE)
The source code release of the Java (TM) Development Kit. (Definition copyright 1996-1999 Sun Microsystems, Inc. All Rights Reserved. Used by permission.)

JavaBeans
Java's component architecture, developed by Sun, IBM, and others. The components, called Java beans, can be parts of Java programs, or they can exist as self-contained applications. Java beans can be assembled to create complex applications, and they can run within other component architectures (such as ActiveX and OpenDoc).

Java Database Connectivity (JDBC)
In the JDK, the specification that defines an API that enables programs to access databases that comply with this standard.
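The introspection entry above describes a builder tool querying what a bean can do. A minimal sketch using the standard `java.beans.Introspector` class might look like this; the `Person` bean and the `propertyNames` helper are hypothetical names introduced for the example.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

class IntrospectionDemo {
    // Hypothetical bean following the get/set naming conventions.
    public static class Person {
        private String name;
        private int age;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    // Returns the property names that introspection discovers on a bean class.
    static List<String> propertyNames(Class<?> beanClass) {
        try {
            // Stopping at Object.class leaves out the inherited "class" property.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor descriptor : info.getPropertyDescriptors()) {
                names.add(descriptor.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("introspection failed", e);
        }
    }

    public static void main(String[] args) {
        // Lists the properties inferred from Person's getter and setter pairs.
        System.out.println(propertyNames(Person.class));
    }
}
```

A visual builder would use the same descriptors to populate a property sheet without any hand-written knowledge of the bean.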
Java Development Kit (JDK)
The Java Development Kit is the set of Java technologies made available to licensed developers by Sun Microsystems. Each release of the JDK contains the following: the Java Compiler, Java Virtual Machine, Java Class Libraries, Java Applet Viewer, Java Debugger, and other tools.

Javadoc
Sun's tool for generating HTML documentation on classes by extracting comments from the Java source code files.

Java Foundation Classes (JFC)
Developed by Netscape, Sun, and IBM, JFCs are building blocks that are helpful in developing interfaces to Java applications. They allow Java applications to interact more completely with the existing operating systems. Also called Swing Set.

Java IDL
Java IDL is a language-neutral way to specify an interface between an object and its client on a different platform. Provides interoperability and integration with CORBA, the industry standard for distributed computing, allowing developers to build Java applications that are integrated with heterogeneous business information assets.

Java Management API (JMAPI)
A specification proposed by Sun Microsystems that defines a core set of application programming interfaces for developing tightly integrated system, network, and service management applications. The application programming interfaces could be used in diverse computing environments that encompass many operating systems, architectures, and network protocols.

Java Media API
Allows developers to integrate a wide range of media types into their Web pages, applets, and applications. Includes: Media, Sound, Animation, 2D, 3D, Telephony, Speech and Collaboration.

Java Media Framework (JMF)
The JMF API specifies a unified architecture, messaging protocol and programming interface for media players, capture and conferencing. JMF provides a set of building blocks used by other areas of the Java Media API suite. For example, the JMF provides access to audio devices in a cross-platform, device-independent manner, which is required by both the Java Telephony and the Java Speech APIs. JMF will be published as three APIs: the Java Media Player, Java Media Capture, and Java Media Conference.
Java Naming and Directory Interface (JNDI)
A set of APIs that assist with the interfacing to multiple naming and directory services. (Definition copyright 1996-1999 Sun Microsystems, Inc. All Rights Reserved. Used by permission.)

Java Native Interface (JNI)
A native programming interface that allows Java code running inside a Java Virtual Machine (VM) to interoperate with applications and libraries written in other programming languages, such as C and C++.

JavaObjs
In Remote Method Invocation, the name of the user-defined default file that contains a list of server objects to be instantiated when the Remote Object Instance Manager is started.

JavaOS
A basic, small-footprint operating system that supports Java. Java OS was originally designed to run in small electronic devices like phones and TV remotes, but it is also being targeted for use in network computers (NCs).

Java Platform
The Java Virtual Machine and the Java Core classes make up the Java Platform. The Java Platform provides a uniform programming interface to a 100% Pure Java program regardless of the underlying operating system. (Definition copyright 1996-1999 Sun Microsystems, Inc. All Rights Reserved. Used by permission.)

Java Record Editor
An editor that allows you to construct and refine dynamic record types.

Java Record Framework
A Java framework that describes and converts record data.

Java Remote Method Invocation (RMI)
Method invocation between peers, or between client and server, when applications at both ends of the invocation are written in Java. Included in JDK 1.1.

Java Runtime Environment (JRE)
A subset of the Java Development Kit for end-users and developers who want to redistribute the JRE. The JRE consists of the Java Virtual Machine, the Java Core Classes, and supporting files. (Definition copyright 1996-1999 Sun Microsystems, Inc. All Rights Reserved. Used by permission.)
JavaScript
A scripting language used within an HTML page. Superficially similar to Java but JavaScript scripts appear as text within the HTML page. Java applets, on the other hand, are programs written in the Java language and are called from within HTML pages or run as stand-alone applications.

Java Security API
A framework for developers to include security functionality in their applets and applications. Includes: cryptography with digital signatures, encryption, and authentication. An intermediate subset of the Security API known as "Security and Signed Applets" is included in JDK 1.1.

Java Server
An extensible framework that enables and eases the development of Java-powered Internet and intranet servers. The APIs provide uniform and consistent access to the server and administrative system resources required for developers to quickly develop their own Java servers.

Java Virtual Machine (JVM)
A software implementation of a central processing unit (CPU) that runs compiled Java code (applets and applications).

jCentral
IBM's powerful Java search engine, accessible from the Search field at the top of every VisualAge Developer Domain page. Simply select jCentral in the entry field, and jCentral searches the entire Web for Java information and Java components such as applets, Java beans, and EJBs. Search results are sorted by relevance.

JDBC
See Java Database Connectivity.

JFC
See Java Foundation Classes.

JIT
See Just-In-Time Compiler.

JMF
See Java Media Framework.

JNDI
See Java Naming and Directory Interface.

JNI
See Java Native Interface.

JRE
See Java Runtime Environment.

Just-In-Time compiler (JIT)
A platform-specific software compiler often contained within JVMs. JITs compile Java bytecodes on-the-fly into native machine instructions, thereby reducing the need for interpretation.

JVM
See Java Virtual Machine.
linker
A computer program for creating load modules from one or more object modules or load modules by resolving cross references among the modules and, if necessary, adjusting addresses. In Java, the linker creates an executable from compiled classes.

listener
In the JDK, a class that receives and handles events.

local variable
A variable declared and used within a method or block.

Log
In the VisualAge for Java IDE, the window that displays messages and warnings during development.

member
(1) In the Java language, an item belonging to a class, such as a field or method. (2) On VADD, a site visitor who has previously registered. See registered member, registration.

method
A fragment of Java code within a class that can be invoked and passed a set of parameters to perform a specific task.

middleware
A layer of software that sits between a database client and a database server, making it easier for clients to connect to heterogeneous databases.

middle tier
The hardware and software that reside between the client and the enterprise server resources and data. The software includes a Web server that receives requests from the client and invokes Java servlets to process these requests. The client communicates with the Web server via industry standard protocols such as HTTP and IIOP.

morphing
The process of extending a Java bean to accept dips. Morphed Java beans are called dippable Java beans and can have one or more dips connected to them. Almost any Java bean or class can be made dippable. See dipping.
multithreaded
A program where different parts can run at the same time without interfering with each other.

native class
Machine-dependent C code that can be invoked from Java. For multi-platform work, the native routines for each platform need to be implemented.

NCF
See Network Computing Framework.

Network Computing Framework (NCF)
An architecture and programming model created to help customer and industry software development teams to design, deploy, and manage e-business solutions across the enterprise.

Network News Transfer Protocol (NNTP)
In the Internet suite of protocols, a protocol for the distribution, inquiry, retrieval, and posting of news articles that are stored in a central database.

nonvisual bean
A bean that is not visible to the end user in the graphical user interface, but is visually represented on the free-form surface of the Visual Composition Editor during development. Developers can manipulate nonvisual beans only as icons; that is, they cannot edit them in the Visual Composition Editor as they can edit visual beans. Examples of nonvisual beans include beans for business logic, communication access, and database queries.

NNTP
See Network News Transfer Protocol.

object
The principal building block of object-oriented programs. Objects are software programming modules. Each object is a programming unit consisting of related data and methods.
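As a small illustration of the multithreaded entry above, the sketch below runs two threads over a shared counter. `MultithreadDemo` and `countInParallel` are hypothetical names; the use of `AtomicInteger` is one assumed way to keep the threads from interfering with each other, not the only one.

```java
import java.util.concurrent.atomic.AtomicInteger;

class MultithreadDemo {
    // Runs two threads that each add perThread increments to a shared counter.
    static int countInParallel(int perThread) {
        // AtomicInteger makes each increment safe without explicit locking.
        AtomicInteger total = new AtomicInteger();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                total.incrementAndGet();
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        try {
            // Wait for both threads to finish before reading the result.
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(countInParallel(1000));
    }
}
```

With a plain `int` field instead of `AtomicInteger`, the two threads could overwrite each other's updates and the total would be unpredictable.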
In object-oriented programming, software that serves as an intermediary by transparently enabling objects to exchange requests and responses. A software design method that models the characteristics of abstract or real objects using classes and objects. Object-oriented design focuses on the data and on the interfaces to it. For instance, an "object-oriented" carpenter would be mostly concerned with the chair he was building, and secondarily with the tools used to make it; a "non-object-oriented" carpenter would think primarily of his tools. Object-oriented design is also the mechanism for defining how modules "plug and play." The object-oriented facilities of Java are essentially those of C++, with extensions from Objective C for more dynamic method resolution The ability to have different methods with the same identifier, distinguished by their return type, and number and type of arguments. Implementing a method in a subclass that replaces a method in a superclass.
overloading
overriding
package part
A program element that contains classes and interfaces. An existing, reusable software component. All parts created with the Visual Composition Editor conform to the JavaBeans component model, and are referred to as beans. See visual bean and nonvisual bean. In object models, a condition that allows instances of classes to be stored externally, for example in a relational database. In VisualAge for Java, a persistence framework for object models, which enables the mapping of objects to information stored in relational databases and also provides linkages to legacy data on other systems. A program executing in its own address space, containing one or more threads.
Copyright IBM Corp. 2001, 2004
process
V3.1.0.1
Professional Edition
See VisualAge for Java, Professional Edition.
program
In VisualAge for Java, a term that refers to both Java applets and applications.
program element
In VisualAge for Java, a generic term for a project, package, class, interface, or method.
project
In VisualAge for Java, the topmost kind of program element. A project contains Java packages.
property
An initial setting or characteristic of a bean, for example, a name, font, text, or positional characteristic.
An object's address. In Java, objects are passed by reference rather than by value or by pointers.
In VisualAge Developer Domain, a user who has submitted the registration information. See also registration, subscriber, and Subscription for Java.
registration
The process of submitting user information to VisualAge Developer Domain. You must register in order to access technical information from the site library, such as technical articles and IBM "Redbooks". You can also access free downloads such as VisualAge for Java, Entry Edition, and information-viewing utilities such as Netscape Navigator and Lotus Freelance. To register, click Register on the VADD site masthead, and enter the requested information. See also registered member, subscription, Subscription for Java.
A debugging tool that debugs code on a remote platform.
SAP's open programmable interface. External applications and tools can call ABAP/4 functions from the SAP System. You can also call third party applications from the SAP System using RFC. RFC is a means for communication that allows implementation on all R/3 platforms.
RMI is a specific instance of the more general term RPC. RMI allows objects to be distributed over the network; that is, a Java program running on one computer can call the methods of an object running on another computer. RMI and java.net are the only 100% pure Java APIs for controlling Java objects in remote systems.
In Remote Method Invocation, a program that creates and manages instances of server beans through their associated server-side server proxies.
RPC is a generic term referring to any of a series of protocols used to execute procedure calls or method calls across a network. RPC allows a program running on one computer to call the services of a program running on another computer.
repository
In VisualAge for Java, the permanent storage area containing all open and versioned editions of all program elements, regardless of whether they are currently in the workspace. The repository contains the source code for classes developed in (and provided with) VisualAge for Java, and the bytecode for classes imported from the file system. Every time you save a method in the IDE, it is automatically updated in the repository. See also SCM repository and shared repository.
Repository Explorer
In VisualAge for Java, the window from which you can view and compare editions of program elements that are in the repository.
resource file
A non-code file that may be referred to from your Java program in VisualAge for Java. Examples include graphic and audio files.
See Remote Method Invocation.
A VisualAge for Java Enterprise tool that generates proxy beans and associated classes and interfaces so you can distribute code for remote access, enabling Java-to-Java solutions.
RMI compiler
The compiler that generates stub and skeleton files that facilitate RMI communication. This compiler can be automatically invoked by the RMI Access Builder, and can also be invoked from the Tools menu item.
RMI registry
A server program that allows remote clients to get a reference to a server bean.
roll back
The process of restoring data changed by SQL statements to the state at its last commit point.
RPC
See Remote Procedure Calls.
runtime system
The software environment where compiled programs run. Each Java runtime system includes an implementation of the Java Virtual Machine.
sandbox
A restricted environment, provided by the Web browser, in which Java applets run. The sandbox offers them services and prevents them from doing anything naughty, such as doing file I/O or talking to strangers (servers other than the one from which the applet was loaded). The analogy of applets to children led to calling the environment in which they run the "sandbox."
See software configuration management.
In VisualAge for Java, a generic term for the data store of any external software configuration management (SCM) tool. Some SCM tools refer to this as an archive.
scope
Determines where an identifier can be used. In Java, instance and class variables have a scope that extends to the entire class. All other identifiers are local to the method where they are declared.
Scrapbook
In VisualAge for Java, the window from which you can write, edit, and test fragments of code without having to define an encompassing class or method.
SSL is a security protocol which allows communications between a browser and a server to be encrypted and secure. SSL prevents eavesdropping, tampering or message forgery on your Internet or intranet network.
security
Features in Java that prevent applets downloaded off the Web from deliberately or inadvertently doing damage. One such feature is the digital signature, which ensures that an applet came unmodified from a reputable source.
serialization
Turning an object into a stream, and back again.
server
The computer that hosts the Web page that contains an applet. The .class files that make up the applet, and the .HTML files that reference the applet reside on the server. When someone on the Internet connects to a Web page that contains an applet, the server delivers the .class files over the Internet to the client that made the request. The server is also known as the originating host.
The bean that is distributed using RMI services and is deployed on a server.
Server-side programs that execute on and add function to Web servers. Java servlets allow for the creation of complicated, high-performance, cross-platform Web applications. They are highly extensible and flexible, making it easy to expand from client or single-server applications to multitier applications.
SGML
See Standard Generalized Markup Language.
single precision
A floating-point number that contains 32 bits. See also double precision.
SmartGuide
In IBM software products, an active form of help that guides you through common tasks.
software configuration management (SCM)
The tracking and control of software development. SCM tools typically offer version control and team programming features.
SQL
Structured Query Language. A language used by database engines and servers for data acquisition and definition.
SSL
See secure socket layer.
Standard Generalized Markup Language
An ISO/ANSI/ECMA standard that specifies a way to annotate text documents with information about types of sections of a document.
static field
See class variable.
static method
See class method.
stored procedure
A procedure that is part of a relational database. The Data Access Builder can generate Java code that accesses stored procedures.
stream
A communication path between a source of information and its destination.
subclass
A class that inherits all the methods and variables of another class (its superclass). Its superclass might be a subclass of another class in the hierarchy.
subscriber
In VisualAge Developer Domain, a user who has purchased a Subscription for Java or received a subscription as part of VisualAge for Java, Enterprise Edition.
subscription
In VisualAge Developer Domain, a paid access level to the Web site. Subscribing to the site entitles you to VisualAge for Java, Professional Edition and the Java Beans and tools, as well as access to all the information and trial downloads available with registration. See also registration, Subscription for Java, and Subscription for Java, CD-ROM version.
A subscription level that includes the latest version of VisualAge for Java, Professional Edition, and an ever-increasing supply of JavaBeans and Java-related tools. The Subscription for Java also gives you access to new beans, tools, products, fixes, product updates, and Beta versions as they become available during the one-year subscription period.
A subscription that includes a set of VisualAge Developer Domain CD-ROMs three times a year, in addition to complete Web access. The CDs include the product code for your subscription level, as well as most of the current information on the Web site. You can view and search the CD information using any Web browser, just as you would on the Web (but with quicker response). See also subscription.
See VADD JavaBeans and tools.
See access level.
A type that extends another type (its supertype).
A class that defines the methods and variables inherited by another class (its subclass).
A type that is extended by another type (its subtype).
A group of lightweight, ready-to-use components developed by JavaSoft. The components range from simple buttons to full-featured text areas to tree views and tabbed folders.
synchronized
This Java keyword specifies that only one thread can run inside a method at once.
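As a minimal sketch of the synchronized entry, the hypothetical class below (its name and methods are assumptions made for this example) lets several threads increment a shared counter. For an instance method, the lock is held per object, so only one thread at a time can run increment() on the same counter.

```java
// Hypothetical example class; not part of the course materials.
class SynchronizedCounterDemo {
    private int count = 0;

    // Only one thread at a time may execute this method on a given instance.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Starts `threads` workers that each call increment() `perThread` times.
    static int runCounter(int threads, int perThread) {
        SynchronizedCounterDemo counter = new SynchronizedCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runCounter(4, 10000)); // prints 40000
    }
}
```

Without the synchronized keyword on increment(), concurrent updates could be lost and the total could come out lower.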
See Transmission Control Protocol based on IP.
Thin client usually refers to a system that runs on a resource-constrained machine or that runs a small operating system. Thin clients don't require local system administration, and they execute Java applications delivered over the network.
third tier
The third tier, or back end, is the hardware and software that provides database and transactional services. These back-end services are accessed through connectors between the middle-tier Web server and the third-tier server. Though this conceptual model depicts the second and third tier as two separate machines, the NCF model supports a logical three-tier implementation in which the software on the middle and third tier are on the same box.
thread
A separate flow of control within a program.
transaction
(1) In a CICS program, an event that queries or modifies a database that resides on a CICS server. (2) In the Persistence Builder, a representation of a path of code execution. (3) The code activity necessary to manipulate a persistent object. For example, a bank application might have a transaction that updates a company account.
transient
This Java keyword specifies that a field is not included in the serial representation of an object. See serialization.
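The serialization and transient entries can be illustrated together. The sketch below uses the standard java.io object streams; the Account class and its fields are hypothetical names introduced only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    // Hypothetical class for illustration only.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String owner;
        transient String password; // left out of the serial representation

        Account(String owner, String password) {
            this.owner = owner;
            this.password = password;
        }
    }

    // Turns the object into a byte stream and back again.
    static Account roundTrip(Account original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account("Tony", "secret"));
        System.out.println(copy.owner);    // prints Tony
        System.out.println(copy.password); // prints null: the transient field was not serialized
    }
}
```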
Transmission Control Protocol based on IP
An Internet protocol that provides for the reliable delivery of streams of data from one host to another.
type
In VisualAge for Java, a generic term for a class or interface.
The unique address that tells a browser how to find a specific Web page or file. A 16-bit international character set defined by ISO 10646. See also ASCII. See Uniform Resource Locator.
See VisualAge Developer Domain.
A set of beans and bean tools provided with the VisualAge Subscription for Java, which used to be named the WebRunner toolkit.
An identifier that represents a data item whose value can be changed while the program is running. The values of a variable are restricted to a certain data type.
virtual machine
A software or hardware implementation of a central processing unit (CPU) that manages the resources of a machine and can run compiled code.
See Java Virtual Machine.
In the Visual Composition Editor, a bean that is visible to the end user in the graphical user interface.
In VisualAge for Java, the tool you can use to create graphical user interfaces from prefabricated beans, and to define relationships (called connections) between beans. The Visual Composition Editor is a page in the class browser.
A servlet that is designed to be built using the VisualAge for Java Visual Composition Editor. A Web site and Web-based subscription offering for VisualAge for Java, providing downloads and CDs of VisualAge for Java products. The Web site also provides a wealth of supporting components, tools, and how-to information to help programmers easily develop Java applications. An edition of VisualAge for Java that is designed for building enterprise Java applications, and has all of the Professional Edition features plus support for developers working in large teams, developing high-performance or heterogeneous applications, or needing to connect Java programs to existing enterprise systems. An edition of VisualAge for Java suitable for learning and building small projects of 500 classes or less. It is available as a no-charge download from VisualAge for Java and VisualAge Developer Domain Web sites. A complete Java development environment, including easy access to JDBC-enabled databases for building Java applications.
Synonym for Subscription for Java. See VADD JavaBeans and tools. WebSphere is the cornerstone of IBM's overall Web strategy, offering customers a comprehensive solution to build, deploy and manage e-business Web sites. The product line provides companies with an open, standards-based, Web server deployment platform and Web site development and management tools to help accelerate the process of moving to e-business. A permission level on Web servers specifying that files can be read by any user.
World Wide Web
A network of servers that contain programs and files. Many of the files contain hypertext links to other documents available through the network.
Workbench
In VisualAge for Java, the main window from which you can manage the workspace, create and modify code, and open browsers and other tools.
workspace
The work area that contains the Java code that you are developing and the class libraries on which your code depends. Program elements must be added to the workspace from the repository before they can be modified.
wrapper
Code that provides an interface for one program to access the functionality of another program.
WWW
See World Wide Web.
Numerics
Sun Microsystems initiative to certify that applications and applets are purely Java-written.
Unit 3
1. Basic XML can be described as:
   a. A hierarchical structure of tagged elements, attributes and text. [CORRECT]
   b. All the HTML tags plus a set of new XML only tags.
   c. Object-oriented structure of rows and columns.
   d. Processing instructions (PIs) for text data.
   e. Textual data with tags for visual presentation.
2. Which of these XML Fragments is not well formed?
   a. <root><class>XML</class></root>
   b. <class><root>XML</root></class>
   c. <root><class id=XML></root> [CORRECT]
   d. <root>XML<class id=XML/>XML</root>
   e. <root class=XML><class id root/>XML</root>
3. XML Comments are allowed (Select all that apply):
   a. Before the XML declaration
   b. Anywhere
   c. Between element tags [CORRECT]
   d. Before the root element [CORRECT]
   e. All of the Above
4. Which of these XML Elements with Attributes is invalid?
   a. <name first='Tony' LAST=Romeo />
   b. <name name=Tony NAME=ROMEO />
   c. <_name_ first-name=Tony last-name=Romeo />
   d. <name=Tony Romeo /> [CORRECT]
   e. <name name=first='Tony' last='Romeo'" />
   f. All of the Above
5. Which of these comments regarding HTML and XML is not true?
   a. HTML markup is focused on presentation.
   b. XML markup is based on defining the data.
   c. XML is based on HTML. [CORRECT]
   d. HTML tags are not case sensitive.
   e. XML tags are case sensitive.
   f. Both XML and HTML support attributes.
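Well-formedness, as tested in questions 2 and 3, can be checked mechanically by handing a fragment to an XML parser. The following is a minimal sketch using the JAXP DocumentBuilder from the Java standard library; the class name WellFormedCheck and the sample fragments are assumptions made for this example, and the fragments quote their attribute values, as XML requires.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true when the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors, so they surface as exceptions here.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the text
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<root><class>XML</class></root>"));        // true
        System.out.println(isWellFormed("<root><class id=\"XML\"></root>"));        // false: <class> is never closed
        System.out.println(isWellFormed("<root>XML<class id=\"XML\"/>XML</root>")); // true
        System.out.println(isWellFormed("<!-- note --><root/>"));                   // true: comment before the root element
    }
}
```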
Unit 5
1. Which DTD entry correctly depicts a phone number, with optional area code?
   a. <!ELEMENT phone ( (areaCode)*, prefix, body ) >
   b. <!ELEMENT phone ( areaCode?, prefix, body ) > [CORRECT]
   c. <!ELEMENT phone? ( areaCode, prefix, body ) >
   d. <!ELEMENT phone ( areaCode, (prefix, body)+ ) >
2. Which of the following is not a limitation of DTD?
   a. Non-XML Syntax.
   b. Does not allow range of values (that is, 5 to 10 elements).
   c. Does not provide proper typing of values (that is, integer versus string).
   d. Does not permit Parameter Entity references. [CORRECT]
   e. All of the above.
3. Which DTD entry correctly depicts an optional attribute named type for a pet element, that defaults to the value "dog"?
   a. <!ATTLIST pet type CDATA #IMPLIED>
   b. <!ATTLIST type dog CDATA #FIXED "dog">
   c. <!ATTLIST pet type CDATA "dog"> [CORRECT]
   d. <!ATTLIST pet (dog)? CDATA #REQUIRED>
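The correct answers to questions 1 and 3 can be tried out with a validating parser. The sketch below is one possible setup, assuming a hypothetical contact document whose internal DTD subset combines the two declarations; the class name DtdCheckpointDemo, the contact root element, and the sample phone numbers are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class DtdCheckpointDemo {
    // Hypothetical internal DTD subset built around the two checkpoint declarations.
    static final String DOCTYPE =
        "<!DOCTYPE contact ["
        + "<!ELEMENT contact (pet, phone)>"
        + "<!ELEMENT pet EMPTY>"
        + "<!ATTLIST pet type CDATA \"dog\">"
        + "<!ELEMENT phone (areaCode?, prefix, body)>"
        + "<!ELEMENT areaCode (#PCDATA)>"
        + "<!ELEMENT prefix (#PCDATA)>"
        + "<!ELEMENT body (#PCDATA)>"
        + "]>";

    // Parses with DTD validation switched on; returns null if the document is rejected.
    static Document parse(String content) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) throws SAXException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            String xml = DOCTYPE + "<contact>" + content + "</contact>";
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            return null; // invalid against the DTD, or not well formed
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the type attribute of the first pet element.
    static String petType(Document doc) {
        Element pet = (Element) doc.getElementsByTagName("pet").item(0);
        return pet.getAttribute("type");
    }

    public static void main(String[] args) {
        Document ok = parse("<pet/><phone><prefix>555</prefix><body>1234</body></phone>");
        System.out.println(ok != null);  // true: areaCode may be omitted
        System.out.println(petType(ok)); // dog: the parser supplies the declared default
        Document bad = parse("<pet/><phone><areaCode>919</areaCode><areaCode>212</areaCode>"
            + "<prefix>555</prefix><body>1234</body></phone>");
        System.out.println(bad == null); // true: areaCode? allows at most one area code
    }
}
```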
Unit 6
1. Which is true of XML namespaces?
   a. They are stored in an internet based registry.
   b. They are associated with URIs. [CORRECT]
   c. They are integrated with DTDs.
   d. They are integrated with XML Schema. [CORRECT]
2. An XML namespace prefix (select all that apply):
   a. Links to a Schema definition.
   b. Is scoped to the element where it is defined. [CORRECT]
   c. Is shorthand for a URI. [CORRECT]
   d. Can stand for more than one namespace. [CORRECT]
3. Default namespaces apply to:
   a. Elements [CORRECT]
   b. Attributes
   c. Elements and attributes
   d. Neither elements nor attributes
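Question 3 can be confirmed with a namespace-aware DOM parse. In this sketch the document, the urn:example URIs, and the class name NamespaceDemo are all hypothetical; the point is only that the default namespace reaches the unprefixed element but not the unprefixed attribute.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class NamespaceDemo {
    // Hypothetical document: a default namespace plus one prefixed namespace.
    static final String XML =
        "<order xmlns=\"urn:example:orders\" xmlns:p=\"urn:example:parts\" id=\"42\">"
        + "<p:part p:code=\"A1\"/>"
        + "</order>";

    static Element parseRoot() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace URIs to be reported
            return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)))
                .getDocumentElement();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Namespace of the unprefixed root element (taken from the default namespace).
    static String rootNamespace() {
        return parseRoot().getNamespaceURI();
    }

    // Namespace of the unprefixed id attribute (the default namespace does not apply).
    static String idAttributeNamespace() {
        return parseRoot().getAttributeNode("id").getNamespaceURI();
    }

    // Namespace of the prefixed p:code attribute.
    static String prefixedAttributeNamespace() {
        Element part = (Element) parseRoot()
            .getElementsByTagNameNS("urn:example:parts", "part").item(0);
        return part.getAttributeNode("p:code").getNamespaceURI();
    }

    public static void main(String[] args) {
        System.out.println(rootNamespace());              // urn:example:orders
        System.out.println(idAttributeNamespace());       // null
        System.out.println(prefixedAttributeNamespace()); // urn:example:parts
    }
}
```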
Unit 8
1. Which of the following are part of the XPath step syntax?
   a. Predicate [CORRECT]
   b. AxisName [CORRECT]
   c. Ancestor
   d. Ceiling
   e. NodeTest [CORRECT]
2. The Axis shorthand notation of // indicates what?
   a. Ancestor
   b. Parent
   c. Ancestor-or-self
   d. Descendant-or-self [CORRECT]
3. Which XPath statement will return the number of questions on a test?
   a. count(/test/question) [CORRECT]
   b. /test/question/count()
   c. /test[count(question)]
   d. None of the above
4. The predicate function starts-with(XML is Great, XML) will return:
   a. XML
   b. true [CORRECT]
   c. is Great
   d. false
   e. XML is Great
5. The following XPath statement will result in:
   /news/story[@year='2001']/self::node()[contains(text,'IBM')]/
   a. All 2001 news stories that contain IBM inside the text element. [CORRECT]
   b. All news stories with a year element = 2001 and a text element of IBM.
   c. Any news story with either IBM or 2001 in its text.
   d. All 2001 news stories that contain the letters IBM in any order.
   e. Error, as this is an invalid XPath statement.
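The expressions in questions 3 through 5 can be evaluated with the javax.xml.xpath API from the Java standard library. This is a sketch under stated assumptions: the two sample documents and the class name XPathCheckpointDemo are hypothetical, the string arguments of starts-with are written as quoted literals, and the question 5 expression is wrapped in count() without a trailing slash so that it yields a number.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathCheckpointDemo {
    // Hypothetical sample documents for the checkpoint expressions.
    static final String TEST_XML =
        "<test><question>Q1</question><question>Q2</question><question>Q3</question></test>";
    static final String NEWS_XML =
        "<news>"
        + "<story year=\"2001\"><text>IBM ships XML tools</text></story>"
        + "<story year=\"2001\"><text>Weather report</text></story>"
        + "<story year=\"2000\"><text>IBM news</text></story>"
        + "</news>";

    static Object eval(String expression, String xml, QName returnType) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)), returnType);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    static double number(String expression, String xml) {
        return (Double) eval(expression, xml, XPathConstants.NUMBER);
    }

    static boolean bool(String expression, String xml) {
        return (Boolean) eval(expression, xml, XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) {
        System.out.println(number("count(/test/question)", TEST_XML));              // 3.0
        System.out.println(number("count(//question)", TEST_XML));                  // 3.0
        System.out.println(bool("starts-with('XML is Great', 'XML')", TEST_XML));   // true
        System.out.println(number(
            "count(/news/story[@year='2001']/self::node()[contains(text,'IBM')])",
            NEWS_XML));                                                              // 1.0
    }
}
```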
Unit 9
1. How can XML documents be transformed?
   a. XPATH
   b. XSLT [CORRECT]
   c. Notepad
   d. Xatran
2. Is an XSL stylesheet an XML document?
   a. Yes [CORRECT]
   b. No
   c. Depends on the header
   d. Only if it is applied to an XML document
3. What template would you use for extracting a specific value from the source tree?
   a. <xsl:choose...
   b. <xsl:copy...
   c. <xsl:value-of select=... [CORRECT]
   d. <xsl:text
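As a brief illustration of questions 1 and 3, the sketch below applies a stylesheet containing xsl:value-of through the javax.xml.transform API in the Java standard library. The source document, the stylesheet, and the class name XsltValueOfDemo are assumptions made for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltValueOfDemo {
    // Hypothetical source document.
    static final String XML =
        "<course><title>Introduction to XML</title><days>3</days></course>";

    // The stylesheet is itself an XML document; xsl:value-of extracts one value.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:value-of select=\"/course/title\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSLT)); // prints Introduction to XML
    }
}
```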
Appendix A
1. How can an XML document be stored in an RDB? (Select all that apply.)
   a. In a Table column (CLOB) [CORRECT]
   b. SGML
   c. Decomposed into different columns/tables [CORRECT]
   d. Into a DTD file
   e. Compressed into an integer column
2. While RDBs are row-based, XML documents are:
   a. Record based
   b. Hierarchical [CORRECT]
   c. Obsolete
   d. Rectangular
3. I should use an RDB to store my XML if:
   a. I have lots of proprietary file formats
   b. I need to retrieve a large number of documents based on a specific element [CORRECT]
   c. I need to exchange data with a business partner
   d. I need to represent my data in Esperanto