Xmlschema Readthedocs Io en Latest
Release 1.5.0
Davide Brunato
1 Introduction
1.1 Features
1.2 Installation
1.3 License
1.4 Support
2 Usage
2.1 Create a schema instance
2.2 Validation
2.3 Data decoding and encoding
2.4 XML resources and documents
3 Other features
3.1 XSD 1.0 and 1.1 support
3.2 CLI interface
3.3 XSD validation modes
3.4 Lazy validation
3.5 XML entity-based attacks protection
3.6 Security modes on accessing resources
3.7 Processing limits
5 Schema components
5.1 Accessing schema components
5.2 Component structure
5.3 XSD types
6 Testing
6.1 Test scripts
6.2 Test cases based on files
6.3 Testing with the W3C XML Schema 1.1 test suite
6.4 Direct testing of schemas and instances
7 Extra features
7.1 Code generation with Jinja2 templates (experimental)
7.2 WSDL 1.1 documents
A Package API
A.1 Errors and exceptions
A.2 Document level API
A.3 Schema level API
A.4 Global maps API
A.5 Converters API
A.6 Data objects API
A.7 XML resources API
A.8 XPath API
A.9 Validation API
A.10 Particles API
A.11 Main XSD components
A.12 Other XSD components
A.13 Extra features API
Index
CHAPTER 1
Introduction
The xmlschema library is an implementation of XML Schema for Python (supports Python 3.6+).
This library arose from the need for a solid Python layer for processing XML Schema based files in the MaX (Materials
design at the Exascale) European project. A significant problem was the encoding and decoding of the XML data
files produced by different simulation software. Another important requirement was XML data validation, in order
to keep the produced data under control. The lack of a suitable Python alternative for schema-based decoding of
XML data led to the creation of this library. The library can also be useful for other cases related to XML Schema
based processing, not only for its original scope.
The full xmlschema documentation is available on “Read the Docs”.
1.1 Features
xmlschema Documentation, Release 1.5.0
1.2 Installation
You can install the library with pip in a Python 3.6+ environment:
$ pip install xmlschema
The library uses Python’s ElementTree XML library and requires the additional elementpath package. The base
schemas of the XSD standards are included in the package for working offline and to speed up the building of schema
instances.
1.3 License
The xmlschema library is distributed under the terms of the MIT License.
1.4 Support
This software is hosted on GitHub. Refer to the xmlschema project page for the source code and the issue tracker.
CHAPTER 2
Usage
2.1 Create a schema instance
Import the library and then create an instance of a schema using the path of the file containing the schema as argument:
The argument can also be a file-like object or a string containing the schema definition:
Strings and file-like objects might not work when the schema includes other local subschemas, because the package
cannot know anything about the schema’s source location:
Schema:
Path: /xs:schema/xs:element/xs:complexType/xs:sequence/xs:element
In these cases you can provide an appropriate base_url optional argument to define the reference directory path for
other includes and imports:
2.2 Validation
A schema instance has methods to validate an XML document against the schema.
The first method is XMLSchema.is_valid(), which returns True if the XML argument is validated by the schema
loaded in the instance, and returns False if the document is invalid.
An alternative mode for validating an XML document is implemented by the method XMLSchema.validate(),
which raises an error when the XML doesn’t conform to the schema:
raise error
xmlschema.exceptions.XMLSchemaValidationError: failed validating <Element ...
Schema:
<xs:sequence xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element maxOccurs="unbounded" minOccurs="0" name="car" type="vh:vehicleType" />
</xs:sequence>
Instance:
<ns0:cars xmlns:ns0="http://example.com/vehicles">
NOT ALLOWED CHARACTER DATA
<ns0:car make="Porsche" model="911" />
A validation function is also available at module level, useful when you need to validate a document only once or when
the information about the schema, typically the schema location and the namespace, is extracted directly from the XML
document:
>>> xmlschema.validate('tests/test_cases/examples/vehicles/vehicles.xml')
2.3 Data decoding and encoding
A schema instance can also be used to decode an XML document to a nested dictionary:
>>> import xmlschema
>>> from pprint import pprint
>>> xs = xmlschema.XMLSchema('tests/test_cases/examples/vehicles/vehicles.xsd')
>>> pprint(xs.to_dict('tests/test_cases/examples/vehicles/vehicles.xml'))
{'@xmlns:vh': 'http://example.com/vehicles',
'@xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
'@xsi:schemaLocation': 'http://example.com/vehicles vehicles.xsd',
'vh:bikes': {'vh:bike': [{'@make': 'Harley-Davidson', '@model': 'WL'},
{'@make': 'Yamaha', '@model': 'XS650'}]},
'vh:cars': {'vh:car': [{'@make': 'Porsche', '@model': '911'},
{'@make': 'Porsche', '@model': '911'}]}}
The decoded values match the datatypes declared in the XSD schema:
>>> import xmlschema
>>> from pprint import pprint
>>> xs = xmlschema.XMLSchema('tests/test_cases/examples/collection/collection.xsd')
>>> pprint(xs.to_dict('tests/test_cases/examples/collection/collection.xml'))
{'@xmlns:col': 'http://example.com/ns/collection',
'@xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
'@xsi:schemaLocation': 'http://example.com/ns/collection collection.xsd',
'object': [{'@available': True,
'@id': 'b0836217462',
'author': {'@id': 'PAR',
'born': '1841-02-25',
'dead': '1919-12-03',
'name': 'Pierre-Auguste Renoir',
'qualification': 'painter'},
'estimation': Decimal('10000.00'),
'position': 1,
'title': 'The Umbrellas',
'year': '1886'},
{'@available': True,
'@id': 'b0836217463',
'author': {'@id': 'JM',
'born': '1893-04-20',
'dead': '1983-12-25',
All the decoding and encoding methods are based on two generator methods of the XMLSchema class, namely
iter_decode() and iter_encode(), which yield both data and validation errors. See the Schema level API section for more
information.
If you need to decode only a part of the XML document you can also pass an XPath expression using the path
argument.
>>> xs = xmlschema.XMLSchema('tests/test_cases/examples/vehicles/vehicles.xsd')
>>> pprint(xs.to_dict('tests/test_cases/examples/vehicles/vehicles.xml', '/vh:vehicles/vh:bikes'))
Note: An XPath expression for the schema considers the schema as the root element with global elements as its
children.
The validation and decoding API also works with XML data loaded in ElementTree structures:
>>> import xmlschema
>>> from pprint import pprint
>>> from xml.etree import ElementTree
>>> xs = xmlschema.XMLSchema('tests/test_cases/examples/vehicles/vehicles.xsd')
>>> xt = ElementTree.parse('tests/test_cases/examples/vehicles/vehicles.xml')
>>> xs.is_valid(xt)
True
>>> pprint(xs.to_dict(xt, process_namespaces=False), depth=2)
{'@{http://www.w3.org/2001/XMLSchema-instance}schemaLocation': 'http://...',
'{http://example.com/vehicles}bikes': {'{http://example.com/vehicles}bike': [...]},
'{http://example.com/vehicles}cars': {'{http://example.com/vehicles}car': [...]}}
The standard ElementTree library lacks namespace information in trees, so you have to provide a map to convert
URIs to prefixes:
>>> namespaces = {'xsi': 'http://www.w3.org/2001/XMLSchema-instance', 'vh': 'http://example.com/vehicles'}
You can also convert XML data using the lxml library, which works better because namespace information is associated
with each node of the trees:
>>> import xmlschema
>>> from pprint import pprint
>>> import lxml.etree as ElementTree
>>> xs = xmlschema.XMLSchema('tests/test_cases/examples/vehicles/vehicles.xsd')
>>> xt = ElementTree.parse('tests/test_cases/examples/vehicles/vehicles.xml')
>>> xs.is_valid(xt)
True
>>> pprint(xs.to_dict(xt))
{'@xmlns:vh': 'http://example.com/vehicles',
'@xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
'@xsi:schemaLocation': 'http://example.com/vehicles vehicles.xsd',
'vh:bikes': {'vh:bike': [{'@make': 'Harley-Davidson', '@model': 'WL'},
{'@make': 'Yamaha', '@model': 'XS650'}]},
'vh:cars': {'vh:car': [{'@make': 'Porsche', '@model': '911'},
{'@make': 'Porsche', '@model': '911'}]}}
Starting from version 0.9.9 the package includes converter objects, in order to control the decoding process and
produce different data structures. These objects intervene at element level to compose the decoded data (attributes and
content) into a data structure.
The default converter produces a data structure similar to the format produced by previous versions of the package. You
can customize the conversion process by providing a converter instance or subclass when you create a schema instance or
when you decode an XML document. For instance you can use the BadgerFish converter for a schema instance:
You can also change the data decoding process by providing the keyword argument converter to the method call:
See the Converters for XML data section for more information about converters.
The data structure created by the decoder can be easily serialized to JSON. But if your data include Decimal values
(for the decimal XSD built-in type) you cannot convert the data to JSON:
This problem is resolved by providing an alternative JSON-compatible type for Decimal values, using the keyword
argument decimal_type:
Since version 1.0 there are two module-level functions that simplify JSON serialization and deserialization. See
xmlschema.to_json() and xmlschema.from_json() in the Document level API section.
2.4 XML resources and documents
The processing of schemas and XML instances is based on the class XMLResource, which handles the loading and the
iteration of XSD/XML data. Starting from v1.3.0 XMLResource provides an ElementTree-like XPath
API. From the same release a new class xmlschema.XmlDocument is available for representing XML resources
with a related schema:
>>> xml_document.schema
XMLSchema10(name='vehicles.xsd', namespace='http://example.com/vehicles')
This class can be used to derive specialized schema-related classes. See WSDL 1.1 documents section for an application
example.
CHAPTER 3
Other features
Schema objects and package APIs include a set of other features that have been added since a specific release. These
features are regulated by arguments, alternative classes or module parameters.
3.1 XSD 1.0 and 1.1 support
From release v1.0.14 XSD 1.1 support has been added to the library through the class XMLSchema11. You have
to use this class for XSD 1.1 schemas instead of the default class XMLSchema, which is linked to the XSD 1.0 validator
XMLSchema10.
The XSD 1.1 validator can be used also for validating XSD 1.0 schemas, except for a restricted set of cases related to
content extension in a complexType (the extension of a complex content with simple base is allowed in XSD 1.0 and
forbidden in XSD 1.1).
3.2 CLI interface
Starting from version v1.2.0 the package has a CLI interface with three console scripts:
xmlschema-validate Validate a set of XML files.
xmlschema-xml2json Decode a set of XML files to JSON.
xmlschema-json2xml Encode a set of JSON files to XML.
3.3 XSD validation modes
Since version v0.9.10 the library uses the XSD validation modes strict/lax/skip, both for schemas and for XML
instances. Each validation mode defines a specific behaviour:
strict Schemas are validated against the meta-schema. The processor stops when an error is found in a schema or
during the validation/decode of XML data.
lax Schemas are validated against the meta-schema. The processor collects the errors and continues, possibly
replacing missing parts with wildcards. Undecodable XML data are replaced with None.
skip Schemas are not validated against the meta-schema. The processor doesn’t collect any error. Undecodable XML
data are replaced with the original text.
The default mode is strict, both for schemas and for XML data. The mode is set with the validation argument,
provided when creating the schema instance or when you want to validate/decode XML data. For example you can
build a schema using strict mode and then decode XML data with the validation argument set to ‘lax’.
Note: From release v1.1.1 the iter_decode() and iter_encode() methods propagate errors also for skip validation
mode. The errors generated in skip mode are discarded by the top-level methods decode() and encode().
3.4 Lazy validation
From release v1.0.12 the document validation and the decoding API have an optional argument lazy=False, which can be
set to True for operating with a lazy XMLResource. The lazy mode can be useful for validating and decoding
big XML data files, consuming less memory.
From release v1.1.0 the lazy mode can also be set with a non-negative integer. A zero is equivalent to False, a positive
value means that lazy mode is activated and also defines the lazy depth to use for traversing the XML data tree.
Lazy mode works better with validation because validation does not need converters for shaping decoded data.
3.5 XML entity-based attacks protection
The XML data resource loading is protected using the SafeXMLParser class, a subclass of the pure Python version of
XMLParser that forbids the use of entities. The protection is applied both to XSD schemas and to XML data. The
usage of this feature is regulated by the XMLSchema’s argument defuse.
By default this argument has the value ‘remote’, which means that the protection on XML data is applied only to data
loaded from remote locations. Other values for this argument are ‘always’ and ‘never’.
3.6 Security modes on accessing resources
From release v1.2.0 the schema class includes an argument named allow for protecting the access to XML resources
identified by a URL. By default all types of URLs are allowed. Provide a different value to restrict the set of URLs
that the schema instance can access:
remote Only remote resource URLs are allowed.
local Only file paths and file-related URLs are allowed.
sandbox Allows only the file paths and URLs that are under the directory path identified by source argument or
base_url argument.
3.7 Processing limits
From release v1.0.16 a module has been added in order to group constants that define processing limits, generally to
protect against attacks prepared to exhaust system resources. These limits usually don’t need to be changed, but this
possibility has been left at the module level for situations where a different setting is needed.
Model groups of the schemas are checked against restriction violations and Unique Particle Attribution violations.
To avoid XSD model recursion attacks a depth limit of 15 levels is set. If this limit is exceeded an
XMLSchemaModelDepthError is raised, the error is caught and a warning is generated. If you need to set a
higher limit for checking all your groups you can import the library and change the value of MAX_MODEL_DEPTH in
the limits module:
A limit of 9999 on maximum depth is set for XML validation/decoding/encoding to avoid attacks based on extremely
deep XML data. To increase or decrease this limit change the value of MAX_XML_DEPTH in the module limits after
the import of the package:
CHAPTER 4
Converters for XML data
XML data decoding and encoding is handled using an intermediate converter class instance that takes charge of
composing inner data and mapping namespaces and prefixes.
Because XML is a structured format that includes data and metadata information, such as attributes and namespace
declarations, it is necessary to define conventions for naming the different data objects in a distinguishable way. For
example a widely used convention is to prefix attribute names with an ‘@’ character. With this convention the attribute
name=’John’ is decoded to ‘@name’: ‘John’, and level=’10’ is decoded to ‘@level’: 10.
A related topic is the mapping of namespaces. The expanded namespace representation is used within XML objects
of the ElementTree library. For example {http://www.w3.org/2001/XMLSchema}string is the fully qualified name of
the XSD string type, usually referred to as xs:string or xsd:string with a namespace declaration. With string serialization
of XML data the names are remapped to the prefixed format. This mapping is generally useful also if you serialize XML
data to another format like JSON, because the prefixed name is more manageable and readable than the expanded format.
The library includes some converters. The default converter XMLSchemaConverter is the base class of the other
converter types. Each derived converter type implements a well-known convention, related to the conversion from
XML to JSON data format:
• ParkerConverter: Parker convention
• BadgerFishConverter: BadgerFish convention
• AbderaConverter: Apache Abdera project convention
• JsonMLConverter: JsonML (JSON Mark-up Language) convention
A summary of these and other conventions can be found on the wiki page JSON and XML Conversion.
The base class, which does not implement any particular convention, has several options that can be used to vary the
conversion process. Some of these options are not used by other predefined converter types (e.g. force_list and force_dict)
or are used with a fixed value (e.g. text_key or attr_prefix). See Converters API for details about base class options and
attributes.
Moreover there are other converters useful for specific cases:
• UnorderedConverter: like default converter but with unordered decoding and encoding.
• ColumnarConverter: a converter that remaps attributes as child elements in a columnar shape (available
since release v1.2.0).
• DataElementConverter: a converter that converts XML to a tree of DataElement instances, Element-like
objects with decoded values and schema bindings (available since release v1.5.0).
To create a new customized converter you have to subclass the XMLSchemaConverter and redefine the two methods
element_decode and element_encode. These methods are based on the namedtuple ElementData, an Element-like
data structure that stores the decoded Element parts. This namedtuple is used by decoding and encoding methods as
an intermediate data structure.
The namedtuple ElementData has four attributes:
• tag: the element’s tag string;
• text: the element’s text, that can be a string or None for empty elements;
• content: the element’s children, can be a list or None;
• attributes: the element’s attributes, can be a dictionary or None.
The method element_decode receives as first argument an ElementData instance with decoded data. The other arguments
are the XSD element to use for decoding and the level of the XML decoding process, used to add indent spaces
for a readable string serialization. This method uses the input data element to compose the decoded data, typically a
dictionary, a list, or a value for simple type elements.
Conversely, the method element_encode receives the decoded object and decomposes it in order to build and return
an ElementData instance. This instance has to contain the parts of the element that will then be encoded and used to
build an XML Element instance.
These two methods are also responsible for mapping and unmapping object names, but don’t have to decode or encode
data, a task that is delegated to the methods of the XSD components.
Depending on the format defined by your new converter class you may provide a different value for the properties lossless
and losslessly. The lossless property has to be True if your new converter class preserves all XML data information (e.g. as
the BadgerFish convention does). Your new converter can also be losslessly if it’s lossless and the element model structure
and order are maintained (as in the JsonML convention).
Furthermore your new converter class can have a more specific __init__ method in order to avoid the usage of unused
options or to set the value of some other options. Finally, refer also to the code of the predefined derived converters to see
how you can build your own.
CHAPTER 5
Schema components
After building, a schema object contains a set of components that represent the definitions/declarations defined in the
loaded schema files. These components, sometimes referred to as the Post Schema Validation Infoset or PSVI, constitute an
augmentation of the original information contained in the schema files.
5.1 Accessing schema components
Taking collection.xsd as a sample schema to illustrate the access to components, we can iterate over the entire set of
components, global and local, using the iter_components() generator function:
>>> import xmlschema
>>> schema = xmlschema.XMLSchema('tests/test_cases/examples/collection/collection.xsd')
Another method for retrieving XSD elements and attributes of a schema is to use XPath expressions with find or findall
methods:
>>> schema.elements['person']
XsdElement(name='person', occurs=[1, 1])
>>> schema.types['personType']
XsdComplexType(name='personType')
The schema object has a dictionary attribute for each type of XSD declarations (elements, attributes and notations) and
for each type of XSD definitions (types, model groups, attribute groups, identity constraints and substitution groups).
These dictionaries are only views of common dictionaries, shared by all the loaded schemas in a structure called maps:
>>> schema.maps
XsdGlobals(validator=XMLSchema10(name='collection.xsd', ...)
5.2 Component structure
XsdElement The XSD 1.0 element class, base also of XSD 1.1 element class.
XsdAttribute The XSD 1.0 attribute class, base also of XSD 1.1 attribute class.
The full schema components are provided only by accessing the xmlschema.validators subpackage, for example:
Every component is linked to its container schema and a reference node of its XSD schema document:
A component that has a name (e.g. elements or global types) can be referenced with a different name format, so there
are some properties for getting these formats:
Those methods can be used to decode the corresponding parts of the XML document:
{'@xmlns:vh': 'http://example.com/vehicles',
'vh:bike': [{'@make': 'Harley-Davidson', '@model': 'WL'},
{'@make': 'Yamaha', '@model': 'XS650'}]}
5.3 XSD types
Every element or attribute declaration has a type attribute for accessing its XSD type:
Simple types are used on attributes and elements that contain a text value:
A simple type doesn’t have attributes but can have facets-related validators or properties:
>>> schema.attributes['step'].type.attributes
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'XsdAtomicBuiltin' object has no attribute 'attributes'
>>> schema.attributes['step'].type.validators
[<function positive_int_validator at ...>]
>>> schema.attributes['step'].type.white_space
'collapse'
>>> schema.attributes['step'].type.is_simple()
True
Complex types are used only for elements with attributes or with child elements.
For accessing the attributes an attribute group is always defined, even when the complex type has no attributes:
>>> schema.types['objType']
XsdComplexType(name='objType')
>>> schema.types['objType'].attributes
XsdAttributeGroup(['id', 'available'])
>>> schema.types['objType'].attributes['available']
XsdAttribute(name='available')
To access the content model use the attribute content. In most cases the element’s type is a complexType
with a complex content and in these cases content is a non-empty XsdGroup:
Note: The attribute content_type has been renamed to content in v1.2.1 in order to avoid confusion between the
complex type and its content. A property with the old name will be maintained until v1.3.0.
Model groups can be nested in very complex structures, so there is a generator function iter_elements() to traverse
a model group:
Sometimes a complex type can have a simple content; in these cases content is a simple type.
For attributes only empty or simple content types are possible, because they can have only a simpleType value.
The reference methods for checking the content type are respectively is_empty(), has_simple_content(),
is_element_only() and has_mixed_content().
Checking the content type can be complicated if you want to know which is the content validator without using
type checking. To make this simpler there are two properties defined for XSD types:
simple_type a simple type in case of simple content or when an empty content is based on an empty simple type,
None otherwise.
model_group a model group in case of mixed or element-only content or when an empty content is based on an empty
model group, None otherwise.
CHAPTER 6
Testing
The tests of the xmlschema library are implemented using Python’s unittest library. From version v1.1.0 the test
scripts have been moved into the directory tests/ of the source distribution. Only a small subpackage extras/testing/,
containing a specialized UnitTest subclass, a factory and builders for creating test classes for XSD and XML files, has
been left in the package’s code.
6.1 Test scripts
There are several test scripts, each one for a different target. These scripts can be run individually or through the
unittest module. For example to run XPath tests through the unittest module use the command:
OK
The same run can be launched with the command $ python tests/test_xpath.py but an additional header, containing
info about the package location, the Python version and the machine platform, is displayed before running the tests.
Under the base directory tests/ there are the test scripts for the base modules of the package. The subdirectory
tests/validators includes tests for the building of XSD validators (schemas and their components) and the subdirectory
tests/validation contains tests for the validation of XSD/XML and the decoding/encoding of XML files.
To run all tests use the command python -m unittest. Also, the script test_all.py can be launched during development
to run all the tests except memory and packaging tests. From the project source base, if you have the tox automation
tool installed, you can run all tests with all supported Python versions using the command tox.
6.2 Test cases based on files
Three scripts (test_all.py, test_schemas.py, test_validation.py) create many tests dynamically, building test classes from
a set of XSD/XML files. Only a small set of test files is published in the repository for copyright reasons. You can find
the repository test files in the tests/test_cases/ subdirectory.
You can locally extend the tests with your own set of files. To do this create a submodule or a directory outside the
repository directory and then copy your XSD/XML files into it. Create an index file called testfiles in the base
directory where you put your cases and fill it with the list of paths of the files you want to be tested, one per line, as in the
following example:
# XHTML
XHTML/xhtml11-mod.xsd
XHTML/xhtml-datatypes-1.xsd
# Quantum Espresso
qe/qes.xsd
qe/qes_neb.xsd
qe/qes_with_choice_no_nesting.xsd
qe/silicon.xml
qe/silicon-1_error.xml --errors 1
qe/silicon-3_errors.xml --errors=3
qe/SrTiO_3.xml
qe/SrTiO_3-2_errors.xml --errors 2
The test scripts create a test for each listed file, depending on the context. For example the script test_schemas.py
uses only .xsd files, whereas the script test_validation.py uses only .xml files.
If a file has errors insert an integer number after the path. This is the number of errors that the XML Schema validator
has to find to pass the test.
From version 1.0.0 each test-case line is parsed for these additional arguments:
-L URI URL Schema location hint overrides.
--version=VERSION XSD schema version to use for the test case (default is 1.0).
--errors=NUM Number of errors expected (default=0).
--warnings=NUM Number of warnings expected (default=0).
--inspect Inspect using an observed custom schema class.
--defuse=(always, remote, never) Define when to use the defused XML data loaders.
--timeout=SEC Timeout for fetching resources (default=300).
--lax-encode Use lax mode on encode checks (for cases where test data uses default or fixed values or some test data
are skipped by wildcards processContents). Ignored on schema tests.
--debug Activate the debug mode (only the cases with --debug are executed).
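To make the index line format concrete, the sketch below parses a few lines with the standard argparse module. It is an illustration only and not the parser used by the package's test scripts; the function name parse_test_line is hypothetical and only a subset of the options listed above is declared.

```python
import argparse
import shlex


def parse_test_line(line):
    """Parse one index line; return None for blank lines and comments."""
    line = line.strip()
    if not line or line.startswith('#'):
        return None

    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument('filename')
    parser.add_argument('--version', default='1.0')
    parser.add_argument('--errors', type=int, default=0)
    parser.add_argument('--warnings', type=int, default=0)
    parser.add_argument('--debug', action='store_true')
    return parser.parse_args(shlex.split(line))


index_lines = [
    '# Quantum Espresso',
    'qe/qes.xsd',
    'qe/silicon-1_error.xml --errors 1',
    'qe/silicon-3_errors.xml --errors=3',
]

parsed = [parse_test_line(line) for line in index_lines]
for args in parsed:
    if args is not None:
        print(args.filename, args.errors)
```

Both spellings used in the example index above, --errors 1 and --errors=3, are accepted by this kind of parser.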
If you put a --help on the first case line the argument parser show you all the options available.
To run the tests including your personal set of files you have to provide the path to your custom testfiles index, for example:
6.3 Testing with the W3C XML Schema 1.1 test suite
From release v1.0.11, using the script test_w3c_suite.py, you can also run tests based on the W3C XML Schema 1.1
test suite. To run these tests clone the W3C repo in the project’s parent directory and then run the script.
You can also provide additional options to select a subset of the W3C tests; run test_w3c_suite.py --help to
show the available options.
From release v1.0.12, using the script test_files.py, you can test schemas or XML instances passing them as arguments:
$ cd tests/
$ python test_files.py test_cases/examples/vehicles/*.xsd
Add test 'TestSchema001' for file 'test_cases/examples/vehicles/bikes.xsd' ...
Add test 'TestSchema002' for file 'test_cases/examples/vehicles/cars.xsd' ...
Add test 'TestSchema003' for file 'test_cases/examples/vehicles/types.xsd' ...
Add test 'TestSchema004' for file 'test_cases/examples/vehicles/vehicles-max.xsd' ...
Add test 'TestSchema005' for file 'test_cases/examples/vehicles/vehicles.xsd' ...
.....
----------------------------------------------------------------------
Ran 5 tests in 0.147s
OK
CHAPTER 7
Extra features
The subpackage xmlschema.extras acts as a container of a set of extra modules
or subpackages that can be useful for specific needs.
These modules are not imported during normal library usage and may require additional dependencies to be installed.
This choice should facilitate the implementation of other optional functionalities without having an impact on the
base configuration.
The module xmlschema.extras.codegen provides an abstract base class AbstractGenerator for generating source
code from parsed XSD schemas. The Jinja2 engine is embedded in that class and empowered with a set of custom
filters and tests for accessing the defined XSD schema components.
Within templates you can use a set of additional filters, available for all generator subclasses:
name Get the unqualified name of the object. Invalid chars for identifiers are replaced by an underscore.
qname Get the QName of the object in prefixed form. Invalid chars for identifiers are replaced by an underscore.
namespace Get the namespace URI of the XSD component.
type_name Get the unqualified name of an XSD type. By default ‘Type’ or ‘_type’ suffixes are removed. Invalid
chars for identifiers are replaced by an underscore.
type_qname Get the QName of an XSD type in prefixed form. By default ‘Type’ or ‘_type’ suffixes are removed.
Invalid chars for identifiers are replaced by an underscore.
sort_types Sort a sequence or a map of XSD types, in reverse dependency order, detecting circularities.
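The behaviour described for the type_name filter can be approximated in a few lines. This is a minimal sketch under stated assumptions, not the library's implementation: the functions local_name and type_name below are hypothetical stand-ins, and the choices of leaving a name untouched when it consists only of the suffix and of treating every non-word character as invalid are assumptions.

```python
import re


def local_name(qname):
    """Strip a leading '{namespace}' or 'prefix:' part from a name."""
    return qname.rsplit('}', 1)[-1].rsplit(':', 1)[-1]


def type_name(qname, suffixes=('Type', '_type')):
    """Return an identifier-friendly, unqualified name for an XSD type."""
    name = local_name(qname)
    for suffix in suffixes:
        # Assumption: a name made only of the suffix is kept as it is
        if name.endswith(suffix) and len(name) > len(suffix):
            name = name[:-len(suffix)]
            break
    # Replace every character that is not valid in an identifier
    return re.sub(r'\W', '_', name)
```

With these assumptions a type named vehicleType maps to vehicle, and my-car_type maps to my_car.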
Each implementation of a generator class has an additional filter for translating types using the types map of the
instance. For example a PythonGenerator has the filter python_type.
These filters are based on a common method map_type that uses an instance dictionary built at initialization time from
the class maps for builtin types and an optional initialization argument for the types defined in the schema.
When defining a generator class you can add filters and tests using the filter_method and test_method decorators.
The module xmlschema.extras.wsdl provides a specialized schema-related XML document for WSDL 1.1.
An example of specialization is the class Wsdl11Document, usable for validating and parsing WSDL 1.1 documents,
that can be imported from the wsdl module of the extras subpackage:
>>> wsdl_document.schema
XMLSchema10(name='wsdl.xsd', namespace='http://schemas.xmlsoap.org/wsdl/')
A parsed WSDL 1.1 document can aggregate a set of WSDL/XSD files for building an interrelated set of definitions in
multiple namespaces. The XMLResource base class and schema validation ensure a fully checked WSDL document
with protections against XML attacks. See the xmlschema.extras.wsdl.Wsdl11Document API for details.
Package API
exception XMLSchemaException
The base exception that lets you catch all the errors generated by the library.
exception XMLResourceError
Raised when an error is found accessing an XML resource.
exception XMLSchemaNamespaceError
Raised when a wrong runtime condition is found with a namespace.
exception XMLSchemaValidatorError(validator, message, elem=None, source=None, namespaces=None)
Base class for XSD validator errors.
Parameters
• validator (XsdValidator or function) – the XSD validator.
• message (str or unicode) – the error message.
• elem (Element) – the element that contains the error.
• source (XMLResource) – the XML resource that contains the error.
• namespaces (dict) – is an optional mapping from namespace prefix to URI.
Variables path – the XPath of the element, calculated when the element is set or the XML resource
is set.
exception XMLSchemaNotBuiltError(validator, message)
Raised on an attempt to improperly use an XSD validator that has not been built.
Parameters
• validator (XsdValidator) – the XSD validator.
• message (str or unicode) – the error message.
• cls – class to use for building the schema instance (for default XMLSchema10 is used).
• path – is an optional XPath expression that matches the elements of the XML data that
have to be decoded. If not provided the XML root element is used.
• schema_path – an XPath expression to select the XSD element to use for decoding. If
not provided the path argument or the source root tag are used.
• use_defaults – defines when to use element and attribute defaults for filling missing
required values.
• namespaces – is an optional mapping from namespace prefix to URI.
• locations – additional schema location hints, used if a schema instance has to be built.
• base_url – is an optional custom base URL for remapping relative locations; by default it
uses the directory where the XSD or alternatively the XML document is located.
• defuse – optional argument to pass for constructing schema and XMLResource instances.
• timeout – optional argument to pass for constructing schema and XMLResource instances.
• lazy – optional argument for constructing the XMLResource instance.
is_valid(xml_document, schema=None, cls=None, path=None, schema_path=None, use_defaults=True,
namespaces=None, locations=None, base_url=None, defuse=’remote’, timeout=300,
lazy=False)
Like validate() except that it does not raise an exception but returns True if the XML document is valid,
False if it’s invalid.
iter_errors(xml_document, schema=None, cls=None, path=None, schema_path=None,
use_defaults=True, namespaces=None, locations=None, base_url=None, defuse=’remote’,
timeout=300, lazy=False)
Creates an iterator for the errors generated by the validation of an XML document. Takes the same arguments
as the function validate().
to_dict(xml_document, schema=None, cls=None, path=None, process_namespaces=True, locations=None,
base_url=None, defuse='remote', timeout=300, lazy=False, **kwargs)
Decodes an XML document to a nested Python dictionary. The decoding is based on an XML
Schema class instance. By default the document is validated during the decoding phase. Raises an
XMLSchemaValidationError if the XML document is not valid against the schema.
Parameters
• xml_document – can be an XMLResource instance, a file-like object, a path to a file,
a URI of a resource, an Element instance, an ElementTree instance or a string
containing the XML data. If the passed argument is not an XMLResource instance a new
one is built using this and the defuse, timeout and lazy arguments.
• schema – can be a schema instance or a file-like object or a file path or a URL of a resource
or a string containing the schema.
• cls – class to use for building the schema instance (for default uses XMLSchema10).
• path – is an optional XPath expression that matches the elements of the XML data that
have to be decoded. If not provided the XML root element is used.
• process_namespaces – indicates whether to use namespace information in the decod-
ing process.
• locations – additional schema location hints, in case a schema instance has to be built.
• base_url – is an optional custom base URL for remapping relative locations, for default
uses the directory where the XSD or alternatively the XML document is located.
• defuse – optional argument to pass for constructing schema and XMLResource instances.
• timeout – optional argument to pass for constructing schema and XMLResource instances.
• lazy – optional argument for constructing the XMLResource instance.
• kwargs – other optional arguments of XMLSchema.iter_decode() as keyword arguments.
Returns an object containing the decoded data. If validation='lax' keyword argument is
provided the validation errors are collected and returned coupled in a tuple with the decoded
data.
Raises XMLSchemaValidationError if the object is not decodable by the XSD component,
or also if it’s invalid when validation='strict' is provided.
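The document-level functions described above can be combined as in the following sketch. The helper name check_and_decode and its arguments are hypothetical. The xmlschema package is imported inside the function only so that the helper can be defined where the library is not installed. The calls rely on the signatures documented in this section, in particular on the tuple returned by to_dict() when validation='lax' is passed.

```python
def check_and_decode(xml_path, xsd_path):
    """Decode xml_path using the schema xsd_path.

    Returns a couple (data, errors): errors is an empty list when the
    document is valid, otherwise it contains the error messages collected
    by a lax decoding.
    """
    import xmlschema  # third-party dependency, imported lazily

    if xmlschema.is_valid(xml_path, schema=xsd_path):
        # A valid document can be decoded in the default strict mode
        return xmlschema.to_dict(xml_path, schema=xsd_path), []

    # validation='lax' collects the validation errors instead of raising
    data, errors = xmlschema.to_dict(xml_path, schema=xsd_path, validation='lax')
    return data, [str(error) for error in errors]
```

When only the errors are of interest, iterating over iter_errors() with the same arguments avoids building the decoded data.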
to_json(xml_document, fp=None, schema=None, cls=None, path=None, converter=None,
process_namespaces=True, locations=None, base_url=None, defuse='remote', timeout=300,
lazy=False, json_options=None, **kwargs)
Serializes an XML document to JSON. By default the XML data is validated during the decoding phase. Raises
an XMLSchemaValidationError if the XML document is not valid against the schema.
Parameters
• xml_document – can be an XMLResource instance, a file-like object, a path to a file,
a URI of a resource, an Element instance, an ElementTree instance or a string
containing the XML data. If the passed argument is not an XMLResource instance a new
one is built using this and the defuse, timeout and lazy arguments.
• fp – can be a write() supporting file-like object.
• schema – can be a schema instance or a file-like object or a file path or an URL of a
resource or a string containing the schema.
• cls – schema class to use for building the instance (for default uses XMLSchema10).
• path – is an optional XPath expression that matches the elements of the XML data that
have to be decoded. If not provided the XML root element is used.
• converter – an XMLSchemaConverter subclass or instance to use for the decoding.
• process_namespaces – indicates whether to use namespace information in the decod-
ing process.
• locations – additional schema location hints, in case the schema instance has to be built.
• base_url – is an optional custom base URL for remapping relative locations, for default
uses the directory where the XSD or alternatively the XML document is located.
• defuse – optional argument to pass for constructing schema and XMLResource instances.
• timeout – optional argument to pass for constructing schema and XMLResource instances.
• lazy – optional argument for constructing the XMLResource instance.
• json_options – a dictionary with options for the JSON serializer.
• kwargs – optional arguments of XMLSchema.iter_decode() as keyword arguments
to vary the decoding process.
Returns a string containing the JSON data if fp is None, otherwise doesn’t return anything. If the
validation='lax' keyword argument is provided the validation errors are collected and
returned, possibly coupled in a tuple with the JSON data.
Raises XMLSchemaValidationError if the object is not decodable by the XSD component,
or also if it’s invalid when validation='strict' is provided.
version
The schema’s version attribute, defaults to None.
schema_location
A list of location hints extracted from the xsi:schemaLocation attribute of the schema.
no_namespace_schema_location
A location hint extracted from the xsi:noNamespaceSchemaLocation attribute of the schema.
target_prefix
The prefix associated to the targetNamespace.
default_namespace
The namespace associated to the empty prefix ‘’.
base_url
The base URL of the source of the schema.
root_elements
The list of global elements that are not used by reference in any model of the schema. This is implemented
as lazy property because it’s computationally expensive to build when the schema model is complex.
simple_types
Returns a list containing the global simple types.
complex_types
Returns a list containing the global complex types.
classmethod builtin_types()
Returns the XSD built-in types of the meta-schema.
classmethod create_meta_schema(source=None, base_schemas=None, global_maps=None)
Creates a new meta-schema instance.
Parameters
• source – an optional argument referencing or containing the XSD meta-schema resource.
Required if the schema class doesn’t already have a meta-schema.
• base_schemas – an optional dictionary that contains namespace URIs and schema locations.
If provided it’s used as a substitute for the class’s BASE_SCHEMAS. Also a sequence
of (namespace, location) items can be provided if there are more schema documents for
one or more namespaces.
• global_maps – is an optional argument containing an XsdGlobals instance for the
new meta schema. If not provided a new map is created.
create_any_content_group(parent, any_element=None)
Creates a model group related to schema instance that accepts any content.
Parameters
• parent – the parent component to set for the any content group.
• any_element – an optional any element to use for the content group. When provided
it’s copied, linked to the group and the minOccurs/maxOccurs are set to 0 and
‘unbounded’.
create_any_attribute_group(parent)
Creates an attribute group related to schema instance that accepts any attribute.
Parameters parent – the parent component to set for the any attribute group.
create_any_type()
Creates an xs:anyType equivalent type related with the wildcards connected to global maps of the schema
instance in order to do a correct namespace lookup during wildcards validation.
get_locations(namespace)
Get a list of location hints for a namespace.
include_schema(location, base_url=None)
Includes a schema for the same namespace, from a specific URL.
Parameters
• location – is the URL of the schema.
• base_url – is an optional base URL for fetching the schema resource.
Returns the included XMLSchema instance.
import_schema(namespace, location, base_url=None, force=False, build=False)
Imports a schema for an external namespace, from a specific URL.
Parameters
• namespace – is the URI of the external namespace.
• location – is the URL of the schema.
• base_url – is an optional base URL for fetching the schema resource.
• force – if set to True imports the schema also if the namespace is already imported.
• build – defines when to build the imported schema, the default is to not build.
Returns the imported XMLSchema instance.
export(target, only_relative=True)
Exports a schema instance. The schema instance is exported to a directory together with the hierarchy of
imported/included schemas.
Parameters
• target – a path to a local empty directory.
• only_relative – by default only loaded schemas referred by a relative location are
saved. If False is provided all the loaded schemas are saved.
resolve_qname(qname, namespace_imported=True)
QName resolution for a schema instance.
Parameters
• qname – a string in xs:QName format.
• namespace_imported – if this argument is True raises an XMLSchemaNamespaceError
if the namespace of the QName is not the targetNamespace and the namespace is not
imported by the schema.
Returns an expanded QName in the format “{namespace-URI}local-name”.
Raises XMLSchemaValueError if an invalid xs:QName is found, XMLSchemaKeyError if the
namespace prefix is not declared in the schema instance.
iter_globals(schema=None)
Creates an iterator for XSD global definitions/declarations related to schema namespace.
Parameters schema – Optional argument for filtering only globals related to a schema instance.
iter_components(xsd_classes=None)
Iterates yielding the schema and its components. By default includes all the relevant components of the
schema, excluding only facets and empty attribute groups. The first returned component is the schema
itself.
Parameters xsd_classes – provide a class or a tuple of classes to restrict the range of component
types yielded.
classmethod check_schema(schema, namespaces=None)
Validates the given schema against the XSD meta-schema (meta_schema).
Parameters
• schema – the schema instance that has to be validated.
• namespaces – is an optional mapping from namespace prefix to URI.
Raises XMLSchemaValidationError if the schema is invalid.
build()
Builds the schema’s XSD global maps.
clear()
Clears the schema’s XSD global maps.
built
Property that is True if XSD validator has been fully parsed and built, False otherwise. For schemas
the property is checked on all global components. For XSD components check only the building of local
subcomponents.
validation_attempted
Property that returns the validation status of the XSD validator. It can be ‘full’, ‘partial’ or ‘none’.
https://www.w3.org/TR/xmlschema-1/#e-validation_attempted
https://www.w3.org/TR/2012/REC-xmlschema11-1-20120405/#e-validation_attempted
validity
Property that returns the XSD validator’s validity. It can be ‘valid’, ‘invalid’ or ‘notKnown’.
https://www.w3.org/TR/xmlschema-1/#e-validity
https://www.w3.org/TR/2012/REC-xmlschema11-1-20120405/#e-validity
all_errors
A list with all the building errors of the XSD validator and its components.
get_converter(converter=None, **kwargs)
Returns a new converter instance.
Parameters
• converter – can be a converter class or instance. If it’s an instance the new instance is
copied from it and configured with the provided arguments.
• kwargs – optional arguments for initializing the converter instance.
Returns a converter instance.
True xs:hexBinary and xs:base64Binary types are decoded, otherwise their origin XML string is returned.
• converter – an XMLSchemaConverter subclass or instance to use for decoding.
• filler – an optional callback function to fill undecodable data with a typed value. The callback function
must accept one positional argument, that can be an XSD Element or an attribute declaration. If not
provided undecodable data is replaced by None.
• fill_missing – if set to True the decoder fills also missing attributes. The filling value is None or a
typed value if the filler callback is provided.
• keep_unknown – if set to True unknown tags are kept and are decoded with xs:anyType. By default
unknown tags not decoded by a wildcard are discarded.
• max_depth – maximum level of decoding, by default there is no limit.
• depth_filler – an optional callback function to replace data over the max_depth level. The callback
function must accept one positional argument, that can be an XSD Element. If not provided deeper data
are replaced with None values.
• kwargs – keyword arguments with other options for converter and decoder.
Returns yields a decoded data object, possibly preceded by a sequence of validation or decoding
errors.
encode(obj, path=None, validation=’strict’, *args, **kwargs)
Encodes to XML data. Takes the same arguments as the method XMLSchema.iter_encode().
Returns An ElementTree’s Element or a list containing a sequence of ElementTree’s elements
if the argument path matches multiple XML data chunks. If the validation argument is ‘lax’ a
2-item tuple is returned, where the first item is the encoded object and the second item is a
list containing the errors.
iter_encode(obj, path=None, validation=’lax’, namespaces=None, use_defaults=True, con-
verter=None, unordered=False, **kwargs)
Creates an iterator for encoding a data structure to an ElementTree’s Element.
Parameters
• obj – the data that has to be encoded to XML data.
• path – is an optional XPath expression for selecting the element of the schema that
matches the data that has to be encoded. For default the first global element of the schema
is used.
• validation – the XSD validation mode. Can be ‘strict’, ‘lax’ or ‘skip’.
• namespaces – is an optional mapping from namespace prefix to URI.
• use_defaults – whether to use default values for filling missing data.
• converter – an XMLSchemaConverter subclass or instance to use for the encoding.
• unordered – a flag for explicitly activating unordered encoding mode for content model
data. This mode uses content models for a reordered-by-model iteration of the child ele-
ments.
• kwargs – Keyword arguments containing options for converter and encoding.
Returns yields an Element instance/s or validation/encoding errors.
• validation – the XSD validation mode to use, can be ‘strict’, ‘lax’ or ‘skip’.
build()
Build the maps of XSD global definitions/declarations. The global maps are updated adding and building
the globals of not built registered schemas.
check(schemas=None, validation=’strict’)
Checks the global maps. For default checks all schemas and raises an exception at first error.
Parameters
• schemas – optional argument with the set of the schemas to check.
• validation – overrides the default validation mode of the validator.
Raises XMLSchemaParseError
clear(remove_schemas=False, only_unbuilt=False)
Clears the instance maps and schemas.
Parameters
• remove_schemas – removes also the schema instances.
• only_unbuilt – removes only not built objects/schemas.
copy(validator=None, validation=None)
Makes a copy of the object.
iter_globals()
Creates an iterator for XSD global definitions/declarations.
iter_schemas()
Creates an iterator for the schemas registered in the instance.
lookup(tag, qname)
General lookup method for XSD global components.
Parameters
• tag – the expanded QName of the XSD global declaration/definition (e.g.
‘{http://www.w3.org/2001/XMLSchema}element’), that is used to select the global map for
lookup.
• qname – the expanded QName of the component to be looked-up.
Returns an XSD global component.
Raises an XMLSchemaValueError if the tag argument is not appropriate for a global component,
an XMLSchemaKeyError if the qname argument is not found in the global map.
register(schema)
Registers an XMLSchema instance.
unbuilt
Property that returns a list with unbuilt components.
The base class XMLSchemaConverter is used for defining generic converters. The subclasses implement some of the
most used conventions for converting XML to JSON data.
• level – the level related to the encoding process (0 means the root).
Returns an ElementData instance.
map_qname(qname)
Converts an extended QName to the prefixed format. Only registered namespaces are mapped.
Parameters qname – a QName in extended format or a local name.
Returns a QName in prefixed format or a local name.
unmap_qname(qname, name_table=None)
Converts a QName in prefixed format or a local name to the extended QName format. Local names are
converted only if a default namespace is included in the instance. If a name_table is provided a local name
is mapped to the default namespace only if not found in the name table.
Parameters
• qname – a QName in prefixed format or a local name
• name_table – an optional lookup table for checking local names.
Returns a QName in extended format or a local name.
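The two conversions can be pictured with plain string handling. The functions below are a minimal sketch that takes the namespace map as an explicit argument instead of reading it from a converter instance. They follow the rules stated above but are not the library's code.

```python
def map_qname(qname, namespaces):
    """Convert '{uri}local' to 'prefix:local' for registered namespaces."""
    if not qname.startswith('{'):
        return qname  # already a local name
    uri, local = qname[1:].split('}', 1)
    for prefix, ns_uri in namespaces.items():
        if ns_uri == uri:
            # The default namespace maps to an unprefixed name
            return f'{prefix}:{local}' if prefix else local
    return qname  # namespace not registered: left unchanged


def unmap_qname(qname, namespaces, name_table=None):
    """Convert 'prefix:local' or a local name to the '{uri}local' format."""
    if qname.startswith('{'):
        return qname  # already in extended format
    prefix, sep, local = qname.partition(':')
    if sep:
        uri = namespaces.get(prefix)
        return f'{{{uri}}}{local}' if uri else qname
    default = namespaces.get('')
    # A local name is mapped only if a default namespace exists and the
    # name is not found in the optional name table
    if not default or (name_table is not None and qname in name_table):
        return qname
    return f'{{{default}}}{qname}'
```

For example, with the map {'tns': 'http://example.com/ns'} the name '{http://example.com/ns}item' is mapped to 'tns:item' and back again.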
class UnorderedConverter(namespaces=None, dict_class=None, list_class=None,
etree_element_class=None, text_key='$', attr_prefix='@',
cdata_prefix=None, indent=4, strip_namespaces=False,
preserve_root=False, force_dict=False, force_list=False, **kwargs)
Same as XMLSchemaConverter but element_encode() returns a dictionary for the content of the el-
ement, that can be used directly for unordered encoding mode. In this mode the order of the elements in the
encoded output is based on the model visitor pattern rather than the order in which the elements were added
to the input dictionary. As the order of the input dictionary is not preserved, character data between sibling
elements are interleaved between tags.
class ParkerConverter(namespaces=None, dict_class=None, list_class=None, preserve_root=False,
**kwargs)
XML Schema based converter class for Parker convention.
ref: http://wiki.open311.org/JSON_and_XML_Conversion/#the-parker-convention
ref: https://developer.mozilla.org/en-US/docs/Archive/JXON#The_Parker_Convention
Parameters
• namespaces – Map from namespace prefixes to URI.
• dict_class – Dictionary class to use for decoded data. Default is dict.
• list_class – List class to use for decoded data. Default is list.
• preserve_root – If True the root element will be preserved. By default the Parker
convention removes the document root element, returning only the value.
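To make the Parker convention concrete, here is a schema-less sketch built on the standard library ElementTree. It is not the ParkerConverter class: values stay strings because no XSD types are available, and the function names parker_value and to_parker are hypothetical.

```python
import xml.etree.ElementTree as ET


def parker_value(elem):
    """Return the Parker-style value of an element (attributes are dropped)."""
    children = list(elem)
    if not children:
        return elem.text  # a leaf is reduced to its text value
    result = {}
    for child in children:
        value = parker_value(child)
        if child.tag not in result:
            result[child.tag] = value
        elif isinstance(result[child.tag], list):
            result[child.tag].append(value)
        else:
            # A repeated tag turns the entry into a list of values
            result[child.tag] = [result[child.tag], value]
    return result


def to_parker(root, preserve_root=False):
    """Convert a tree, dropping the root element unless preserve_root is True."""
    value = parker_value(root)
    return {root.tag: value} if preserve_root else value
```

The preserve_root flag mirrors the parameter described above: without it only the value of the document root is returned.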
class BadgerFishConverter(namespaces=None, dict_class=None, list_class=None, **kwargs)
XML Schema based converter class for Badgerfish convention.
ref: http://www.sklar.com/badgerfish/
ref: http://badgerfish.ning.com/
Parameters
• namespaces – Map from namespace prefixes to URI.
• dict_class – Dictionary class to use for decoded data. Default is dict.
• list_class – List class to use for decoded data. Default is list.
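The BadgerFish convention keeps what the Parker convention drops: text goes under the ‘$’ key and attributes under keys prefixed with ‘@’. The following sketch applies these rules to a standard library ElementTree, ignoring namespaces and XSD typing. It is only an illustration of the convention, not the BadgerFishConverter class, and its function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def badgerfish_value(elem):
    """Return the BadgerFish-style dictionary for the content of an element."""
    data = {'@' + name: value for name, value in elem.attrib.items()}
    if elem.text and elem.text.strip():
        data['$'] = elem.text.strip()
    for child in elem:
        value = badgerfish_value(child)
        if child.tag not in data:
            data[child.tag] = value
        elif isinstance(data[child.tag], list):
            data[child.tag].append(value)
        else:
            # Repeated sibling elements are collected in a list
            data[child.tag] = [data[child.tag], value]
    return data


def to_badgerfish(root):
    """Convert a tree: the root element is always preserved as the only key."""
    return {root.tag: badgerfish_value(root)}
```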
Parameters
• source – a string containing the XML document or file path or an URL or a file like object
or an ElementTree or an Element.
• base_url – is an optional base URL, used for the normalization of relative paths when
the URL of the resource can’t be obtained from the source argument. For security, access to
a local file resource is always denied if the base_url is a remote URL.
• allow – defines the security mode for accessing resource locations. Can be ‘all’, ‘remote’,
‘local’ or ‘sandbox’. Default is ‘all’ that means all types of URLs are allowed. With ‘remote’
only remote resource URLs are allowed. With ‘local’ only file paths and URLs are allowed.
With ‘sandbox’ only file paths and URLs that are under the directory path identified by the
base_url argument are allowed.
• defuse – defines when to defuse XML data using a SafeXMLParser. Can be ‘always’,
‘remote’ or ‘never’. By default defuses only remote XML data.
• timeout – the timeout in seconds for the connection attempt in case of remote data.
• lazy – if a value False or 0 is provided the XML data is fully loaded into and processed
from memory. For default only the root element of the source is loaded, except in case the
source argument is an Element or an ElementTree instance. A positive integer also defines
the depth at which the lazy resource can be better iterated (True means 1).
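The four values of the allow argument can be read as a simple decision table. The function below is a hypothetical sketch of that table, not the check performed by XMLResource: it assumes POSIX-style paths, treats any URL without a scheme or with the file scheme as local, and does not resolve relative paths against the current directory.

```python
import posixpath
from urllib.parse import urlsplit


def is_allowed(url, allow='all', base_url=None):
    """Tell if a location is acceptable for the given security mode."""
    parts = urlsplit(url)
    is_local = parts.scheme in ('', 'file')
    if allow == 'all':
        return True
    if allow == 'remote':
        return not is_local
    if allow == 'local':
        return is_local
    if allow == 'sandbox':
        if not is_local or base_url is None:
            return False
        # Normalize both paths so that '..' segments cannot escape the sandbox
        base_dir = posixpath.normpath(urlsplit(base_url).path)
        path = posixpath.normpath(parts.path)
        return path == base_dir or path.startswith(base_dir.rstrip('/') + '/')
    raise ValueError(f"unknown security mode {allow!r}")
```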
root
The XML tree root Element.
text
The XML text source, None if it’s not available.
url
The source URL, None if the instance is created from an Element tree or from a string.
base_url
The effective base URL used for completing relative locations.
namespace
The namespace of the XML resource.
parse(source, lazy=False)
tostring(indent='', max_lines=None, spaces_for_tab=4, xml_declaration=False)
Generates a string representation of the XML resource.
open()
Returns an opened resource reader object for the instance URL. If the source attribute is a seekable file-like
object, rewinds the source and returns it.
load()
Loads the XML text from the data source. If the data source is an Element the source XML text can’t be
retrieved.
is_lazy()
Returns True if the XML resource is lazy.
lazy_depth
The optimal depth for validating this resource. It is a positive integer for lazy resources and 0 for fully loaded
XML trees.
is_remote()
Returns True if the resource is related with remote XML data.
is_local()
Returns True if the resource is related with local XML data.
is_loaded()
Returns True if the XML text of the data source is loaded.
iter(tag=None, nsmap=None)
XML resource tree iterator. The iteration of a lazy resource is in reverse order (top level element is the
last). If tag is not None or ‘*’, only elements whose tag equals tag are returned from the iterator. Provide
a nsmap list for tracking the namespaces of yielded elements. If nsmap is a dictionary the tracking of
namespaces is cumulative on the whole tree, renaming prefixes in case of conflicts.
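The reverse order of a lazy iteration is what an incremental parser naturally produces, since an element is complete only after all its children. The sketch below shows the effect with the standard library iterparse(), assuming that lazy resources are processed in a comparable incremental way. The function names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def lazy_order(xml_text):
    """Tags in the order an incremental parser completes the elements."""
    source = io.BytesIO(xml_text.encode('utf-8'))
    # An 'end' event is emitted when an element is fully parsed,
    # so children come before their parent and the root is the last
    return [elem.tag for _, elem in ET.iterparse(source, events=('end',))]


def loaded_order(xml_text):
    """Tags in document order, as yielded by a fully loaded tree."""
    return [elem.tag for elem in ET.fromstring(xml_text).iter()]
```

For the document `<root><a><b/></a><c/></root>` the first function yields b, a, c, root, while the second yields root, a, b, c.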
iter_depth(mode=1, nsmap=None, ancestors=None)
Iterates XML subtrees. For fully loaded resources yields the root element. On lazy resources the argument
mode can change the sequence and the completeness of yielded elements. There are four possible modes,
that generate different sequences of elements:
1. Only the elements at depth_level level of the tree
2. Only a root element pruned at depth_level
3. The elements at depth_level and then a pruned root
4. An incomplete root at start, the elements at depth_level and a pruned root
Parameters
• mode – an integer in range [1..4] that defines the iteration mode.
• nsmap – provide a list/dict for tracking the namespaces of yielded elements. If a list is
passed the tracking is done at element level, otherwise the tracking is on the whole tree,
renaming prefixes in case of conflicts.
• ancestors – provide a list for tracking the ancestors of yielded elements.
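The first mode, which yields only the elements at a given depth, can be sketched with iterparse() by tracking the nesting level. This is an illustration of the idea under the assumption that the root is at depth 0. The generator name iter_at_depth is hypothetical and the other three modes are not covered.

```python
import io
import xml.etree.ElementTree as ET


def iter_at_depth(source, depth_level=1):
    """Yield the elements found at depth_level, freeing them after use."""
    level = 0
    for event, elem in ET.iterparse(source, events=('start', 'end')):
        if event == 'start':
            level += 1
            continue
        level -= 1  # after the decrement, level is the depth of elem
        if level == depth_level:
            yield elem
            # Executed when the consumer asks for the next element:
            # the processed subtree is discarded to keep memory usage low
            elem.clear()
```

Because each yielded element is cleared afterwards, the consumer has to extract what it needs from an element before requesting the next one.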
prefix. The empty prefix ‘’ is used only if it’s declared at root level to avoid erroneous mapping of local
names. In other cases uses ‘default’ prefix as substitute.
Parameters
• namespaces – builds the namespace map starting over the dictionary provided.
• root_only – if True, or None and the resource is lazy, extracts only the namespaces
declared in the root element.
Returns a dictionary for mapping namespace prefixes to full URI.
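A namespace map of this kind can be collected without loading the whole tree by listening to the namespace declaration events of iterparse(). The sketch below is a simplified stand-in with a hypothetical name: the first declaration of a prefix wins, and neither the substitution with the ‘default’ prefix nor the root_only option described above is implemented.

```python
import io
import xml.etree.ElementTree as ET


def collect_namespaces(source, namespaces=None):
    """Build a prefix-to-URI map from the xmlns declarations of a document."""
    nsmap = dict(namespaces or {})
    # With the 'start-ns' event the parser yields (prefix, uri) couples
    for _, (prefix, uri) in ET.iterparse(source, events=('start-ns',)):
        nsmap.setdefault(prefix, uri)
    return nsmap
```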
get_locations(locations=None, root_only=None)
Extracts a list of schema location hints from the XML resource. The locations are normalized using the
base URL of the instance.
Parameters
• locations – a sequence of schema location hints inserted before the ones extracted
from the XML resource. Locations passed within a tuple container are not normalized.
• root_only – if True, or if None and the resource is lazy, extracts the location hints of
the root element only.
Returns a list of couples containing normalized location hints.
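Location hints live in the xsi:schemaLocation attribute as a whitespace-separated sequence of namespace and location couples, and in xsi:noNamespaceSchemaLocation as a single location. The following sketch extracts and normalizes them with the standard library. The function name is hypothetical and the example URLs are placeholders.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XSI_NAMESPACE = 'http://www.w3.org/2001/XMLSchema-instance'


def get_location_hints(root, base_url):
    """Return a list of (namespace, location) couples, normalized on base_url."""
    hints = []
    for elem in root.iter():
        values = elem.get(f'{{{XSI_NAMESPACE}}}schemaLocation', '').split()
        # The attribute value alternates namespace URIs and locations
        for namespace, location in zip(values[0::2], values[1::2]):
            hints.append((namespace, urljoin(base_url, location)))
        location = elem.get(f'{{{XSI_NAMESPACE}}}noNamespaceSchemaLocation')
        if location:
            hints.append(('', urljoin(base_url, location)))
    return hints
```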
class XmlDocument(source, schema=None, cls=None, validation='strict', namespaces=None,
locations=None, base_url=None, allow='all', defuse='remote', timeout=300,
lazy=False)
An XML document bound with its schema. If no schema is obtained from the provided context and the validation
argument is ‘skip’ the XML document is associated with a generic schema, otherwise a ValueError is raised.
Parameters
• source – a string containing XML data or a file path or an URL or a file like object or an
ElementTree or an Element.
• schema – can be a xmlschema.XMLSchema instance or a file-like object or a file path
or an URL of a resource or a string containing the XSD schema.
• cls – class to use for building the schema instance (for default XMLSchema10 is used).
• validation – the XSD validation mode to use for validating the XML document, that
can be ‘strict’ (default), ‘lax’ or ‘skip’.
• namespaces – is an optional mapping from namespace prefix to URI.
• locations – resource location hints, that can be a dictionary or a sequence of couples
(namespace URI, resource URL).
• base_url – the base URL for base xmlschema.XMLResource initialization.
• allow – the security mode for base xmlschema.XMLResource initialization.
• defuse – the defuse mode for base xmlschema.XMLResource initialization.
• timeout – the timeout for base xmlschema.XMLResource initialization.
• lazy – the lazy mode for base xmlschema.XMLResource initialization.
class ElementPathMixin
Mixin abstract class for enabling ElementTree and XPath API on XSD components.
Variables
• text – the Element text, for compatibility with the ElementTree API.
• tail – the Element tail, for compatibility with the ElementTree API.
tag
Alias of the name attribute. For compatibility with the ElementTree API.
attrib
Returns the Element attributes. For compatibility with the ElementTree API.
get(key, default=None)
Gets an Element attribute. For compatibility with the ElementTree API.
iter(tag=None)
Creates an iterator for the XSD element and its subelements. If tag is not None or ‘*’, only XSD elements
whose name matches tag are returned from the iterator. Local elements are expanded without repetitions.
Element references are not expanded because the global elements are not descendants of other elements.
iterchildren(tag=None)
Creates an iterator for the child elements of the XSD component. If tag is not None or ‘*’, only XSD
elements whose name matches tag are returned from the iterator.
find(path, namespaces=None)
Finds the first XSD subelement matching the path.
Parameters
• path – an XPath expression that considers the XSD component as the root element.
• namespaces – an optional mapping from namespace prefix to namespace URI.
Returns the first matching XSD subelement or None if there is no match.
findall(path, namespaces=None)
Finds all XSD subelements matching the path.
Parameters
• path – an XPath expression that considers the XSD component as the root element.
• namespaces – an optional mapping from namespace prefix to namespace URI.
Returns a list containing all matching XSD subelements in document order, an empty list is
returned if there is no match.
iterfind(path, namespaces=None)
Creates an iterator for all XSD subelements matching the path.
Parameters
• path – an XPath expression that considers the XSD component as the root element.
• namespaces – an optional mapping from namespace prefix to namespace URI.
Returns an iterable yielding all matching XSD subelements in document order.
Implemented for XSD schemas, elements, attributes, types, attribute groups and model groups.
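The mixin mirrors the ElementTree search API, so the calling conventions can be illustrated with the standard library alone. The following is a minimal sketch on a plain xml.etree.ElementTree element, not on an XSD component; the sample document and its namespace URI are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Sample data invented for this sketch.
XML = """
<vh:vehicles xmlns:vh="http://example.com/vehicles">
  <vh:cars>
    <vh:car make="Porsche" model="911"/>
    <vh:car make="Porsche" model="Cayman"/>
  </vh:cars>
  <vh:bikes>
    <vh:bike make="Harley-Davidson" model="WL"/>
  </vh:bikes>
</vh:vehicles>
"""

# Mapping from namespace prefix to namespace URI, as in the 'namespaces' argument.
namespaces = {"vh": "http://example.com/vehicles"}
root = ET.fromstring(XML)

# find() returns the first match, or None if there is no match.
first_car = root.find("vh:cars/vh:car", namespaces)
missing = root.find("vh:trucks", namespaces)

# findall() returns a list in document order (empty if nothing matches).
all_cars = root.findall("vh:cars/vh:car", namespaces)

# iterfind() yields the matches lazily instead of building a list.
bike_models = [e.get("model") for e in root.iterfind(".//vh:bike", namespaces)]

# Iterating an element visits its direct children; tags are in extended format.
child_tags = [child.tag for child in root]
```

On an XSD component the same calls would return XSD subelements rather than ElementTree elements.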
class ValidationMixin
Mixin for implementing XML data validators/decoders. A derived class must implement the methods
iter_decode and iter_encode.
is_valid(source, use_defaults=True, namespaces=None)
Like validate(), except that it does not raise an exception but returns True if the XML document is valid,
False if it’s invalid.
Parameters
• source – the source of XML data. For a schema it can be a path to a file, a URI of a
resource, an open file-like object, an ElementTree instance or a string containing
XML data. For other XSD components it can be a string for attribute or simple type
validators, or an ElementTree Element otherwise.
• use_defaults – indicates whether to use default values for filling missing data.
• namespaces – is an optional mapping from namespace prefix to URI.
validate(source, use_defaults=True, namespaces=None)
Validates XML data against the XSD schema/component instance.
Parameters
• source – the source of XML data. For a schema it can be a path to a file, a URI of a
resource, an open file-like object, an ElementTree instance or a string containing
XML data. For other XSD components it can be a string for attribute or simple type
validators, or an ElementTree Element otherwise.
• use_defaults – indicates whether to use default values for filling missing data.
• namespaces – is an optional mapping from namespace prefix to URI.
Raises XMLSchemaValidationError if the XML data instance is not valid.
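The relationship between validate() and is_valid() can be sketched with a toy validator. This is only an illustration of the raise-versus-boolean contract described above; ToyIntValidator and ToyValidationError are hypothetical names and do not reflect the library's implementation.

```python
class ToyValidationError(ValueError):
    """Hypothetical stand-in for XMLSchemaValidationError."""


class ToyIntValidator:
    """Hypothetical validator accepting strings that represent integers in a range."""

    def __init__(self, min_value, max_value):
        self.min_value = min_value
        self.max_value = max_value

    def validate(self, source):
        # Raises on the first problem found, returns None on success.
        try:
            value = int(source)
        except ValueError:
            raise ToyValidationError(f"{source!r} is not an integer") from None
        if not self.min_value <= value <= self.max_value:
            raise ToyValidationError(f"{value} is out of range")

    def is_valid(self, source):
        # Same check as validate(), but the outcome is reported as a boolean.
        try:
            self.validate(source)
        except ToyValidationError:
            return False
        return True
```

Use is_valid() for a simple yes/no answer and validate() when the reason for the failure is needed.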
decode(source, validation='strict', **kwargs)
Decodes XML data.
Parameters
• source – the XML data. Can be a string for attribute or simple type components,
a dictionary for an attribute group or an ElementTree Element for other components.
• validation – the validation mode. Can be ‘lax’, ‘strict’ or ‘skip’.
• kwargs – optional keyword arguments for the method iter_decode().
Returns a dictionary-like object if the XSD component is an element, a group or a complex
type; a list if the XSD component is an attribute group; a simple data type object otherwise.
If the validation argument is ‘lax’ a 2-item tuple is returned, where the first item is the decoded
object and the second item is a list containing the errors.
Raises XMLSchemaValidationError if the object is not decodable by the XSD component,
or also if it’s invalid when validation='strict' is provided.
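The effect of the validation mode on the return value can be sketched with a toy decoder for integers. decode_int and ToyDecodeError are hypothetical names; in particular, returning None for undecodable input in ‘lax’ and ‘skip’ modes is a choice of this sketch and not a statement about the library.

```python
class ToyDecodeError(ValueError):
    """Hypothetical stand-in for XMLSchemaValidationError."""


def decode_int(text, validation="strict"):
    """Decode text to an int, reporting problems according to the validation mode."""
    if validation not in ("strict", "lax", "skip"):
        raise ValueError(f"unknown validation mode: {validation!r}")

    errors = []
    try:
        result = int(text.strip())
    except ValueError:
        result = None  # sketch's choice: no decoded value is available
        errors.append(ToyDecodeError(f"{text!r} is not a valid integer"))

    if validation == "strict":
        # 'strict': the first error is raised, otherwise the bare object is returned.
        if errors:
            raise errors[0]
        return result
    if validation == "lax":
        # 'lax': a 2-item tuple (decoded object, list of errors) is returned.
        return result, errors
    # 'skip': errors are not reported.
    return result
```

With ‘lax’ the caller has to unpack the tuple, for example `value, errors = decode_int("12", validation="lax")`.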
iter_decode(source, validation='lax', **kwargs)
Creates an iterator for decoding an XML source to a Python object.
Parameters
• source – the XML data source.
• validation – the validation mode. Can be ‘lax’, ‘strict’ or ‘skip’.
• kwargs – keyword arguments for the decoder API.
class ParticleMixin
Variables
• min_occurs – the minOccurs property of the XSD particle. Defaults to 1.
• max_occurs – the maxOccurs property of the XSD particle. Defaults to 1, a None value
means ‘unbounded’.
is_empty()
Tests if max_occurs == 0. A zero-length model group is considered empty.
is_emptiable()
Tests if min_occurs == 0. A zero-length model group is considered emptiable. For model groups the test
outcome depends also on nested particles.
is_single()
Tests if the particle has max_occurs == 1. For elements the test outcome depends also on parent group.
For model groups the test outcome depends also on nested model groups.
is_multiple()
Tests if the particle can have multiple occurrences.
is_ambiguous()
Tests if min_occurs != max_occurs.
is_univocal()
Tests if min_occurs == max_occurs.
is_missing(occurs: int)
Tests if provided occurrences are under the minimum.
is_over(occurs: int)
Tests if provided occurrences are over the maximum.
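The occurrence tests on a single particle can be sketched as plain comparisons on min_occurs and max_occurs. ParticleSketch is a hypothetical class written for this illustration: it ignores parent and nested groups, and its treatment of the boundary in is_over() (strictly greater than the maximum) is an assumption taken from the wording above, not from the library's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParticleSketch:
    """Hypothetical flat particle; parent and nested groups are not considered."""
    min_occurs: int = 1
    max_occurs: Optional[int] = 1  # None means 'unbounded'

    def is_empty(self) -> bool:
        return self.max_occurs == 0

    def is_single(self) -> bool:
        return self.max_occurs == 1

    def is_multiple(self) -> bool:
        # Unbounded or a maximum greater than one.
        return self.max_occurs is None or self.max_occurs > 1

    def is_ambiguous(self) -> bool:
        return self.min_occurs != self.max_occurs

    def is_univocal(self) -> bool:
        return self.min_occurs == self.max_occurs

    def is_missing(self, occurs: int) -> bool:
        return occurs < self.min_occurs

    def is_over(self, occurs: int) -> bool:
        # Assumption of this sketch: strictly more occurrences than the maximum.
        return self.max_occurs is not None and occurs > self.max_occurs
```

For example, `ParticleSketch(min_occurs=0, max_occurs=None)` describes an optional, unbounded particle, for which is_multiple() and is_ambiguous() are both true.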
class XsdComponent
Variables qualified (bool) – for name matching, unqualified matching may be admitted only
for elements and attributes.
target_namespace
Property that references the schema’s targetNamespace.
local_name
The local part of the name of the component, or None if the name is None.
qualified_name
The name of the component in extended format, or None if the name is None.
prefixed_name
The name of the component in prefixed format, or None if the name is None.
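The three name formats can be sketched with two small helpers, assuming that "extended format" means the {namespace}local notation used by ElementTree. The functions local_name and prefixed_name below are hypothetical helpers written for this illustration, not the component properties themselves.

```python
def local_name(qname):
    """Return the local part of a name in extended format, or None for None."""
    if qname is None:
        return None
    if qname.startswith("{"):
        return qname.split("}", 1)[1]
    return qname


def prefixed_name(qname, namespaces):
    """Map a name in extended format to prefixed format using a prefix-to-URI mapping."""
    if qname is None:
        return None
    if not qname.startswith("{"):
        return qname  # the name has no namespace part
    uri, local = qname[1:].split("}", 1)
    for prefix, namespace in namespaces.items():
        if namespace == uri:
            # An empty prefix stands for the default namespace.
            return f"{prefix}:{local}" if prefix else local
    return qname  # sketch's choice: unknown namespaces are left in extended format
```

For example, with the mapping `{"vh": "http://example.com/vehicles"}` the extended name `{http://example.com/vehicles}car` has the local name `car` and the prefixed name `vh:car`.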
is_global()
Returns True if the instance is a global component, False if it’s local.
is_matching(name, default_namespace=None, **kwargs)
Returns True if the component name is matching the name provided as argument, False otherwise. For
XSD elements the matching is extended to substitutes.
Parameters
• name – a local or fully-qualified name.
• default_namespace – used if it’s not None and not empty for completing the name
argument in case it’s a local name.
• kwargs – additional options that can be used by certain components.
tostring(indent='', max_lines=None, spaces_for_tab=4)
Serializes the XML elements that declare or define the component to a string.
class XsdType(elem, schema, parent=None, name: Optional[str] = None)
Common base class for XSD types.
simple_type
Property that is the instance itself for a simpleType. For a complexType it is the instance’s content if that
is a simpleType, or None if the instance’s content is a model group.
model_group
Property that is None for a simpleType. For a complexType it is the instance’s content if that is a model
group, or None if the instance’s content is a simpleType.
has_complex_content()
Returns True if the instance is a complexType with mixed or element-only content, False otherwise.
has_mixed_content()
Returns True if the instance is a complexType with mixed content, False otherwise.
has_simple_content()
Returns True if the instance has a simple content, False otherwise.
static is_atomic()
Returns True if the instance is an atomic simpleType, False otherwise.
static is_complex()
Returns True if the instance is a complexType, False otherwise.
static is_datetime()
Returns True if the instance is a datetime/duration XSD builtin-type, False otherwise.
is_element_only()
Returns True if the instance is a complexType with element-only content, False otherwise.
is_emptiable()
Returns True if the instance has an emptiable value or content, False otherwise.
is_empty()
Returns True if the instance has an empty content, False otherwise.
static is_list()
Returns True if the instance is a list simpleType, False otherwise.
static is_simple()
Returns True if the instance is a simpleType, False otherwise.
class XsdElement(elem, schema, parent)
Class for XSD 1.0 element declarations.
Variables
• type – the XSD simpleType or complexType of the element.
• attributes – the group of the attributes associated with the element.
class XsdAttribute(elem, schema, parent=None, name: Optional[str] = None)
Class for XSD 1.0 attribute declarations.
Variables type – the XSD simpleType of the attribute.
A.12.2 Types
enumeration
max_value
min_value
class XsdAtomicBuiltin(elem, schema, name, python_type, base_type=None, admitted_facets=None,
facets=None, to_python=None, from_python=None)
Class for defining XML Schema built-in simpleType atomic datatypes. An instance contains a Python type
transformation and a list of validator functions. The ‘base_type’ is not used for validation, but only for reference
to the XML Schema restriction hierarchy.
Type conversion methods:
• to_python(value): Decoding from XML
• from_python(value): Encoding to XML
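A pair of conversion functions of this kind can be sketched for the xs:boolean datatype, whose lexical values are true, false, 1 and 0. The function names are hypothetical and only illustrate the decode/encode roles of to_python and from_python.

```python
def boolean_to_python(value: str) -> bool:
    """Decoding from XML: map an xs:boolean lexical value to a Python bool."""
    value = value.strip()  # surrounding whitespace is not significant for xs:boolean
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    raise ValueError(f"{value!r} is not a valid xs:boolean value")


def boolean_from_python(value: bool) -> str:
    """Encoding to XML: map a Python bool to an xs:boolean lexical value."""
    return "true" if value else "false"
```

Decoding is not always reversible at the lexical level: both "1" and "true" decode to True, but True is always encoded as "true" in this sketch.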
class XsdList(elem, schema, parent, name=None)
Class for ‘list’ definitions. A list definition has an item_type attribute that refers to an atomic or union
simpleType definition.
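The role of the item type can be sketched as follows: an XSD list value is a whitespace-separated sequence of items, each decoded by the item type. decode_list is a hypothetical helper in which a plain callable stands in for the item_type.

```python
def decode_list(text, item_decoder):
    """Split a list value on whitespace and decode each item with the item decoder."""
    # str.split() without arguments splits on any run of whitespace
    # and returns an empty list for an empty or blank string.
    return [item_decoder(token) for token in text.split()]
```

For example, `decode_list("1 2 3", int)` decodes a list of integers.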
class Xsd11Union(elem, schema, parent, name=None)
class XsdUnion(elem, schema, parent, name=None)
Class for ‘union’ definitions. A union definition has a member_types attribute that refers to a ‘simpleType’
definition.
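The role of the member types can be sketched in the same way: a union value is tried against each member type in order, and the first one that accepts it determines the decoded value. decode_union is a hypothetical helper in which plain callables stand in for the member_types; signalling rejection with ValueError is an assumption of the sketch.

```python
def decode_union(text, member_decoders):
    """Decode text with the first member decoder that accepts it."""
    for decoder in member_decoders:
        try:
            return decoder(text)
        except ValueError:
            continue  # this member type rejects the value, try the next one
    raise ValueError(f"{text!r} does not match any member type")
```

The order of the member types matters: with `[int, float]` the value "12" is decoded as an integer, while with `[float, int]` it would be decoded as a float.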
class Xsd11AtomicRestriction(elem, schema, parent, name=None, facets=None,
base_type=None)
Class for XSD 1.1 atomic simpleType and complexType’s simpleContent restrictions.
class XsdAtomicRestriction(elem, schema, parent, name=None, facets=None, base_type=None)
Class for XSD 1.0 atomic simpleType and complexType’s simpleContent restrictions.
A.12.4 Wildcards
A.12.6 Facets
A.12.7 Others
map_type(obj)
Maps an XSD type to a type declaration of the target language. This method is registered as a filter with a
name that depends on the language name (e.g. c_type).
Parameters obj – an XSD type or another type-related declaration as an attribute or an element.
Returns an empty string for non-XSD objects.
list_templates(extensions=None, filter_func=None)
matching_templates(name)
get_template(name, parent=None, global_vars=None)
select_template(names, parent=None, global_vars=None)
render(names, parent=None, global_vars=None)
render_to_files(names, parent=None, global_vars=None, output_dir='.', force=False)