Copyright © 2001 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
XML Schema Part 0: Primer is a non-normative document intended to provide an easily readable description of the XML Schema facilities, and is oriented towards quickly understanding how to create schemas using the XML Schema language. XML Schema Part 1: Structures and XML Schema Part 2: Datatypes provide the complete normative description of the XML Schema language. This primer describes the language features through numerous examples which are complemented by extensive references to the normative texts.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.
This document has been reviewed by W3C Members and other interested parties and has been endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited as a normative reference from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
This document has been produced by the W3C XML Schema Working Group as part of the W3C XML Activity. The goals of the XML Schema language are discussed in the XML Schema Requirements document. The authors of this document are the members of the XML Schema Working Group. Different parts of the document have different editors.
This version of this document incorporates some editorial changes from earlier versions.
Please report errors in this document to www-xml-schema-comments@w3.org (archive). The list of known errors in this specification is available at http://www.w3.org/2001/05/xmlschema-errata.
The English version of this specification is the only normative version. Information about translations of this document is available at http://www.w3.org/2001/05/xmlschema-translations.
A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.
This document, XML Schema Part 0: Primer, provides an easily approachable description of the XML Schema definition language, and should be used alongside the formal descriptions of the language contained in Parts 1 and 2 of the XML Schema specification. The intended audience of this document includes application developers whose programs read and write schema documents, and schema authors who need to know about the features of the language, especially features that provide functionality above and beyond what is provided by DTDs. The text assumes that you have a basic understanding of XML 1.0 and XML-Namespaces. Each major section of the primer introduces new features of the language, and describes those features in the context of concrete examples.
Section 2 covers the basic mechanisms of XML Schema. It describes how to declare the elements and attributes that appear in XML documents, the distinctions between simple and complex types, defining complex types, the use of simple types for element and attribute values, schema annotation, a simple mechanism for re-using element and attribute definitions, and nil values.
Section 3, the first advanced section in the primer, explains the basics of how namespaces are used in XML and schema documents. This section is important for understanding many of the topics that appear in the other advanced sections.
Section 4, the second advanced section in the primer, describes mechanisms for deriving types from existing types, and for controlling these derivations. The section also describes mechanisms for merging together fragments of a schema from multiple sources, and for element substitution.
Section 5 covers more advanced features, including a mechanism for specifying uniqueness among attributes and elements, a mechanism for using types across namespaces, a mechanism for extending types based on namespaces, and a description of how documents are checked for conformance.
In addition to the sections just described, the primer contains a number of appendices that provide detailed reference information on simple types and a regular expression language.
The primer is a non-normative document, which means that it does not provide a definitive (from the W3C's point of view) specification of the XML Schema language. The examples and other explanatory material in this document are provided to help you understand XML Schema, but they may not always provide definitive answers. In such cases, you will need to refer to the XML Schema specification, and to help you do this, we provide many links pointing to the relevant parts of the specification. More specifically, XML Schema items mentioned in the primer text are linked to an index of element names and attributes, and a summary table of datatypes, both in the primer. The table and the index contain links to the relevant sections of XML Schema parts 1 and 2.
The purpose of a schema is to define a class of XML documents, and so the term "instance document" is often used to describe an XML document that conforms to a particular schema. In fact, neither instances nor schemas need to exist as documents per se -- they may exist as streams of bytes sent between applications, as fields in a database record, or as collections of XML Infoset "Information Items" -- but to simplify the primer, we have chosen to always refer to instances and schemas as if they are documents and files.
Let us start by considering an instance document in a file called po.xml. It describes a purchase order generated by a home products ordering and billing application:
<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
  <shipTo country="US">
    <name>Alice Smith</name>
    <street>123 Maple Street</street>
    <city>Mill Valley</city>
    <state>CA</state>
    <zip>90952</zip>
  </shipTo>
  <billTo country="US">
    <name>Robert Smith</name>
    <street>8 Oak Avenue</street>
    <city>Old Town</city>
    <state>PA</state>
    <zip>95819</zip>
  </billTo>
  <comment>Hurry, my lawn is going wild!</comment>
  <items>
    <item partNum="872-AA">
      <productName>Lawnmower</productName>
      <quantity>1</quantity>
      <USPrice>148.95</USPrice>
      <comment>Confirm this is electric</comment>
    </item>
    <item partNum="926-AA">
      <productName>Baby Monitor</productName>
      <quantity>1</quantity>
      <USPrice>39.98</USPrice>
      <shipDate>1999-05-21</shipDate>
    </item>
  </items>
</purchaseOrder>
The purchase order consists of a main element,
purchaseOrder, and the subelements
shipTo, billTo, comment,
and items. These subelements (except
comment) in turn contain other subelements, and so
on, until a subelement such as USPrice contains a
number rather than any subelements. Elements that contain
subelements or carry attributes are said to have complex types,
whereas elements that contain numbers (and strings, and dates,
etc.) but do not contain any subelements are said to have
simple types. Some elements have attributes; attributes always
have simple types.
The complex types in the instance document, and some of the simple types, are defined in the schema for purchase orders. The other simple types are defined as part of XML Schema's repertoire of built-in simple types.
Before going on to examine the purchase order schema, we digress briefly to mention the association between the instance document and the purchase order schema. As you can see by inspecting the instance document, the purchase order schema is not mentioned. An instance is not actually required to reference a schema, and although many will, we have chosen to keep this first section simple, and to assume that any processor of the instance document can obtain the purchase order schema without any information from the instance document. In later sections, we will introduce explicit mechanisms for associating instances and schemas.
The purchase order schema is contained in the file po.xsd:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:annotation>
    <xsd:documentation xml:lang="en">
      Purchase order schema for Example.com.
      Copyright 2000 Example.com. All rights reserved.
    </xsd:documentation>
  </xsd:annotation>
  <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>
  <xsd:element name="comment" type="xsd:string"/>
  <xsd:complexType name="PurchaseOrderType">
    <xsd:sequence>
      <xsd:element name="shipTo" type="USAddress"/>
      <xsd:element name="billTo" type="USAddress"/>
      <xsd:element ref="comment" minOccurs="0"/>
      <xsd:element name="items" type="Items"/>
    </xsd:sequence>
    <xsd:attribute name="orderDate" type="xsd:date"/>
  </xsd:complexType>
  <xsd:complexType name="USAddress">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city" type="xsd:string"/>
      <xsd:element name="state" type="xsd:string"/>
      <xsd:element name="zip" type="xsd:decimal"/>
    </xsd:sequence>
    <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
  </xsd:complexType>
  <xsd:complexType name="Items">
    <xsd:sequence>
      <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="productName" type="xsd:string"/>
            <xsd:element name="quantity">
              <xsd:simpleType>
                <xsd:restriction base="xsd:positiveInteger">
                  <xsd:maxExclusive value="100"/>
                </xsd:restriction>
              </xsd:simpleType>
            </xsd:element>
            <xsd:element name="USPrice" type="xsd:decimal"/>
            <xsd:element ref="comment" minOccurs="0"/>
            <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
          </xsd:sequence>
          <xsd:attribute name="partNum" type="SKU" use="required"/>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
  <!-- Stock Keeping Unit, a code for identifying products -->
  <xsd:simpleType name="SKU">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{3}-[A-Z]{2}"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
The purchase order schema consists of a schema element and a variety of subelements, most notably element, complexType, and simpleType, which determine the appearance of elements and their content in instance documents.
Each of the elements in the schema has a prefix xsd: which is associated with the XML Schema namespace through the declaration, xmlns:xsd="http://www.w3.org/2001/XMLSchema", that appears in the schema element. The prefix xsd: is used by convention to denote the XML Schema namespace, although any prefix can be used. The same prefix, and hence the same association, also appears on the names of built-in simple types, e.g. xsd:string. The purpose of the association is to identify the elements and simple types as belonging to the vocabulary of the XML Schema language rather than the vocabulary of the schema author. For the sake of clarity in the text, we just mention the names of elements and simple types (e.g. simpleType), and omit the prefix.
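To illustrate that the prefix itself is arbitrary, the following minimal sketch declares the comment element from po.xsd using the prefix xs: instead of xsd:. The choice of xs: is purely an assumption made for this illustration; only the namespace name bound to the prefix matters.

```xml
<!-- Sketch only: same vocabulary as po.xsd, bound to the prefix "xs".
     The prefix is a local alias; the namespace URI identifies XML Schema. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="comment" type="xs:string"/>
</xs:schema>
```

A processor should treat this schema exactly as it would the equivalent declaration written with xsd:.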
In XML Schema, there is a basic difference between complex types which allow elements in their content and may carry attributes, and simple types which cannot have element content and cannot carry attributes. There is also a major distinction between definitions which create new types (both simple and complex), and declarations which enable elements and attributes with specific names and types (both simple and complex) to appear in document instances. In this section, we focus on defining complex types and declaring the elements and attributes that appear within them.
New complex types are defined using the complexType element and such definitions typically contain a set of element declarations, element references, and attribute declarations. The declarations are not themselves types, but rather an association between a name and the constraints which govern the appearance of that name in documents governed by the associated schema. Elements are declared using the element element, and attributes are declared using the attribute element. For example, USAddress is defined as a complex type, and within the definition of USAddress we see five element declarations and one attribute declaration:
<xsd:complexType name="USAddress">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="street" type="xsd:string"/>
    <xsd:element name="city" type="xsd:string"/>
    <xsd:element name="state" type="xsd:string"/>
    <xsd:element name="zip" type="xsd:decimal"/>
  </xsd:sequence>
  <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
</xsd:complexType>
The consequence of this definition is that any element appearing in an instance whose type is declared to be USAddress (e.g. shipTo in po.xml) must consist of five elements and may carry one attribute. These elements must be called name, street, city, state and zip as specified by the values of the declarations' name attributes, and the elements must appear in the same sequence (order) in which they are declared. The first four of these elements will each contain a string, and the fifth will contain a number. The element whose type is declared to be USAddress may appear with an attribute called country which must contain the string US.
The USAddress definition contains only declarations involving the simple types: string, decimal and NMTOKEN. In contrast, the PurchaseOrderType definition contains element declarations involving complex types, e.g. USAddress, although note that both declarations use the same type attribute to identify the type, regardless of whether the type is simple or complex.
<xsd:complexType name="PurchaseOrderType">
  <xsd:sequence>
    <xsd:element name="shipTo" type="USAddress"/>
    <xsd:element name="billTo" type="USAddress"/>
    <xsd:element ref="comment" minOccurs="0"/>
    <xsd:element name="items" type="Items"/>
  </xsd:sequence>
  <xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
In defining PurchaseOrderType, two of the element declarations, for shipTo and billTo, associate different element names with the same complex type, namely USAddress. The consequence of this definition is that any element appearing in an instance document (e.g. po.xml) whose type is declared to be PurchaseOrderType must consist of elements named shipTo and billTo, each containing the five subelements (name, street, city, state and zip) that were declared as part of USAddress. The shipTo and billTo elements may also carry the country attribute that was declared as part of USAddress.
The PurchaseOrderType definition contains an
orderDate attribute declaration which, like the
country attribute declaration, identifies a
simple type. In fact, all attribute declarations must
reference simple types because, unlike element declarations,
attributes cannot contain other elements or other
attributes.
The element declarations we have described so far have each associated a name with an existing type definition. Sometimes it is preferable to use an existing element rather than declare a new element, for example:
<xsd:element ref="comment" minOccurs="0"/>
This declaration references an existing element, comment, that was declared elsewhere in the purchase order schema. In general, the value of the ref attribute must reference a global element, i.e. one that has been declared under schema rather than as part of a complex type definition. The consequence of this declaration is that an element called comment may appear in an instance document, and its content must be consistent with that element's type, in this case, string.
The comment element is optional within PurchaseOrderType because the value of the minOccurs attribute in its declaration is 0. In general, an element is required to appear when the value of minOccurs is 1 or more. The maximum number of times an element may appear is determined by the value of a maxOccurs attribute in its declaration. This value may be a positive integer such as 41, or the term unbounded to indicate there is no maximum number of occurrences. The default value for both the minOccurs and the maxOccurs attributes is 1. Thus, when an element such as comment is declared without a maxOccurs attribute, the element may not occur more than once. Be sure that if you specify a value for only the minOccurs attribute, it is less than or equal to the default value of maxOccurs, i.e. it is 0 or 1. Similarly, if you specify a value for only the maxOccurs attribute, it must be greater than or equal to the default value of minOccurs, i.e. 1 or more. If both attributes are omitted, the element must appear exactly once.
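The occurrence rules just described can be sketched with a few local element declarations. The type name OccurrenceExamples and the element names below are hypothetical and are not part of po.xsd; they exist only to show the combinations side by side.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type used only to illustrate minOccurs and maxOccurs -->
  <xsd:complexType name="OccurrenceExamples">
    <xsd:sequence>
      <!-- both attributes omitted: defaults are 1 and 1, so exactly once -->
      <xsd:element name="title" type="xsd:string"/>
      <!-- only minOccurs given (0): not at all, or once -->
      <xsd:element name="subtitle" type="xsd:string" minOccurs="0"/>
      <!-- only maxOccurs given (unbounded): once or more -->
      <xsd:element name="author" type="xsd:string" maxOccurs="unbounded"/>
      <!-- both given: between two and five times -->
      <xsd:element name="keyword" type="xsd:string" minOccurs="2" maxOccurs="5"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```

By contrast, a declaration with minOccurs="2" and no maxOccurs would be in error, because the default maxOccurs value of 1 is smaller than 2.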
Attributes may appear once or not at all, but no other number of times, and so the syntax for specifying occurrences of attributes is different from the syntax for elements. In particular, attributes can be declared with a use attribute to indicate whether the attribute is required (see for example, the partNum attribute declaration in po.xsd), optional, or even prohibited.
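As a minimal sketch, the three values of use might be written as follows. The type name AttributeUseExamples and the attribute names are hypothetical and do not appear in po.xsd.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type used only to illustrate the use attribute -->
  <xsd:complexType name="AttributeUseExamples">
    <!-- the attribute must appear -->
    <xsd:attribute name="code" type="xsd:NMTOKEN" use="required"/>
    <!-- the attribute may appear or be omitted; same effect as leaving out use -->
    <xsd:attribute name="note" type="xsd:string" use="optional"/>
    <!-- the attribute must not appear -->
    <xsd:attribute name="internalCode" type="xsd:string" use="prohibited"/>
  </xsd:complexType>
</xsd:schema>
```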
Default values of both attributes and elements are declared using the default attribute, although this attribute has a slightly different consequence in each case. When an attribute is declared with a default value, the value of the attribute is whatever value appears as the attribute's value in an instance document; if the attribute does not appear in the instance document, the schema processor provides the attribute with a value equal to that of the default attribute. Note that default values for attributes only make sense if the attributes themselves are optional, and so it is an error to specify both a default value and anything other than a value of optional for use.
The schema processor treats defaulted elements slightly differently. When an element is declared with a default value, the value of the element is whatever value appears as the element's content in the instance document; if the element appears without any content, the schema processor provides the element with a value equal to that of the default attribute. However, if the element does not appear in the instance document, the schema processor does not provide the element at all. In summary, the differences between element and attribute defaults can be stated as: Default attribute values apply when attributes are missing, and default element values apply when elements are empty.
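The difference between the two kinds of default can be sketched as follows. The element names order and priority and the attribute name currency are hypothetical; they are introduced here only for illustration and are not part of the purchase order schema.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element used only to illustrate default values -->
  <xsd:element name="order">
    <xsd:complexType>
      <xsd:sequence>
        <!-- element default: applies only when priority is present but empty -->
        <xsd:element name="priority" type="xsd:string"
                     default="normal" minOccurs="0"/>
      </xsd:sequence>
      <!-- attribute default: applies when the attribute is absent -->
      <xsd:attribute name="currency" type="xsd:string" default="USD"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Under this sketch, an instance consisting of an empty order element is given a currency attribute with the value USD but no priority element; an order that contains an empty priority element is treated as if priority contained normal; and an order that supplies both values explicitly keeps the values given.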
The fixed attribute is used in both attribute and element declarations to ensure that the attributes and elements are set to particular values. For example, po.xsd contains a declaration for the country attribute, which is declared with a fixed value US. This declaration means that the appearance of a country attribute in an instance document is optional (the default value of use is optional), although if the attribute does appear, its value must be US, and if the attribute does not appear, the schema processor will provide a country attribute with the value US. Note that the concepts of a fixed value and a default value are mutually exclusive, and so it is an error for a declaration to contain both fixed and default attributes.
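Since the text shows a fixed value only on an attribute, the following sketch places the country attribute from po.xsd next to a fixed element declaration. The type name FixedExamples and the element name currency are hypothetical and are used only for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type used only to illustrate fixed values -->
  <xsd:complexType name="FixedExamples">
    <xsd:sequence>
      <!-- if the element appears with content, the content must be USD;
           if it appears empty, it is given the value USD -->
      <xsd:element name="currency" type="xsd:string"
                   fixed="USD" minOccurs="0"/>
    </xsd:sequence>
    <!-- if the attribute appears, its value must be US;
         if it does not appear, the processor supplies US -->
    <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
  </xsd:complexType>
</xsd:schema>
```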
The values of the attributes used in element and attribute declarations to constrain their occurrences are summarized in Table 1.
Table 1. Occurrence Constraints for Elements and Attributes
| Elements: (minOccurs, maxOccurs) fixed, default | Attributes: use, fixed, default | Notes |
|---|---|---|
| (1, 1) -, - | required, -, - | element/attribute must appear once; it may have any value |
| (1, 1) 37, - | required, 37, - | element/attribute must appear once; its value must be 37 |
| (2, unbounded) 37, - | n/a | element must appear twice or more; its value must be 37; in general, minOccurs and maxOccurs values may be non-negative integers, and the maxOccurs value may also be "unbounded" |
| (0, 1) -, - | optional, -, - | element/attribute may appear once; it may have any value |
| (0, 1) 37, - | n/a | element may appear once; if it does not appear it is not provided; if it does appear and it is empty, its value is 37; if it does appear and it is not empty, its value must be 37 |
| n/a | optional, 37, - | attribute may appear once; if it does appear its value must be 37; if it does not appear its value is 37 |
| (0, 1) -, 37 | n/a | element may appear once; if it does not appear it is not provided; if it does appear and it is empty, its value is 37; otherwise its value is that given |
| n/a | optional, -, 37 | attribute may appear once; if it does not appear its value is 37; otherwise its value is that given |
| (0, 2) -, 37 | n/a | element may appear once, twice, or not at all; if the element does not appear it is not provided; if it does appear and it is empty, its value is 37; otherwise its value is that given; in general, minOccurs and maxOccurs values may be non-negative integers, and the maxOccurs value may also be "unbounded" |
| (0, 0) -, - | prohibited, -, - | element/attribute must not appear |
Note that neither minOccurs, maxOccurs, nor use may appear in the declarations of global elements and attributes.
Global elements, and global attributes, are created by declarations that appear as the children of the schema element. Once declared, a global element or a global attribute can be referenced in one or more declarations using the ref attribute as described above. A declaration that references a global element enables the referenced element to appear in the instance document in the context of the referencing declaration. So, for example, the comment element appears in po.xml at the same level as the shipTo, billTo and items elements because the declaration that references comment appears in the complex type definition at the same level as the declarations of the other three elements.
The declaration of a global element also enables the element to appear at the top-level of an instance document. Hence purchaseOrder, which is declared as a global element in po.xsd, can appear as the top-level element in po.xml. Note that this rationale will also allow a comment element to appear as the top-level element in a document like po.xml.
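For instance, because comment is declared globally in po.xsd, the following complete document would, as a sketch, be accepted as an instance of the purchase order schema even though it contains no purchase order at all:

```xml
<?xml version="1.0"?>
<!-- The top-level element is the global element comment, not purchaseOrder -->
<comment>Hurry, my lawn is going wild!</comment>
```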
There are a number of caveats concerning the use of global elements and attributes. One caveat is that global declarations cannot contain references; global declarations must identify simple and complex types directly. Put concretely, global declarations cannot contain the ref attribute, they must use the type attribute (or, as we describe shortly, be followed by an anonymous type definition). A second caveat is that cardinality constraints cannot be placed on global declarations, although they can be placed on local declarations that reference global declarations. In other words, global declarations cannot contain the attributes minOccurs, maxOccurs, or use.
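The two caveats can be sketched in a single small schema. The global attribute lang and the type NoteType are hypothetical names chosen for this illustration; the comment element is the one declared in po.xsd.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Global declarations: the type is given directly, and there is
       no ref, minOccurs, maxOccurs or use -->
  <xsd:element name="comment" type="xsd:string"/>
  <xsd:attribute name="lang" type="xsd:language"/>

  <!-- Hypothetical type whose local declarations reference the globals -->
  <xsd:complexType name="NoteType">
    <xsd:sequence>
      <!-- cardinality constraints are allowed on the local reference -->
      <xsd:element ref="comment" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
    <xsd:attribute ref="lang" use="required"/>
  </xsd:complexType>
</xsd:schema>
```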
We have now described how to define new complex types
(e.g. PurchaseOrderType), declare elements
(e.g. purchaseOrder) and declare attributes
(e.g. orderDate). These activities generally
involve naming, and so the question naturally arises: What
happens if we give two things the same name? The answer
depends upon the two things in question, although in
general the more similar are the two things, the more
likely there will be a conflict.
Here are some examples to illustrate when same names cause problems. If the two things are both types, say we define a complex type called USStates and a simple type called USStates, there is a conflict. If the two things are a type and an element or attribute, say we define a complex type called USAddress and we declare an element called USAddress, there is no conflict. If the two things are elements within different types (i.e. not global elements), say we declare one element called name as part of the USAddress type and a second element called name as part of the Item type, there is no conflict. (Such elements are sometimes called local element declarations.) Finally, if the two things are both types and you define one and XML Schema has defined the other, say you define a simple type called decimal, there is no conflict. The reason for the apparent contradiction in the last example is that the two types belong to different namespaces. We explore the use of namespaces in schema in a later section.
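The non-conflicting cases above can be sketched in one schema. The definitions are cut down to the minimum needed to show the names, so the content of each type is an assumption made for brevity rather than a copy of po.xsd.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- No conflict: a type and an element may share the name USAddress -->
  <xsd:complexType name="USAddress">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="USAddress" type="USAddress"/>

  <!-- No conflict: a second local element called name, in a different type -->
  <xsd:complexType name="Item">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- No conflict: this decimal belongs to the schema author's vocabulary,
       whereas xsd:decimal belongs to the XML Schema namespace -->
  <xsd:simpleType name="decimal">
    <xsd:restriction base="xsd:decimal"/>
  </xsd:simpleType>

  <!-- Conflict (not shown): adding both a complexType and a simpleType
       named USStates to this schema would be an error -->
</xsd:schema>
```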
The purchase order schema declares several elements and attributes that have simple types. Some of these simple types, such as string and decimal, are built in to XML Schema, while others are derived from the built-ins. For example, the partNum attribute has a type called SKU (Stock Keeping Unit) that is derived from string. Both built-in simple types and their derivations can be used in all element and attribute declarations. Table 2 lists all the simple types built in to XML Schema, along with examples of the different types.
Table 2. Simple Types Built In to XML Schema
| Simple Type | Examples (delimited by commas) | Notes |
|---|---|---|
| string | Confirm this is electric | |
| normalizedString | Confirm this is electric | see (3) |
| token | Confirm this is electric | see (4) |
| byte | -1, 126 | see (2) |
| unsignedByte | 0, 126 | see (2) |
| base64Binary | GpM7 | |
| hexBinary | 0FB7 | |
| integer | -126789, -1, 0, 1, 126789 | see (2) |
| positiveInteger | 1, 126789 | see (2) |
| negativeInteger | -126789, -1 | see (2) |
| nonNegativeInteger | 0, 1, 126789 | see (2) |
| nonPositiveInteger | -126789, -1, 0 | see (2) |
| int | -1, 126789675 | see (2) |
| unsignedInt | 0, 1267896754 | see (2) |
| long | -1, 12678967543233 | see (2) |
| unsignedLong | 0, 12678967543233 | see (2) |
| short | -1, 12678 | see (2) |
| unsignedShort | 0, 12678 | see (2) |
| decimal | -1.23, 0, 123.4, 1000.00 | see (2) |
| float | -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN | equivalent to single-precision 32-bit floating point, NaN is "not a number", see (2) |
| double | -INF, -1E4, -0, 0, 12.78E-2, 12, INF, NaN | equivalent to double-precision 64-bit floating point, see (2) |
| boolean | true, false, 1, 0 | |
| time | 13:20:00.000, 13:20:00.000-05:00 | see (2) |
| dateTime | 1999-05-31T13:20:00.000-05:00 | May 31st 1999 at 1:20 pm Eastern Standard Time, which is 5 hours behind Coordinated Universal Time, see (2) |
| duration | P1Y2M3DT10H30M12.3S | 1 year, 2 months, 3 days, 10 hours, 30 minutes, and 12.3 seconds |
| date | 1999-05-31 | see (2) |
| gMonth | --05-- | May, see (2) (5) |
| gYear | 1999 | 1999, see (2) (5) |
| gYearMonth | 1999-02 | the month of February 1999, regardless of the number of days, see (2) (5) |
| gDay | ---31 | the 31st day, see (2) (5) |
| gMonthDay | --05-31 | every May 31st, see (2) (5) |
| Name | shipTo | XML 1.0 Name type |
| QName | po:USAddress | XML Namespace QName |
| NCName | USAddress | XML Namespace NCName, i.e. a QName without the prefix and colon |
| anyURI | http://www.example.com/, http://www.example.com/doc.html#ID5 | |
| language | en-GB, en-US, fr | valid values for xml:lang as defined in XML 1.0 |
| ID | | XML 1.0 ID attribute type, see (1) |
| IDREF | | XML 1.0 IDREF attribute type, see (1) |
| IDREFS | | XML 1.0 IDREFS attribute type, see (1) |
| ENTITY | | XML 1.0 ENTITY attribute type, see (1) |
| ENTITIES | | XML 1.0 ENTITIES attribute type, see (1) |
| NOTATION | | XML 1.0 NOTATION attribute type, see (1) |
| NMTOKEN | US, Brésil | XML 1.0 NMTOKEN attribute type, see (1) |
| NMTOKENS | US UK, Brésil Canada Mexique | XML 1.0 NMTOKENS attribute type, i.e. a whitespace-separated list of NMTOKENs, see (1) |
Notes:
(1) To retain compatibility between XML Schema and XML 1.0 DTDs, the simple types ID, IDREF, IDREFS, ENTITY, ENTITIES, NOTATION, NMTOKEN, NMTOKENS should only be used in attributes.
(2) A value of this type can be represented by more than one lexical format, e.g. 100 and 1.0E2 are both valid float formats representing "one hundred". However, rules have been established for this type that define a canonical lexical format, see XML Schema Part 2.
(3) Newline, tab and carriage-return characters in a normalizedString type are converted to space characters before schema processing.
(4) As normalizedString, and adjacent space characters are collapsed to a single space character, and leading and trailing spaces are removed.
(5) The "g" prefix signals time periods in the Gregorian calendar.
New simple types are defined by deriving them from existing simple types (built-ins and derived). In particular, we can derive a new simple type by restricting an existing simple type; in other words, the legal range of values for the new type is a subset of the existing type's range of values. We use the simpleType element to define and name the new simple type. We use the restriction element to indicate the existing (base) type, and to identify the "facets" that constrain the range of values. A complete list of facets is provided in Appendix B.
Suppose we wish to create a new type of integer called myInteger whose range of values is between 10000 and 99999 (inclusive). We base our definition on the built-in simple type integer, whose range of values also includes integers less than 10000 and greater than 99999. To define myInteger, we restrict the range of the integer base type by employing two facets called minInclusive and maxInclusive:
<xsd:simpleType name="myInteger">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="10000"/>
    <xsd:maxInclusive value="99999"/>
  </xsd:restriction>
</xsd:simpleType>
The example shows one particular combination of a base
type and two facets used to define myInteger,
but a look at the list of built-in simple types and their
facets (Appendix B) should
suggest other viable combinations.
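As a sketch of two such combinations, the following definitions restrict string by length and decimal by number of digits. The type names ShortName and Price are hypothetical and are not used elsewhere in the primer.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical: a string of between 1 and 40 characters -->
  <xsd:simpleType name="ShortName">
    <xsd:restriction base="xsd:string">
      <xsd:minLength value="1"/>
      <xsd:maxLength value="40"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Hypothetical: a decimal with at most 7 digits in total,
       of which at most 2 follow the decimal point -->
  <xsd:simpleType name="Price">
    <xsd:restriction base="xsd:decimal">
      <xsd:totalDigits value="7"/>
      <xsd:fractionDigits value="2"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```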
The purchase order schema contains another, more elaborate, example of a simple type definition. A new simple type called SKU is derived (by restriction) from the simple type string. Furthermore, we constrain the values of SKU using a facet called pattern in conjunction with the regular expression "\d{3}-[A-Z]{2}" that is read "three digits followed by a hyphen followed by two upper-case ASCII letters":
<xsd:simpleType name="SKU">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="\d{3}-[A-Z]{2}"/>
  </xsd:restriction>
</xsd:simpleType>
This regular expression language is described more fully in Appendix D.
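To make the effect of the pattern concrete, the following instance fragment assumes a hypothetical element sku declared with type SKU, inside a hypothetical wrapper element skuExamples. Neither element exists in po.xsd; they serve only to show which values would and would not conform.

```xml
<!-- Assumes a hypothetical declaration: <xsd:element name="sku" type="SKU"/> -->
<skuExamples>
  <sku>872-AA</sku>   <!-- valid: three digits, a hyphen, two upper-case letters -->
  <sku>87-AA</sku>    <!-- invalid: only two digits -->
  <sku>872-aa</sku>   <!-- invalid: lower-case letters -->
  <sku>872-AAB</sku>  <!-- invalid: the pattern must match the whole value -->
</skuExamples>
```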
XML Schema defines twelve facets which are listed in Appendix B. Among these, the enumeration facet is particularly useful and it can be used to constrain the values of almost every simple type, except the boolean type. The enumeration facet limits a simple type to a set of distinct values. For example, we can use the enumeration facet to define a new simple type called USState, derived from string, whose value must be one of the standard US state abbreviations:
<xsd:simpleType name="USState">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="AK"/>
    <xsd:enumeration value="AL"/>
    <xsd:enumeration value="AR"/>
    <!-- and so on ... -->
  </xsd:restriction>
</xsd:simpleType>
USState would be a good replacement for the string type currently used in the state element declaration. By making this replacement, the legal values of a state element, i.e. the state subelements of billTo and shipTo, would be limited to one of AK, AL, AR, etc. Note that the enumeration values specified for a particular type must be unique.
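The replacement described above might look as follows. This is a sketch of a revised USAddress definition, assuming the USState type is defined elsewhere in the same schema; only the state declaration differs from po.xsd.

```xml
<!-- Fragment: belongs inside the xsd:schema element of po.xsd,
     alongside a definition of the USState simple type -->
<xsd:complexType name="USAddress">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="street" type="xsd:string"/>
    <xsd:element name="city" type="xsd:string"/>
    <!-- was type="xsd:string"; now limited to the enumerated abbreviations -->
    <xsd:element name="state" type="USState"/>
    <xsd:element name="zip" type="xsd:decimal"/>
  </xsd:sequence>
  <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
</xsd:complexType>
```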
XML Schema has the concept of a list type, in addition to the so-called atomic types that constitute most of the types listed in Table 2. (Atomic types, list types, and the union types described in the next section are collectively called simple types.) The value of an atomic type is indivisible from XML Schema's perspective. For example, the NMTOKEN value US is indivisible in the sense that no part of US, such as the character "S", has any meaning by itself. In contrast, list types are composed of sequences of atomic types and consequently the parts of a sequence (the "atoms") themselves are meaningful. For example, NMTOKENS is a list type, and an element of this type would be a white-space delimited list of NMTOKEN values, such as "US UK FR". XML Schema has three built-in list types: NMTOKENS, IDREFS, and ENTITIES.
In addition to using the built-in list types, you can create new list types by derivation from existing atomic types. (You cannot create list types from existing list types, nor from complex types.) For example, to create a list of myInteger values:
<xsd:simpleType name="listOfMyIntType">
  <xsd:list itemType="myInteger"/>
</xsd:simpleType>
And an element in an instance document whose content
conforms to listOfMyIntType is:
<listOfMyInt>20003 15037 95977 95945</listOfMyInt>
Several facets can be
applied to list types:
length
,
minLength
,
maxLength
,
and
enumeration
.
For example, to define a list of exactly six US states
(SixUSStates), we first define a new list type
called USStateList from USState,
and then we derive SixUSStates by restricting
USStateList to only six items:
<xsd:simpleType name="USStateList">
  <xsd:list itemType="USState"/>
</xsd:simpleType>
<xsd:simpleType name="SixUSStates">
  <xsd:restriction base="USStateList">
    <xsd:length value="6"/>
  </xsd:restriction>
</xsd:simpleType>
Elements whose type is SixUSStates must
have six items, and each of the six items must be one of
the (atomic) values of the enumerated type
USState, for example:
<sixStates>PA NY CA NY LA AK</sixStates>
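The other length-related facets work the same way. The following sketch, in which the type name OneToThreeUSStates is hypothetical, uses minLength and maxLength instead of length to accept between one and three states:
<xsd:simpleType name="OneToThreeUSStates">
  <!-- for list types, these facets count list items, not characters -->
  <xsd:restriction base="USStateList">
    <xsd:minLength value="1"/>
    <xsd:maxLength value="3"/>
  </xsd:restriction>
</xsd:simpleType>
Under this definition a value such as PA NY would conform, while an empty list or a list of four states would not.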
Note that it is possible to derive a list type from the
atomic type
string
.
However, a
string
may
contain white space, and white space delimits the items in
a list type, so you should be careful using list types
whose base type is
string
. For example,
suppose we have defined a list type with a
length
facet
equal to 3 and base type
string
. Then the
following 3-item list is legal:
Asie Europe Afrique
But the following 3 "item" list is illegal:
Asie Europe Amérique Latine
Even though "Amérique Latine" may exist as a single string outside of the list, when it is included in the list, the whitespace between Amérique and Latine effectively creates a fourth item, and so the latter example will not conform to the 3-item list type.
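A sketch of the list type assumed in this example follows. The names StringList and ThreeStrings are hypothetical and do not appear in the purchase order schema:
<!-- a list whose items are strings; white space separates the items -->
<xsd:simpleType name="StringList">
  <xsd:list itemType="xsd:string"/>
</xsd:simpleType>
<!-- restrict the list to exactly three items -->
<xsd:simpleType name="ThreeStrings">
  <xsd:restriction base="StringList">
    <xsd:length value="3"/>
  </xsd:restriction>
</xsd:simpleType>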
Atomic types and list
types enable an element or an attribute value to be one or
more instances of one atomic type. In contrast, a union
type enables an element or attribute value to be one or
more instances of one type drawn from the union of multiple
atomic and list types. To illustrate, we create a union
type for representing American states as singleton letter
abbreviations or lists of numeric codes. The
zipUnion union type is built from one atomic
type and one list type:
<xsd:simpleType name="zipUnion">
  <xsd:union memberTypes="USState listOfMyIntType"/>
</xsd:simpleType>
When we define a union type, the
memberTypes attribute value is a list of all
the types in the union.
Now, assuming we have declared an element called
zips of type zipUnion, valid
instances of the element are:
<zips>CA</zips>
<zips>95630 95977 95945</zips>
<zips>AK</zips>
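The element declaration assumed here could be as simple as the following sketch:
<xsd:element name="zips" type="zipUnion"/>
By contrast, an instance such as the one below would not be valid, because its content is neither a single USState value nor a list of myInteger values:
<!-- invalid: mixes a state abbreviation with a numeric code -->
<zips>CA 95630</zips>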
Two facets,
pattern
and
enumeration
,
can be applied to a union type.
Schemas can be constructed by defining sets of named types
such as PurchaseOrderType and then declaring
elements such as purchaseOrder that reference
the types using the
type=
construction. This style of schema construction is
straightforward but it can be unwieldy, especially if you
define many types that are referenced only once and contain
very few constraints. In these cases, a type can be more
succinctly defined as an anonymous type which saves the
overhead of having to be named and explicitly referenced.
The definition of the type Items in
po.xsd
contains two
element declarations that use anonymous types
(item and quantity). In general,
you can identify anonymous types by the lack of a
type= in an
element (or attribute) declaration, and by the presence of an
un-named (simple or complex) type definition:
<xsd:complexType name="Items">
<xsd:sequence>
<xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="quantity">
<xsd:simpleType>
<xsd:restriction base="xsd:positiveInteger">
<xsd:maxExclusive value="100"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element name="USPrice" type="xsd:decimal"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="partNum" type="SKU" use="required"/>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
In the case of the item element, it has an
anonymous complex type consisting of the elements
productName, quantity,
USPrice, comment, and
shipDate, and an attribute called
partNum. In the case of the
quantity element, it has an anonymous simple
type derived from
positiveInteger
whose value ranges between 1 and 99 inclusive.
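For comparison, the same constraint on quantity could be written with a named type. In this sketch the type name QuantityType is hypothetical; the anonymous form above avoids having to invent and reference it:
<xsd:simpleType name="QuantityType">
  <xsd:restriction base="xsd:positiveInteger">
    <xsd:maxExclusive value="100"/>
  </xsd:restriction>
</xsd:simpleType>
<!-- the element now refers to the type by name -->
<xsd:element name="quantity" type="QuantityType"/>
The named form is worthwhile mainly when several declarations need to share the same type.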
The purchase order schema has many examples of elements
containing other elements (e.g. items), elements
having attributes and containing other elements (e.g.
shipTo), and elements containing only a simple
type of value (e.g. USPrice). However, we have
not seen an element having attributes but containing only a
simple type of value, nor have we seen an element that
contains other elements mixed with character content, nor
have we seen an element that has no content at all. In this
section we'll examine these variations in the content models
of elements.
Let us first consider how to declare an element that has an attribute and contains a simple value. In an instance document, such an element might appear as:
<internationalPrice currency="EUR">423.46</internationalPrice>
The purchase order schema declares a
USPrice element that is a starting point:
<xsd:element name="USPrice" type="xsd:decimal"/>
Now, how do we add an
attribute to this element? As we have said before, simple
types cannot have attributes, and
decimal
is a simple
type. Therefore, we must define a complex type to carry the
attribute declaration. We also want the content to be
simple type
decimal
. So
our original question becomes: How do we define a complex
type that is based on the simple type
decimal
? The answer is
to derive a new complex type from the simple type
decimal
:
<xsd:element name="internationalPrice">
<xsd:complexType>
<xsd:simpleContent>
<xsd:extension base="xsd:decimal">
<xsd:attribute name="currency" type="xsd:string"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>
</xsd:element>
We use the
complexType
element to start the definition of a new (anonymous) type.
To indicate that the content model of the new type contains
only character data and no elements, we use a
simpleContent
element. Finally, we derive the new type by extending the
simple
decimal
type.
The extension consists of adding a currency
attribute using a standard attribute declaration. (We cover
type derivation in detail in Section 4.)
The internationalPrice element declared in
this way will appear in an instance as shown in the example
at the beginning of this section.
The construction of the purchase order schema may be characterized as elements containing subelements, and the deepest subelements contain character data. XML Schema also provides for the construction of schemas where character data can appear alongside subelements, and character data is not confined to the deepest subelements.
To illustrate, consider the following snippet from a customer letter that uses some of the same elements as the purchase order:
<letterBody>
<salutation>Dear Mr.<name>Robert Smith</name>.</salutation>
Your order of <quantity>1</quantity> <productName>Baby Monitor</productName>
shipped from our warehouse on <shipDate>1999-05-21</shipDate>. ....
</letterBody>
Notice the text appearing between elements and their
child elements. Specifically, text appears between the
elements salutation, quantity,
productName and shipDate which
are all children of letterBody, and text
appears around the element name which is the child of a
child of letterBody. The following snippet of
a schema declares letterBody:
<xsd:element name="letterBody">
<xsd:complexType mixed="true">
<xsd:sequence>
<xsd:element name="salutation">
<xsd:complexType mixed="true">
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:element name="quantity" type="xsd:positiveInteger"/>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
<!-- etc. -->
</xsd:sequence>
</xsd:complexType>
</xsd:element>
The elements appearing in the customer letter are
declared, and their types are defined using the
element
and
complexType
element constructions we have seen before. To enable
character data to appear between the child-elements of
letterBody, the
mixed
attribute
on the type definition is set to true.
Note that the mixed model in XML Schema
differs fundamentally from the
mixed model in XML 1.0. Under the XML
Schema mixed model, the order and number of child elements
appearing in an instance must agree with the order and
number of child elements specified in the model. In
contrast, under the XML 1.0 mixed model, the order and
number of child elements appearing in an instance cannot be
constrained. In summary, XML Schema provides full
validation of mixed models in contrast to the partial
schema validation provided by XML 1.0.
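To illustrate the difference, the following instance would not be valid against the letterBody declaration above, assuming the elided part of the sequence imposes no further requirements. The text between the elements is unconstrained, but productName appears before quantity, which contradicts the declared sequence:
<letterBody>
<salutation>Dear Mr.<name>Robert Smith</name>.</salutation>
<!-- invalid: productName must follow quantity -->
Your <productName>Baby Monitor</productName> order
of <quantity>1</quantity> has shipped.
</letterBody>
An XML 1.0 mixed content model, by contrast, could not rule out this ordering.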
Now suppose that we want the
internationalPrice element to convey both the
unit of currency and the price as attribute values rather
than as separate attribute and content values. For
example:
<internationalPrice currency="EUR" value="423.46"/>
Such an element has no content at all; its content model is empty. To define a type whose content is empty, we essentially define a type that allows only elements in its content, but we do not actually declare any elements and so the type's content model is empty:
<xsd:element name="internationalPrice">
<xsd:complexType>
<xsd:complexContent>
<xsd:restriction base="xsd:anyType">
<xsd:attribute name="currency" type="xsd:string"/>
<xsd:attribute name="value" type="xsd:decimal"/>
</xsd:restriction>
</xsd:complexContent>
</xsd:complexType>
</xsd:element>
In this example, we define an (anonymous) type having
complexContent, i.e. only elements. The
complexContent element signals that we intend
to restrict or extend the content model of a complex type,
and the restriction of anyType
declares two attributes but does not introduce any element
content (see Section 4.4 for
more details on restriction). The
internationalPrice element declared in this
way may legitimately appear in an instance as shown in the
example above.
The preceding syntax for an empty-content element is
relatively verbose, and it is possible to declare the
internationalPrice element more compactly:
<xsd:element name="internationalPrice">
  <xsd:complexType>
    <xsd:attribute name="currency" type="xsd:string"/>
    <xsd:attribute name="value" type="xsd:decimal"/>
  </xsd:complexType>
</xsd:element>
This compact syntax works because a complex type defined
without any simpleContent or
complexContent is interpreted as shorthand for
complex content that restricts anyType.
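As a quick check of either form of the declaration, the first instance below has empty content and should be accepted, while the second should be rejected because the type allows neither character nor element content:
<!-- valid: attributes only -->
<internationalPrice currency="EUR" value="423.46"/>
<!-- invalid: character content is not allowed in an empty content model -->
<internationalPrice currency="EUR">423.46</internationalPrice>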
The anyType represents an abstraction
called the
ur-type
which is the base type from which all
simple and complex types are derived. An
anyType type does not constrain its content in
any way. It is possible to use anyType like
other types, for example:
<xsd:element name="anything" type="xsd:anyType"/>
The content of the element declared in this way is
unconstrained, so the element value may be 423.46, but it
may be any other sequence of characters as well, or indeed
a mixture of characters and elements. In fact,
anyType is the default type when none is
specified, so the above could also be written as
follows:
<xsd:element name="anything"/>
If unconstrained element content is needed, for example
in the case of elements containing prose which requires
embedded markup to support internationalization, then the
default declaration or a slightly restricted form of it may
be suitable. The text type described in
Section 5.5 is an example of a
type suitable for such purposes.
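For instance, both of the following would be acceptable for the anything element declared above. The em element is purely illustrative and need not be declared anywhere, since anyType does not require its children to have declarations:
<!-- character content only -->
<anything>423.46</anything>
<!-- a mixture of characters and elements -->
<anything>Ships in <em>two</em> days</anything>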
XML Schema provides three
elements for annotating schemas for the benefit of both human
readers and applications. In the purchase order schema, we
put a basic schema description and copyright information
inside the
documentation
element, which is the recommended location for human readable
material. We recommend you use the xml:lang
attribute with any
documentation
elements to indicate the language of the information.
Alternatively, you may indicate the language of all
information in a schema by placing an xml:lang
attribute on the schema element.
The
appinfo
element,
which we did not use in the purchase order schema, can be
used to provide information for tools, stylesheets and other
applications. An interesting example using
appinfo
is a
schema that describes the simple types in XML Schema Part
2: Datatypes. Information describing this schema, e.g. which
facets are applicable to particular simple types, is
represented inside
appinfo
elements,
and this information was used by an application to
automatically generate text for the XML Schema Part 2
document.
Both
documentation
and
appinfo
appear as subelements of
annotation
,
which may itself appear at the beginning of most schema
constructions. To illustrate, the following example shows
annotation
elements appearing at the beginning of an element declaration
and a complex type definition:
<xsd:element name="internationalPrice">
<xsd:annotation>
<xsd:documentation xml:lang="en">
element declared with anonymous type
</xsd:documentation>
</xsd:annotation>
<xsd:complexType>
<xsd:annotation>
<xsd:documentation xml:lang="en">
empty anonymous type with 2 attributes
</xsd:documentation>
</xsd:annotation>
<xsd:complexContent>
<xsd:restriction base="xsd:anyType">
<xsd:attribute name="currency" type="xsd:string"/>
<xsd:attribute name="value" type="xsd:decimal"/>
</xsd:restriction>
</xsd:complexContent>
</xsd:complexType>
</xsd:element>
The
annotation
element may also appear at the beginning of other schema
constructions such as those indicated by the elements
schema
,
simpleType
,
and
attribute
.
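A sketch of application information placed next to human-readable documentation follows. Note that the element is spelled appinfo, in lower case, in schema documents. Its content here, including the hint element and the namespace http://www.example.com/tools, is entirely hypothetical and stands in for whatever a particular application expects to find:
<xsd:element name="USPrice" type="xsd:decimal">
  <xsd:annotation>
    <!-- for human readers -->
    <xsd:documentation xml:lang="en">
      unit price in US dollars
    </xsd:documentation>
    <!-- for tools; the hint element is a made-up example -->
    <xsd:appinfo>
      <tool:hint xmlns:tool="http://www.example.com/tools">currency-field</tool:hint>
    </xsd:appinfo>
  </xsd:annotation>
</xsd:element>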
The definitions of complex types in the purchase order
schema all declare sequences of elements that must appear in
the instance document. The occurrence of individual elements
declared in the so-called content models of these types may
be optional, as indicated by a 0 value for the attribute
minOccurs
(e.g. in comment), or be otherwise constrained
depending upon the values of
minOccurs
and
maxOccurs
.
XML Schema also provides constraints that apply to groups of
elements appearing in a content model. These constraints
mirror those available in XML 1.0 plus some additional
constraints. Note that the constraints do not apply to
attributes.
XML Schema enables groups of elements to be defined and named, so that the elements can be used to build up the content models of complex types (thus mimicking common usage of parameter entities in XML 1.0). Un-named groups of elements can also be defined, and along with elements in named groups, they can be constrained to appear in the same order (sequence) as they are declared. Alternatively, they can be constrained so that only one of the elements may appear in an instance.
To illustrate, we
introduce two groups into the PurchaseOrderType
definition from the purchase order schema so that purchase
orders may contain either separate shipping and billing
addresses, or a single address for those cases in which the
shippee and billee are co-located:
<xsd:complexType name="PurchaseOrderType">
<xsd:sequence>
<xsd:choice>
<xsd:group ref="shipAndBill"/>
<xsd:element name="singleUSAddress" type="USAddress"/>
</xsd:choice>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="items" type="Items"/>
</xsd:sequence>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
<xsd:group name="shipAndBill">
<xsd:sequence>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
</xsd:sequence>
</xsd:group>
The
choice
group element allows only one of its children to appear in an
instance. One child is an inner
group
element that
references the named group shipAndBill
consisting of the element sequence shipTo,
billTo, and the second child is a
singleUSAddress. Hence, in an instance document,
the purchaseOrder element must contain either a
shipTo element followed by a billTo
element or a singleUSAddress element. The
choice
group is
followed by the comment and items
element declarations, and both the
choice
group and
the element declarations are children of a
sequence
group.
The effect of these various groups is that the address
element(s) must be followed by comment and
items elements in that order.
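The two instance shapes permitted by this definition can be sketched as follows, with the address details and item content elided:
<!-- first alternative: separate shipping and billing addresses -->
<purchaseOrder orderDate="1999-10-20">
  <shipTo country="US">
    <!-- etc. -->
  </shipTo>
  <billTo country="US">
    <!-- etc. -->
  </billTo>
  <items>
    <!-- etc. -->
  </items>
</purchaseOrder>
<!-- second alternative: a single address, here with the optional comment -->
<purchaseOrder orderDate="1999-10-20">
  <singleUSAddress country="US">
    <!-- etc. -->
  </singleUSAddress>
  <comment>Hurry, my lawn is going wild!</comment>
  <items>
    <!-- etc. -->
  </items>
</purchaseOrder>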
There exists a third
option for constraining elements in a group: All the elements
in the group may appear once or not at all, and they may
appear in any order. The
all
group (which
provides a simplified version of the SGML &-Connector) is
limited to the top-level of any content model. Moreover, the
group's children must all be individual elements (no groups),
and no element in the content model may appear more than
once, i.e. the permissible values of
minOccurs
and
maxOccurs
are
0 and 1. For example, to allow the child elements of
purchaseOrder to appear in any order, we could
redefine PurchaseOrderType as:
<xsd:complexType name="PurchaseOrderType">
<xsd:all>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="items" type="Items"/>
</xsd:all>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
By this definition, a comment element may
optionally appear within purchaseOrder, and it
may appear before or after any shipTo,
billTo and items elements, but it
can appear only once. Moreover, the stipulations of an
all
group do not
allow us to declare an element such as comment
outside the group as a means of enabling it to appear more
than once. XML Schema stipulates that an
all
group must appear
as the sole child at the top of a content model. In other
words, the following is illegal:
<xsd:complexType name="PurchaseOrderType">
<xsd:sequence>
<xsd:all>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
<xsd:element name="items" type="Items"/>
</xsd:all>
<xsd:sequence>
<xsd:element ref="comment" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:sequence>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
Finally, named and un-named groups that appear in content
models (represented by
group
and
choice
,
sequence
,
all
respectively) may
carry
minOccurs
and
maxOccurs
attributes. By combining and nesting the various groups
provided by XML Schema, and by setting the values of
minOccurs
and
maxOccurs
, it
is possible to represent any content model expressible with
an XML 1.0 DTD. Furthermore, the
all
group provides
additional expressive power.
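As one illustration of occurrence constraints on a group, the following sketch corresponds to the DTD content model (note | warning)*. The type name NotesType and the element names note and warning are hypothetical:
<xsd:complexType name="NotesType">
  <!-- the choice as a whole may repeat any number of times, or be absent -->
  <xsd:choice minOccurs="0" maxOccurs="unbounded">
    <xsd:element name="note" type="xsd:string"/>
    <xsd:element name="warning" type="xsd:string"/>
  </xsd:choice>
</xsd:complexType>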
Suppose we want to provide more information about each
item in a purchase order, for example, each item's weight and
preferred shipping method. We can accomplish this by adding
weightKg and shipBy attribute
declarations to the item element's (anonymous)
type definition:
<xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="quantity">
<xsd:simpleType>
<xsd:restriction base="xsd:positiveInteger">
<xsd:maxExclusive value="100"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element name="USPrice" type="xsd:decimal"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="partNum" type="SKU" use="required"/>
<!-- add weightKg and shipBy attributes -->
<xsd:attribute name="weightKg" type="xsd:decimal"/>
<xsd:attribute name="shipBy">
<xsd:simpleType>
<xsd:restriction base="xsd:string">
<xsd:enumeration value="air"/>
<xsd:enumeration value="land"/>
<xsd:enumeration value="any"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>
</xsd:complexType>
</xsd:element>
Alternatively, we can
create a named attribute group containing all the desired
attributes of an item element, and reference
this group by name in the item element
declaration:
<xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="quantity">
<xsd:simpleType>
<xsd:restriction base="xsd:positiveInteger">
<xsd:maxExclusive value="100"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element name="USPrice" type="xsd:decimal"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
</xsd:sequence>
<!-- attributeGroup replaces individual declarations -->
<xsd:attributeGroup ref="ItemDelivery"/>
</xsd:complexType>
</xsd:element>
<xsd:attributeGroup name="ItemDelivery">
<xsd:attribute name="partNum" type="SKU" use="required"/>
<xsd:attribute name="weightKg" type="xsd:decimal"/>
<xsd:attribute name="shipBy">
<xsd:simpleType>
<xsd:restriction base="xsd:string">
<xsd:enumeration value="air"/>
<xsd:enumeration value="land"/>
<xsd:enumeration value="any"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>
</xsd:attributeGroup>
Using an attribute group in this way can improve the readability of schemas, and facilitates updating schemas because an attribute group can be defined and edited in one place and referenced in multiple definitions and declarations. These characteristics of attribute groups make them similar to parameter entities in XML 1.0. Note that an attribute group may contain other attribute groups. Note also that both attribute declarations and attribute group references must appear at the end of complex type definitions.
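The nesting of attribute groups might look like the following sketch, in which the group name ShippingInfo is hypothetical. The shipping-related attributes are factored into their own group, and ItemDelivery references that group:
<xsd:attributeGroup name="ShippingInfo">
  <xsd:attribute name="weightKg" type="xsd:decimal"/>
  <!-- simplified to a plain string here; the enumeration shown earlier could be kept instead -->
  <xsd:attribute name="shipBy" type="xsd:string"/>
</xsd:attributeGroup>
<xsd:attributeGroup name="ItemDelivery">
  <xsd:attribute name="partNum" type="SKU" use="required"/>
  <!-- an attribute group referenced from within another attribute group -->
  <xsd:attributeGroup ref="ShippingInfo"/>
</xsd:attributeGroup>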
One of the purchase order items listed in
po.xml
, the
Lawnmower, does not have a shipDate
element. Within the context of our scenario, the schema
author may have intended such absences to indicate
items not yet shipped. But in general, the
absence of an element does not have any particular meaning:
It may indicate that the information is unknown, or not
applicable, or the element may be absent for some other
reason. Sometimes it is desirable to represent an unshipped
item, unknown information, or inapplicable
information explicitly with an element, rather than
by an absent element. For example, it may be desirable to
represent a "null" value being sent to or from a relational
database with an element that is present. Such cases can be
represented using XML Schema's nil mechanism which enables an
element to appear with or without a non-nil value.
XML Schema's nil mechanism
involves an "out of band" nil signal. In other words, there
is no actual nil value that appears as element content;
instead, there is an attribute to indicate that the element
content is nil. To illustrate, we modify the
shipDate element declaration so that nils can be
signalled:
<xsd:element name="shipDate" type="xsd:date" nillable="true"/>
And to explicitly
represent that shipDate has a nil value in the
instance document, we set the nil attribute (from the XML
Schema namespace for instances) to true:
<shipDate xsi:nil="true"></shipDate>
The
nil
attribute is defined as part of the XML Schema namespace for
instances,
http://www.w3.org/2001/XMLSchema-instance, and
so it must appear in the instance document with a prefix
(such as xsi:) associated with that namespace.
(As with the xsd: prefix, the xsi:
prefix is used by convention only.) Note that the nil
mechanism applies only to element values, and not to
attribute values. An element with
xsi:nil="true"
may not have any element content but it may still carry
attributes.
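Putting these pieces together, an instance fragment might look like the following sketch, which assumes the nillable shipDate declaration shown above. The namespace declaration is placed on item only for brevity and could equally appear on the document's root element:
<item partNum="872-AA"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <productName>Lawnmower</productName>
  <quantity>1</quantity>
  <USPrice>148.95</USPrice>
  <!-- present but explicitly nil: the element carries no date content -->
  <shipDate xsi:nil="true"/>
</item>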
A schema can be viewed as a
collection (vocabulary) of type definitions and element
declarations whose names belong to a particular namespace
called a target namespace. Target namespaces enable us to
distinguish between definitions and declarations from different
vocabularies. For example, target namespaces would enable us to
distinguish between the declaration for
element
in the XML
Schema language vocabulary, and a declaration for
element in a hypothetical chemistry language
vocabulary. The former is part of the
http://www.w3.org/2001/XMLSchema target namespace,
and the latter is part of another target namespace.
When we want to check that an instance document conforms to one or more schemas (through a process called schema validation), we need to identify which element and attribute declarations and type definitions in the schemas should be used to check which elements and attributes in the instance document. The target namespace plays an important role in the identification process. We examine the role of the target namespace in the next section.
The schema author also has several options that affect how the identities of elements and attributes are represented in instance documents. More specifically, the author can decide whether or not the appearance of locally declared elements and attributes in an instance must be qualified by a namespace, using either an explicit prefix or implicitly by default. The schema author's choice regarding qualification of local elements and attributes has a number of implications regarding the structures of schemas and instance documents, and we examine some of these implications in the following sections.
In a new version of the purchase order schema,
po1.xsd
, we explicitly
declare a target namespace, and specify that both locally
defined elements and locally defined attributes must be
unqualified. The target namespace in
po1.xsd
is
http://www.example.com/PO1, as indicated by the
value of the
targetNamespace
attribute.
Qualification of local elements and attributes can be
globally specified by a pair of attributes,
elementFormDefault
and
attributeFormDefault
,
on the
schema
element, or can be specified separately for each local
declaration using the
form
attribute.
All such attributes' values may each be set to
unqualified or qualified, to
indicate whether or not locally declared elements and
attributes must be unqualified.
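As a sketch of the per-declaration alternative, the following fragment uses the conventions of po1.xsd (XML Schema as the default namespace and the po: prefix bound to the target namespace). It assumes the schema-wide defaults are left at unqualified and requires one local element, shipTo, to be qualified. The fragment only illustrates the syntax and is not taken from the purchase order schemas:
<complexType name="PurchaseOrderType">
  <sequence>
    <!-- overrides elementFormDefault for this declaration only -->
    <element name="shipTo" type="po:USAddress" form="qualified"/>
    <!-- follows the schema-wide default -->
    <element name="billTo" type="po:USAddress"/>
    <!-- etc. -->
  </sequence>
</complexType>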
In
po1.xsd
we globally
specify the qualification of elements and attributes by
setting the values of both
elementFormDefault
and
attributeFormDefault
to unqualified. Strictly speaking, these
settings are unnecessary because the values are the defaults
for the two attributes; we make them here to highlight the
contrast between this case and other cases we describe
later.
<schema xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:po="http://www.example.com/PO1"
targetNamespace="http://www.example.com/PO1"
elementFormDefault="unqualified"
attributeFormDefault="unqualified">
<element name="purchaseOrder" type="po:PurchaseOrderType"/>
<element name="comment" type="string"/>
<complexType name="PurchaseOrderType">
<sequence>
<element name="shipTo" type="po:USAddress"/>
<element name="billTo" type="po:USAddress"/>
<element ref="po:comment" minOccurs="0"/>
<!-- etc. -->
</sequence>
<!-- etc. -->
</complexType>
<complexType name="USAddress">
<sequence>
<element name="name" type="string"/>
<element name="street" type="string"/>
<!-- etc. -->
</sequence>
</complexType>
<!-- etc. -->
</schema>
To see how the target namespace of this schema is
populated, we examine in turn each of the type definitions
and element declarations. Starting from the end of the
schema, we first define a type called USAddress
that consists of the elements name,
street, etc. One consequence of this type
definition is that the USAddress type is
included in the schema's target namespace. We next define a
type called PurchaseOrderType that consists of
the elements shipTo, billTo,
comment, etc. PurchaseOrderType is
also included in the schema's target namespace. Notice that
the type and element references in the three element declarations are
prefixed, i.e. po:USAddress,
po:USAddress and po:comment, and
the prefix is associated with the namespace
http://www.example.com/PO1. This is the same
namespace as the schema's target namespace, and so a
processor of this schema will know to look within this schema
for the definition of the type USAddress and the
declaration of the element comment. It is also
possible to refer to types in another schema with a different
target namespace, hence enabling re-use of definitions and
declarations between schemas.
At the beginning of the schema
po1.xsd
, we declare the
elements purchaseOrder and comment.
They are included in the schema's target namespace. The
purchaseOrder element's type is prefixed, for
the same reason that USAddress is prefixed. In
contrast, the comment element's type,
string
, is not prefixed.
The
po1.xsd
schema
contains a default namespace declaration, and so unprefixed
types such as
string
and
unprefixed elements such as
element
and
complexType
are associated with the default namespace
http://www.w3.org/2001/XMLSchema. In fact, this
is the target namespace of XML Schema itself, and so a
processor of
po1.xsd
will
know to look within the schema of XML Schema -- otherwise
known as the "schema for schemas" -- for the definition of
the type
string
and the
declaration of the element called
element
.
Let us now examine how the target namespace of the schema affects a conforming instance document:
<?xml version="1.0"?>
<apo:purchaseOrder xmlns:apo="http://www.example.com/PO1"
orderDate="1999-10-20">
<shipTo country="US">
<name>Alice Smith</name>
<street>123 Maple Street</street>
<!-- etc. -->
</shipTo>
<billTo country="US">
<name>Robert Smith</name>
<street>8 Oak Avenue</street>
<!-- etc. -->
</billTo>
<apo:comment>Hurry, my lawn is going wild!</apo:comment>
<!-- etc. -->
</apo:purchaseOrder>
The instance document declares one namespace,
http://www.example.com/PO1, and associates it
with the prefix apo:. This prefix is used to
qualify two elements in the document, namely
purchaseOrder and comment. The
namespace is the same as the target namespace of the schema
in
po1.xsd
, and so a
processor of the instance document will know to look in that
schema for the declarations of purchaseOrder and
comment. In fact, target namespaces are so named
because of the sense in which there exists a target namespace
for the elements purchaseOrder and
comment. Target namespaces in the schema
therefore control the validation of corresponding namespaces
in the instance.
The prefix apo: is applied to the global
elements purchaseOrder and comment.
Furthermore,
elementFormDefault
and
attributeFormDefault
require that the prefix is not applied to any of the
locally declared elements such as shipTo,
billTo, name and
street, and it is not applied to any of
the attributes (which were all declared locally). The
purchaseOrder and comment are
global elements because they are declared in the context of
the schema as a whole rather than within the context of a
particular type. For example, the declaration of
purchaseOrder appears as a child of the
schema
element in
po1.xsd
, whereas the
declaration of shipTo appears as a child of the
complexType
element that defines PurchaseOrderType.
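Conversely, an instance that qualifies a local element would not conform to po1.xsd. In the following sketch, shipTo carries the apo: prefix even though it is declared locally and elementFormDefault is unqualified, so a validator would be expected to reject it:
<apo:purchaseOrder xmlns:apo="http://www.example.com/PO1"
                   orderDate="1999-10-20">
  <!-- invalid: a locally declared element must not be qualified here -->
  <apo:shipTo country="US">
    <!-- etc. -->
  </apo:shipTo>
  <!-- etc. -->
</apo:purchaseOrder>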
When local elements and attributes are not required to be
qualified, an instance author may require more or less
knowledge about the details of the schema to create schema
valid instance documents. More specifically, if the author
can be sure that only the root element (such as
purchaseOrder) is global, then it is a simple
matter to qualify only the root element. Alternatively, the
author may know that all the elements are declared globally,
and so all the elements in the instance document can be
prefixed, perhaps taking advantage of a default namespace
declaration. (We examine this approach in
Section 3.3.) On the other hand,
if there is no uniform pattern of global and local
declarations, the author will need detailed knowledge of the
schema to correctly prefix global elements and
attributes.
Elements and attributes can be independently required to
be qualified, although we start by describing the
qualification of local elements. To specify that all locally
declared elements in a schema must be qualified, we set the
value of
elementFormDefault
to qualified:
<schema xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:po="http://www.example.com/PO1"
targetNamespace="http://www.example.com/PO1"
elementFormDefault="qualified"
attributeFormDefault="unqualified">
<element name="purchaseOrder" type="po:PurchaseOrderType"/>
<element name="comment" type="string"/>
<complexType name="PurchaseOrderType">
<!-- etc. -->
</complexType>
<!-- etc. -->
</schema>
And in this conforming instance document, we qualify all the elements explicitly:
<?xml version="1.0"?>
<apo:purchaseOrder xmlns:apo="http://www.example.com/PO1"
orderDate="1999-10-20">
<apo:shipTo country="US">
<apo:name>Alice Smith</apo:name>
<apo:street>123 Maple Street</apo:street>
<!-- etc. -->
</apo:shipTo>
<apo:billTo country="US">
<apo:name>Robert Smith</apo:name>
<apo:street>8 Oak Avenue</apo:street>
<!-- etc. -->
</apo:billTo>
<apo:comment>Hurry, my lawn is going wild!</apo:comment>
<!-- etc. -->
</apo:purchaseOrder>
Alternatively, we can replace the explicit qualification
of every element with implicit qualification provided by a
default namespace, as shown here in
po2.xml
:
<?xml version="1.0"?>
<purchaseOrder xmlns="http://www.example.com/PO1"
orderDate="1999-10-20">
<shipTo country="US">
<name>Alice Smith</name>
<street>123 Maple Street</street>
<!-- etc. -->
</shipTo>
<billTo country="US">
<name>Robert Smith</name>
<street>8 Oak Avenue</street>
<!-- etc. -->
</billTo>
<comment>Hurry, my lawn is going wild!</comment>
<!-- etc. -->
</purchaseOrder>
In
po2.xml
, all the
elements in the instance belong to the same namespace, and
the namespace statement declares a default namespace that
applies to all the elements in the instance. Hence, it is
unnecessary to explicitly prefix any of the elements. As
another illustration of using qualified elements, the schemas
in Section 5 all require
qualified elements.
Qualification of attributes is very similar to the
qualification of elements. Attributes that must be qualified,
either because they are declared globally or because the
attributeFormDefault
attribute is set to qualified, appear prefixed
in instance documents. One example of a qualified attribute
is the
xsi:nil
attribute that was introduced in Section
2.9. In fact, attributes that are required t