Among the many new recommendations the W3C approved on January 23, 2007, you'll find a brand spanking new version of XPath (namely, 2.0) to ponder. Formally entitled XML Path Language (XPath) 2.0 (W3C TR/xpath20/), this document caps off what some might call the "Holy Trinity" of XML, all of which have recently been revised or added: XPath 2.0, XQuery 1.0, and XSLT 2.0. In the words of the recommendation itself, "XPath 2.0 is an expression language that allows the processing of values conforming to the data model defined in [XQuery/XPath Data Model (XDM)]. The data model provides a tree representation of XML documents as well as atomic values such as integers, strings, and Booleans, and sequences that may contain both references to nodes in an XML document and atomic values. The results of an XPath expression may be a selection of nodes from the input documents, or an atomic value, or more generally, any sequence allowed by the data model."
In somewhat simpler terms, this means that XQuery and XPath work together to let XML users locate and interrogate XML documents in general, and to navigate around inside a tree-structured representation of such documents for systematic end-to-end processing. XQuery handles the interrogation, and XPath the navigation. XPath expressions provide a way to address one, some, or all of the nodes in a tree representation of an XML document.
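To make the idea of addressing one, some, or all nodes concrete, here is a minimal sketch in Python. It assumes the standard library's xml.etree.ElementTree module, which supports only a limited subset of XPath 1.0-style path syntax and is not an XPath 2.0 processor. The catalog document and its element names are hypothetical, invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration only.
CATALOG = """
<catalog>
  <book id="b1"><title>XML Basics</title><price>29.95</price></book>
  <book id="b2"><title>Path Expressions</title><price>39.50</price></book>
  <magazine id="m1"><title>Markup Monthly</title></magazine>
</catalog>
"""

root = ET.fromstring(CATALOG)

# One node: the book whose id attribute equals "b2".
one_book = root.find("./book[@id='b2']")

# Some nodes: the title children of every book element.
book_titles = [t.text for t in root.findall("./book/title")]

# All title nodes anywhere below the root, in document order.
all_titles = [t.text for t in root.findall(".//title")]

print(one_book.get("id"))
print(book_titles)
print(all_titles)
```

The same three path shapes (a predicate on an attribute, a child step, and a descendant step) carry over to XPath 2.0, where the results would be sequences of nodes rather than Python lists.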
What's different between XPath 1.0 and 2.0 is that XPath 2.0 adds richer data types and gains access to the type information that validating documents against XML Schema can provide. In fact, strictly speaking, XPath 1.0 is a subset of XPath 2.0, with about 80% of the latter coming from the former. The other 20% is what's of greatest interest and includes the following kinds of materials and mechanisms:
- XPath and XQuery are best considered as two parts of a common whole, so you must understand XPath 2.0 in terms of both the XPath 2.0 Requirements and the XML Query language requirements.
- XQuery-related additions include query-wrappers that carry namespace declarations, schema imports, and function definitions, as well as element constructors. Because XSLT 2.0 is the other XPath partner in the trinity but doesn't need these capabilities (it's got its own), this stuff is not part of the common subset that applies to both XQuery and XSLT.
- Schema support means that XPath 2.0 gains support for all the XML Schema primitive types (which consist of 19 simple data types, including dates and times, URIs and other complex items, as well as numbers, characters and other atomic types). Various related functions necessary to process and construct data types in XML Schema now also work in XPath 2.0, as described in "XQuery 1.0 and XPath 2.0 Functions and Operators."
- The already cited XDM document includes complete details on what kinds of values an XPath 2.0 expression can produce. At a high level, we need only observe that such expressions can yield a single simple-typed value or a sequence of nodes, simple-typed values, or both (as stated in the abstract quoted earlier).
- XPath adds some important keyword operators, including sequence operators such as for, conditional expressions, quantifiers, and set operations (intersection, difference, union). The except operator, for example, returns all the nodes in one sequence except those that also appear in another. You'll also find lots of typecasting or coercion keywords as well.
- Sequences define the key focus for XPath, and a sequence is best understood in the light of a short set of rules that Evan Lenz defines in his "What's New in XPath 2.0" story for XML.com:
1. Everything is a sequence: that's because in XPath 2.0 all expressions return sequences.
2. Sequences are shallow: one sequence may not nest inside another. Nesting simply produces a single flat sequence in which the members of the nested sequence appear in place among the members of the top-level sequence, ad infinitum.
3. Sequences are ordered: XPath 2.0 understands and clearly represents the order in which the members of a sequence occur, and preserves or creates whatever order you specify for results. In XPath 2.0, sequences take over for what were called node-sets in 1.0.
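The three sequence rules, along with the for, conditional, quantifier, and set operators mentioned above, can be sketched by modeling sequences as flat Python lists. This is only an emulation under stated assumptions, not an XPath 2.0 engine: the helper names seq, union, intersect, and except_ are hypothetical, and nodes are stood in for by integers that represent their position in document order. The XPath 2.0 expression each line imitates appears in the comment beside it.

```python
def seq(*items):
    """Hypothetical helper: build a flat sequence, splicing nested ones in place."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(seq(*item))  # nested sequences never survive as nests
        else:
            flat.append(item)
    return flat

# Rule 1: everything is a sequence, so a lone value is a one-item sequence.
single = seq(42)                      # XPath: 42 is the same as (42)

# Rule 2: sequences are shallow.
flat = seq(1, [2, [3, 4]], 5)         # XPath: (1, (2, (3, 4)), 5)

# Rule 3: sequences are ordered, exactly as written.
ordered = seq(3, 1, 2)                # XPath: (3, 1, 2)

# for expression; the per-item results are concatenated into one flat sequence.
pairs = seq(*[[x, x * 10] for x in seq(1, 2, 3)])
# XPath: for $x in (1, 2, 3) return ($x, $x * 10)

# Conditional expression.
label = "long" if len(flat) > 3 else "short"
# XPath: if (count($flat) > 3) then "long" else "short"

# Quantified expressions.
some_big = any(x > 2 for x in seq(1, 2, 3))   # some $x in (1, 2, 3) satisfies $x > 2
every_big = all(x > 2 for x in seq(1, 2, 3))  # every $x in (1, 2, 3) satisfies $x > 2

# Set operators apply to nodes; results have no duplicates and come back
# in document order. Integers stand in for node positions (an assumption).
def union(a, b):
    return sorted(set(a) | set(b))

def intersect(a, b):
    return sorted(set(a) & set(b))

def except_(a, b):
    return sorted(set(a) - set(b))

chapters = [1, 2, 3, 4]   # hypothetical node positions
drafts = [4, 2, 7]

print(flat, pairs, label)
print(union(chapters, drafts), intersect(chapters, drafts), except_(chapters, drafts))
```

Plain lists were chosen here because they are ordered and allow duplicates, which matches how sequences of atomic values behave; the set operators are kept separate because they apply only to nodes and remove duplicates.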
I have to agree with Lenz's assertion that XPath 2.0, despite a relatively low percentage of change, is a major reworking of and extension to XPath 1.0. It offers lots of power and capability that XML content developers should find interesting, compelling and useful.
About the author
Ed Tittel is a full-time writer and trainer whose interests include XML and development topics, along with IT certification and information security topics. Among his many XML projects are XML For Dummies, 4th edition (Wiley, 2005) and Schaum's Easy Outline of XML (McGraw-Hill, 2004). E-mail Ed at firstname.lastname@example.org with comments, questions or suggested topics or tools for review.