A follow-up on XML schemas
On 3/21/2001, my tip entitled "What's up with XML Schemas" appeared on the searchmiddleware.com Web site. In the past 6 weeks, there've been some interesting developments in this area (and in my understanding of the subject). The time is ripe for a follow-up.
To begin with, there are now some good tools that can help to validate schemas, one of which even purports to perform some quality control checks for conformance "under all the valid constraints that apply to schemas." These are as follows:
- XSV, an XML Schema Validator from University of Edinburgh/W3C (caution: beta software), available on the Web at https://www.w3.org/2001/03/webdata/xsv or as a Win32 self-extracting download at ftp://ftp.cogsci.ed.ac.uk/pub/XSV/XSV12.EXE
- IBM's XML Schema Quality checker: http://www.alphaworks.ibm.com/tech/xmlsqc
The latter tool is especially helpful for those learning about schemas, because it attempts to deliver its error messages in relatively straightforward, jargon-free language. (I will digress long enough to state that this kind of design philosophy would be entirely welcome in more markup tools, particularly validators.)
Next, I'm starting to understand that schemas add value above and beyond what SGML DTDs can do for XML markup definitions. This is true for two significant reasons that I've been learning more about lately (and I'm sure I'll discover more as I become more schema-literate, and as the proposed recommendation moves ever closer to final recommendation status):
- XML Namespaces are becoming increasingly important, not just for identifying specific XML applications (or rather, the markup definitions that go along with such applications), but also for modularization of those applications. The recently finalized XHTML modularization recommendation, published on April 10, 2001, is an excellent case in point (https://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/). Because schemas are namespace-aware (which DTDs are not), they can leverage meaningful subsets of XML-based markup, which is where XHTML modules come into play.
- Schemas support a much richer and better-defined collection of datatypes than do SGML DTDs. By recognizing and identifying various types of numbers, strings, Boolean values, and so forth, XML Schemas make it possible to apply stricter typing to XML documents. That in turn makes it much easier for document designers to build data value, range, or type checks for element content and attributes into the document definitions themselves. Though this may sound rather abstruse and arcane, it's of incredible value because such checks then become part of the document validation process. Thus, validation can check not only for well-formedness and compliance with document structure and syntax constraints, but also for value and content constraints.
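To make the datatype point a bit more concrete, here is a minimal sketch in Python. The schema fragment, the quantity element, and the load_bounds and check_quantity helpers are all invented for illustration. The hand-rolled check only imitates what a real schema validator would do with the xs:integer type and its minInclusive/maxInclusive facets; it is a toy, not a validator.

```python
import xml.etree.ElementTree as ET

# Namespace name used by the XML Schema vocabulary itself.
XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: <quantity> must hold an integer from 1 to 100.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="quantity">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="1"/>
        <xs:maxInclusive value="100"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
"""

def load_bounds(schema_text):
    """Read the integer range declared in the schema's restriction."""
    root = ET.fromstring(schema_text)
    restriction = root.find(f".//{{{XS}}}restriction")
    low = int(restriction.find(f"{{{XS}}}minInclusive").get("value"))
    high = int(restriction.find(f"{{{XS}}}maxInclusive").get("value"))
    return low, high

def check_quantity(document_text, schema_text=SCHEMA):
    """Toy check: is the element content an integer inside the declared range?"""
    low, high = load_bounds(schema_text)
    text = ET.fromstring(document_text).text or ""
    try:
        value = int(text.strip())
    except ValueError:
        return False  # content is not an integer at all (a type error)
    return low <= value <= high  # False here means a range error

print(check_quantity("<quantity>42</quantity>"))    # True
print(check_quantity("<quantity>250</quantity>"))   # False: out of range
print(check_quantity("<quantity>lots</quantity>"))  # False: not an integer
```

The range lives in the schema, not in the checking code. Changing the two facet values changes what counts as a valid document without touching any application logic, and a DTD has no comparable way to state such a rule.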
Then, too, there's the notion that more and more XML specifications are starting to use or require XML schemas or namespaces (which amounts to a requirement for schemas, given that there's no way to validate namespace information through a DTD). Thus, it's inarguable that schemas have an increasingly important role to play in the further definition of XML markup and applications.
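The namespace point lends itself to a small sketch as well. In the Python example below, the urn:example: namespace names and the order and title elements are invented for illustration. A namespace-aware parser treats two elements that share the local name title as distinct, whereas a DTD can only match the literal prefixed name the author happened to type.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two vocabularies that both define <title>.
DOC = """\
<order xmlns:bk="urn:example:books" xmlns:hr="urn:example:people">
  <bk:title>XML in a Nutshell</bk:title>
  <hr:title>Principal</hr:title>
</order>
"""

root = ET.fromstring(DOC)

# ElementTree expands each prefix to its namespace name: "{namespace}local".
tags = [child.tag for child in root]
print(tags)

# Look up an element by namespace, not by whatever prefix the author chose.
book_title = root.find("{urn:example:books}title").text
print(book_title)
```

If the prefixes bk and hr were renamed in the document, the lookup would still succeed, because the match is made on the namespace name and not on the prefix.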
That's why I repeat my pointers to the XML Schema specifications from my previous tip on this subject. You'll find this information in three parts:
- Part 0: XML Schema Primer (https://www.w3.org/TR/2001/PR-xmlschema-0-20010316)
- Part 1: XML Schema Structures (https://www.w3.org/TR/2001/PR-xmlschema-1-20010316)
- Part 2: XML Schema Datatypes (https://www.w3.org/TR/2001/PR-xmlschema-2-20010316)
Look further for additional information on schemas at sites like www.xmlhack.com, www.xml.com, www.devx.com, and here at www.searchmiddleware.com, among others. A quick visit to Amazon shows that there are 3 books on XML Schemas already underway, with publication dates between June and September, with more surely on the way. Shortly, there should be no shortage of good information on this increasingly important XML topic!
Send an e-mail to Ed at firstname.lastname@example.org if you have questions on this or other XML topics.
Ed Tittel is a principal at LANWrights, Inc., a wholly owned subsidiary of LeapIt.com. LANWrights offers training, writing, and consulting services on Internet, networking, and Web topics (including XML and XHTML), plus various IT certifications (Microsoft, Sun/Java, and Prosoft/CIW).
XML in a Nutshell : A Desktop Quick Reference
Author : Elliotte Rusty Harold and W. Scott Means
Publisher : O'Reilly & Associates
Published : Jan 2001
XML in a Nutshell covers the fundamental rules that all XML documents and authors must adhere to, detailing the grammar that specifies where tags may be placed, what they must look like, which element names are legal, how attributes attach to elements, and much more.