Profiling XML Schema
XML Schema is now 5 years old, having matured from a newborn into an active youngster. So what have we learned about this young one's personality? We've always known it was complex. Indeed, the original debate about whether to make it a Recommendation indicated concern. (See Last Word and Questionnaire.) This rich toolset has caused schema designers to wonder which features they should or should not use. If we analyze what people are actually implementing, perhaps we can glean some guidance. I decided to embark on a quest to see if we can put together a profile of XML Schema based on experiences thus far.
A profile is a set of agreed upon practices reflecting the most commonly accepted usage patterns of a given technology. A usage profile of XML Schema would indicate a series of features that are commonly implemented and supported in tools.
The concept of a profile of XML Schema has been debated over the years. In 2004, the Web Services Interoperability Consortium went so far as to charter a work group (PDF) to examine the idea of a formal profile for XML Schema. It led the W3C to hold a Workshop on XML Schema 1.0 User Experiences, where input from many sources was culled into a plan of action. The resultant work in the XML Schema Patterns for Databinding Working Group is ongoing, having recently issued a draft document.
In addition, many industry consortia have issued design guidelines and/or patterns for developing libraries of schemas according to their profile. Having read many of these either formally or informally, they are often explicit about what features of XML Schema are allowed or disallowed. Indeed, tools for addressing enforcement of schema profiles have emerged. Schematron is often used to add additional constraints on top of the ones in a schema. Mindreef's SOAPScope Server has explicit support for creating customizable profiles of schema, offering a standard check box listing of constructs that are enforced in test cases. However, I wondered if there was a cross-industry usage profile.
Strengths and Weaknesses
The first stop in the search for a profile is with an analysis of schema itself. In fact, for some time now, we've known about the pluses and minuses of many schema design techniques, as put together by Roger Costello on his Best Practices website. He has done an excellent job of gathering commentary and opinion from implementers, and assembling it into a cohesive analysis of the benefits and drawbacks of many design criteria. Wise schema designers have referred to this site for years.
On a practical level, schema designers may also wonder what impact each feature of XML Schema has down the road when it comes time for implementation. Some constructs are ubiquitous and well supported in IDEs and other tools such as code generating software. However, there are some that are not commonly used and can cause problems when coders are confronted with a lack of tool support.
What Are Schema Designers Actually Doing?
In the next phase of my profile quest, I wanted to take what Costello has done a step further and ask: what features of XML Schema are folks actually using? Is there a consensus of opinion on the most common constructs? Are there features schema designers are avoiding? I accumulated data on over 1,400 schemas from numerous standards consortia to see if there is a common XML Schema profile reflecting a consensus of practice.
I focused on consortia schemas thinking that these should reflect a group's consensus of design criteria as well as have a disproportionate impact on the marketplace, as they are standard and implemented many times across a domain. These schemas are also all freely available.
I examined schemas from the following organizations:
- The Open Applications Group (OAGi)
- The Open Travel Alliance (OTA)
- Human Resources XML (HR-XML)
- Chemical Industry Data Exchange (CIDX)
- IMS Global Learning Consortium (IMS)
- Association for Retail Technology Standards (ARTS)
- Mortgage Industry Standards Maintenance Organization (MISMO)
- World Wide Web Consortium (W3C, including mathML)
- Global Justice XML
There are many other consortia that could, and with infinite time would, be added to this analysis.
Many of the more mature tools have a high level of support for XML Schema features. In particular, IDEs that edit schemas have good support even for problematic features such as
xsd:union elements. The problem with tool support comes in two forms. First, some have chosen not to support selected schema constructs. This amounts to a profile by design. Secondly, "best-of-breed" tools can offer support only for the most common schema constructs in their early releases. As the tool matures, it may plan to add support for additional features. I've blogged about tool support of XML Schema and shown references to a few code-generation tools and their publicly available claims of support.