XML.com: XML From the Inside Out

A Path to Enlightenment

August 29, 2001

XML-DEV has been busy recently, with a number of long-running threads drawing in some interesting postings. The underlying theme has been general orientation: how best to understand and come to terms with particular technologies, ranging from Schemas to Web Services. This week the XML-Deviant walks the XML-DEV path to enlightenment to see where it leads.

Web Services Debunked

The first steps on the path are familiar territory: deflating unnecessary hype. This discussion began in response to Edd Dumbill's recent Taglines column. Of particular interest was the accusation that marketing fanfare is overhyping "web services".

The response was mixed; while most agreed that the hype was selling more than could be reasonably delivered, others remained adamant that technologies such as SOAP brought real value, Michael Brennan among them.

We've been doing integrations across the web using XML messaging for several years now -- integrating with CRM systems, order entry systems, synchronizing user profile info with directory services, providing single sign-on solutions that integrate portals with hosted solutions across the web. It's been working just fine. When SOAP came along, we aligned our approach to SOAP. It's still working just fine. And last year we extended our approach to include SOAP-based integrations with desktop productivity tools -- MS Outlook and Excel -- allowing users to leverage our service from non-browser tools, and to be able to synchronize data with applications employed for offline use.

Those who don't see the proof that this works are simply not looking.

Brennan demonstrates what seems to be a common perception: web services are simply a formulation, albeit using a new set of technologies, of what many have been doing for years. Michael Champion expounded this view, suggesting that SOAP over HTTP is simply an alternative to URLs from Hell.

I personally (obligatory disclaimer ...) suspect that SOAP over HTTP will find its niche mainly as a cleaner, more standardized way of doing what people have been doing with HTTP parameters and CGI scripts "forever". I've sweated over the production and parsing of enough URLs from Hell that I grok the SOAP / UDDI / WSDL vision of doing this in a more orderly manner. Whether that provides a solid foundation for Yet Another Paradigm is another matter entirely.

It hardly seems like a new paradigm for application development, does it? In a later posting, Champion also explored answers to the question, what are web services good for?

Another part of the discussion was some clarification of what SOAP actually is. At various times it has been compared to CORBA and similar distributed object systems; you can also find similar comparisons between XML-RPC and CORBA. However, the comparison isn't justified. Joshua Allen provided a clear appraisal of how SOAP fits into the distributed object framework.

You could probably consider SOAP and CORBA as complementary. SOAP to IIOP might be a better comparison. The three "big" object server models out there have been CORBA, EJB, and COM+ -- these three use IIOP, RMI, and DCOM respectively as the primary method to pass information to and from objects. Now that SOAP is on the scene, CORBA, EJB and COM+ don't go away, they just have another way to pass information to and from objects. In fact, before SOAP, there were many ways to get these three different worlds to interoperate -- the difference with SOAP is that the interop layer is based on XML, supposedly easier to implement than something like an RMI/DCOM bridge, and so on. For example, if I have some objects written in CORBA that provide some service, I no longer have to convince all of my customers to install an IIOP communication layer. With SOAP, the layer that calls my CORBA object could be as simple as a UNIX bash script that pipes some text through netcat. So I think of SOAP as being a universal IIOP/RMI/DCOM substitute that mere mortals can type by hand.
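Allen's remark that a SOAP call is just text a person could type by hand can be sketched roughly as follows. This is a minimal illustration, assuming a SOAP 1.1 envelope sent as an HTTP POST; the host, path, namespace URI, method name and parameter are all hypothetical, and the function only builds the request bytes (the text one might pipe through netcat) rather than sending them.

```python
from xml.sax.saxutils import escape

# SOAP 1.1 envelope namespace
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(host, path, service_ns, method, params):
    """Return the raw bytes of an HTTP POST carrying a SOAP 1.1 envelope.

    host, path, service_ns, method and params are illustrative inputs;
    nothing here is tied to a real service.
    """
    # Each parameter becomes a child element of the method element.
    args = "".join(
        f"<{name}>{escape(str(value))}</{name}>" for name, value in params.items()
    )
    envelope = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<soap:Envelope xmlns:soap="{SOAP_ENV}">'
        "<soap:Body>"
        f'<m:{method} xmlns:m="{service_ns}">{args}</m:{method}>'
        "</soap:Body>"
        "</soap:Envelope>"
    )
    body = envelope.encode("utf-8")
    # Plain HTTP headers; the SOAPAction value is an assumed convention.
    headers = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        'Content-Type: text/xml; charset="utf-8"\r\n'
        f"Content-Length: {len(body)}\r\n"
        f'SOAPAction: "{service_ns}#{method}"\r\n'
        "\r\n"
    )
    return headers.encode("ascii") + body


if __name__ == "__main__":
    request = build_soap_request(
        "example.com", "/quotes", "urn:example:quotes", "GetQuote", {"symbol": "ORLY"}
    )
    print(request.decode("utf-8"))
```

The whole request is a few header lines plus one small XML document, which is the sense in which the interop layer needs no special client library.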

Henrik Frystyk Nielsen's explanation was much pithier and to the point.

I would just as a reminder like to point out that SOAP doesn't aspire to be a distributed object system. It is a nothing more than a wire protocol.

And, as Michael Brennan explained, the Web Services Description Language (WSDL) completes the picture by providing functionality that COM and EJB developers have long been using.

WSDL was motivated by concrete experience with early SOAP implementations. For those who tried to develop tools that supported creating client side interfaces that map to a specific service -- as any developer using CORBA, EJBs, or COM is accustomed to -- it was clear that something like this was needed.

After only a brief journey down our path, we've learned an interesting lesson: web services, as embodied by SOAP and WSDL, don't offer any functionality that developers haven't already had for some time. But SOAP-WSDL achieves things in a way that is potentially more open and cross-platform. While these are certainly laudable goals, it doesn't seem like there's as much to the web service revolution as many (would have us) believe.

Seeking Validation

Moving on, it seems that articles like Don Smith's "Understanding W3C Schema Complex Types" are helping users come to terms with W3C XML Schemas; other comments on XML-DEV suggest that many are making progress on their migration away from DTDs. This seems particularly true for less ambitious uses, as Len Bullard related.

I think XML Schemas are too hard because we aren't really sure what they are supposed to do... For me they are easy because... most of what I want to represent can be done in DTDs. Still, I find myself creating restricted simple types for reuse to pick up the extra power of regexes and that is a step beyond DTDs.
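The "restricted simple types" Bullard mentions are the sort of thing a DTD cannot say: a string type narrowed by a regular expression. The sketch below embeds a small XSD fragment and imitates its pattern facet with Python's re module. The type name PartNumber and its pattern are invented for illustration, and this is not a schema validator: it only reads the xs:pattern values and applies them, relying on the fact that an XSD pattern must match the whole value (hence fullmatch). XSD regex syntax is not identical to Python's, so the approach only holds for simple patterns like this one.

```python
import re
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment: a string restricted by a regular expression.
SCHEMA_FRAGMENT = r"""
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="PartNumber">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{2}-\d{4}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""


def pattern_facets(schema_text):
    """Map each named simple type to the list of its xs:pattern values."""
    root = ET.fromstring(schema_text)
    facets = {}
    for simple_type in root.iter(f"{{{XS}}}simpleType"):
        patterns = [p.get("value") for p in simple_type.iter(f"{{{XS}}}pattern")]
        facets[simple_type.get("name")] = patterns
    return facets


def matches_type(value, patterns):
    """True if value matches at least one pattern in its entirety."""
    return any(re.fullmatch(p, value) for p in patterns)


if __name__ == "__main__":
    facets = pattern_facets(SCHEMA_FRAGMENT)
    for candidate in ("AB-1234", "ab-1234", "AB-12345"):
        print(candidate, matches_type(candidate, facets["PartNumber"]))
```

Once declared, such a type can be reused for any element or attribute that carries a part number, which is the reuse Bullard describes.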

Peter Piatko reported similar experiences.

I can't say that I totally understand model groups, substitution groups and all of the inheritance rules.

OTOH, I believe I can write simple schemas w/o ever using these features. I am a firm believer that simple tasks should be easy to do. Sometimes complex tasks are hard to do and there is no getting around it, so perhaps some of the complexities of XML Schema are necessary. As long as they don't get in the way of the simple tasks I'd be happy.

However, as Joe English warned, no system is an island and others may be far more ambitious.

...how many of these features are you going to encounter as you interact with other data systems? That's the real problem with complex schema languages. No system is an island, and sooner or later you're going to have to understand all what you're being given, not just what you produced. Even with good tools, concepts can be hard to grasp, which is the point.

Slightly further down the road, we learned that W3C XML Schemas are about more than just validation. This is no great surprise, but it's useful to see it clearly spelled out. Interestingly, Michael Brennan observed that this may be the cause of some of the frustration directed at the XML Schema specification.

I [...] think that issues about things Schemas cannot represent are probably a frustration to those trying to use it as a validation language. I know that we have been adapting some of our integration interfaces to use XML structures that can be adequately expressed with XML Schema. I think you have to question the value of a validation language when you find that you are redesigning your XML structures to accommodate the weaknesses of the schema language. In our case, this means changing our structures such that an element's content model is not identified by an attribute value. RELAX NG can accommodate this, but XSD cannot.
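The kind of structure Brennan describes, where an attribute value decides which children an element may contain, can be made concrete with a small hand-written check. In RELAX NG such a rule is typically written as a choice between groups, each pairing an attribute value with a content pattern. The Python below is neither RELAX NG nor XSD; it is only a sketch of the constraint itself, and the contact element, its type attribute and the child element names are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: value of the "type" attribute -> the child
# element names that must appear, in order.
CONTENT_MODELS = {
    "email": ["address"],
    "phone": ["country", "number"],
}


def check_contact(xml_text):
    """Return a list of problems; an empty list means the element is valid."""
    elem = ET.fromstring(xml_text)
    kind = elem.get("type")
    if kind not in CONTENT_MODELS:
        return [f"unknown type attribute: {kind!r}"]
    expected = CONTENT_MODELS[kind]
    actual = [child.tag for child in elem]
    if actual != expected:
        return [f"type={kind!r} requires children {expected}, found {actual}"]
    return []


if __name__ == "__main__":
    good = '<contact type="email"><address>someone@example.com</address></contact>'
    bad = '<contact type="phone"><address>someone@example.com</address></contact>'
    print(check_contact(good))
    print(check_contact(bad))
```

The same element name carries two different content models here, selected by the attribute, which is the design Brennan's team had to give up in order to describe its interfaces in XSD.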

This is an intriguing observation, as it suggests that if all you need is a schema language slightly more sophisticated than DTDs, RELAX NG may be the appropriate choice. RELAX NG is solely about validation and is, therefore, likely to be a better fit for that particular use case. Further, if you need only simple datatyping, then a mix of RELAX NG and Schematron may be enough. Rick Jelliffe demonstrated this week that Schematron can be used for typechecking.

Most other schema languages have built-in types. I guess that since people will tend to evaluate schema languages using a check-box, they might put "no datatyping" on Schematron, when really they mean no "built-in" data types (apart from the XPath ones: number, string, boolean).
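To see how the XPath types alone can support a type check, consider one common idiom (not necessarily the one Jelliffe posted): a Schematron assertion whose test is number(.) = number(.). In XPath 1.0, number() yields NaN for anything that is not a number, and NaN is never equal to itself, so the assertion fails exactly for non-numeric content. Python's standard library cannot evaluate that XPath expression, so the sketch below emulates it; the price element, the message text and the helper names are assumptions made for the example.

```python
import math
import re
import xml.etree.ElementTree as ET

# XPath 1.0 Number syntax: optional whitespace, optional minus sign,
# digits with an optional fraction (or a bare fraction), optional whitespace.
_XPATH_NUMBER = re.compile(r"[ \t\r\n]*-?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)[ \t\r\n]*")


def xpath_number(text):
    """Approximate XPath 1.0 number(): a float, or NaN if text is not a Number."""
    if _XPATH_NUMBER.fullmatch(text):
        return float(text)
    return math.nan


def failed_asserts(xml_text, context, message):
    """Emulate one rule: for every `context` element, assert number(.) = number(.)."""
    failures = []
    for elem in ET.fromstring(xml_text).iter(context):
        value = "".join(elem.itertext())  # the element's XPath string-value
        n = xpath_number(value)
        if not (n == n):  # NaN never equals itself, so non-numbers fail here
            failures.append(f"{message}: {value!r}")
    return failures


if __name__ == "__main__":
    doc = "<order><price>12.50</price><price>cheap</price><price>1e3</price></order>"
    for failure in failed_asserts(doc, "price", "price must be a number"):
        print(failure)
```

Note that 1e3 is reported as a failure: XPath 1.0 numbers have no exponent notation, which illustrates Jelliffe's distinction between having a few XPath types and having a library of built-in datatypes.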
