Designing Schemas for Business to Business E-Commerce

June 15, 2000

Leigh Dodds

In a fast-paced session at XML Europe, Arofan Gregory, Lead Scientist and Manager of the XML Common Business Library, provided an overview of the role of XML Schemas in e-commerce and gave some guidelines for good design.

The Role of Schemas

In the introduction to his presentation, Arofan Gregory stressed the importance of XML schemas in developing robust, extensible business-to-business (B2B) e-commerce applications. Gregory believes the lack of formal validation in EDI messaging leaves the framework open to "abuse". Custom extensions to standards lead to "tag bloat", which means that the same information can frequently be expressed in multiple ways.

Use of XML brings the key advantage of providing DTD-based validation using generic tools. However, DTDs do not have the data-typing features which are essential for EDI applications. This is an area in which XML schema languages excel, making them an enabler for XML/EDI application design.

Schema languages not only provide the rich data typing associated with ordinary programming languages, but also include the capacity for new types to be defined. This means that typing can be customised for particular application contexts, e.g. to enforce particular number formats, field lengths, etc.
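The idea of deriving application-specific types from a built-in base type can be sketched in a few lines of Python. This is only an illustration of the concept, not the syntax of any schema language: the StringType class, its max_length and pattern facets, and the PartNumber and UnitPrice types are all hypothetical names invented for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StringType:
    """A hypothetical string data type with two optional facets."""
    name: str
    max_length: Optional[int] = None   # maximum field length, if any
    pattern: Optional[str] = None      # regular expression the whole value must match

    def is_valid(self, value: str) -> bool:
        """Return True when the value satisfies every facet that is set."""
        if self.max_length is not None and len(value) > self.max_length:
            return False
        if self.pattern is not None and re.fullmatch(self.pattern, value) is None:
            return False
        return True


# New types customised for one (imaginary) trading context.
PART_NUMBER = StringType("PartNumber", max_length=10, pattern=r"[A-Z]{2}-\d{4,7}")
UNIT_PRICE = StringType("UnitPrice", pattern=r"\d+\.\d{2}")

if __name__ == "__main__":
    for candidate in ("AB-1234", "ab-1234", "AB-12345678"):
        print(candidate, PART_NUMBER.is_valid(candidate))
```

A generic validator only needs to know about the facets; the number formats and field lengths live in the type definitions, which is what allows them to be shared between trading partners.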

The second key advantage of schema languages is the provision for breaking a schema into separate components. This encourages reuse of existing definitions, which improves the chances of achieving interoperability. Reuse of schema definitions was a central theme in Gregory's presentation.

Design Guidelines

Throughout the presentation, and in the accompanying conference paper, Gregory provided many useful guidelines for creating well-designed e-commerce schemas. As an example of what not to do, Gregory observed that early efforts to apply XML simply redefined the EDI message using an XML syntax. This yields little advantage, as no extra semantics have been added.

Gregory commented that there is a lot to be gained from looking at existing EDI frameworks and standards. EDI applications have been in use for nearly 10 years, so the analysis to define data types has been well-tested in the field. Existing type definitions can be used as a starting point when developing your schemas. The key issue is to get the data types clearly defined and agreed with trading partners. Structural differences are less important, as they can be removed by transformations.
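The point that structural differences can be transformed away, provided the data types are agreed, can be illustrated with a small Python sketch. It assumes two hypothetical trading partners who exchange the same order data, one as attributes and the other as child elements; the Order document and the attributes_to_elements function are invented for this example and do not come from xCBL or any other library discussed here.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(order_xml: str) -> str:
    """Rewrite each attribute of the root element as a child element.

    The values themselves are copied unchanged: only the structure differs
    between the two representations, not the underlying data types.
    """
    source = ET.fromstring(order_xml)
    target = ET.Element(source.tag)
    for name, value in source.attrib.items():
        child = ET.SubElement(target, name)
        child.text = value
    return ET.tostring(target, encoding="unicode")


# Partner A's (hypothetical) attribute-based form of an order header.
partner_a = '<Order number="1001" currency="EUR"/>'

# Partner B's element-based form, produced by the transformation.
partner_b = attributes_to_elements(partner_a)

if __name__ == "__main__":
    print(partner_b)
```

In practice a transformation like this would more likely be written in XSLT; the sketch only shows why agreement on what "number" and "currency" mean matters more than where they sit in the document.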

Data types should be defined with extension and refinement in mind. To this end, the core structures should be minimal--additional constraints can be applied through additive refinement of the type. Gregory stressed that subtractive refinement (starting with a complex type definition and removing unnecessary constraints) is less useful and produces types that are harder to reuse.
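Additive refinement can be sketched as follows, assuming a deliberately simplified model in which a data type is nothing more than a name and a list of constraints. The DataType class, its refine method, and the Identifier and SupplierId types are hypothetical and exist only to illustrate the guideline.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# A constraint is any function that accepts or rejects a string value.
Constraint = Callable[[str], bool]


@dataclass(frozen=True)
class DataType:
    """A hypothetical data type: a name plus the constraints a value must meet."""
    name: str
    constraints: Tuple[Constraint, ...] = ()

    def refine(self, name: str, *extra: Constraint) -> "DataType":
        """Derive a new type by adding constraints; the base type is left untouched."""
        return DataType(name, self.constraints + extra)

    def accepts(self, value: str) -> bool:
        return all(check(value) for check in self.constraints)


# A minimal core type: an identifier only has to be non-empty.
IDENTIFIER = DataType("Identifier", (lambda v: len(v) > 0,))

# A refinement for one business context: at most 8 alphanumeric characters.
SUPPLIER_ID = IDENTIFIER.refine("SupplierId", lambda v: len(v) <= 8, str.isalnum)

if __name__ == "__main__":
    for candidate in ("ACME42", "ACME-42", "TOOLONGNAME9"):
        print(candidate, IDENTIFIER.accepts(candidate), SUPPLIER_ID.accepts(candidate))
```

Because refine only ever adds constraints, every value accepted by SupplierId is also accepted by Identifier, so software written against the core type keeps working with the refined one. A subtractive approach offers no such guarantee.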

The schema itself should be designed from the perspective of the business process, and not a particular application. This gives a greater degree of flexibility and better future-proofing as it makes no assumptions about either who or what will be processing the messages.

As use of XML schemas grows, the potential for reuse expands. Alongside the Common Business Library (xCBL), Gregory highlighted several other efforts worthy of a closer look. BizTalk provides a great repository of schemas, and RosettaNet is an excellent source of information on process and data set analysis. However, Gregory singled out ebXML as the activity most likely to yield successful results and close the gap between the XML and EDI communities.

With respect to particular schema languages, Gregory clearly supported the efforts of the W3C Schema Working Group. He also confirmed that while xCBL currently holds schema components in SOX and XDR formats, these will shortly be supplemented with XML Schema definitions.