Profiling XML Schema
by Paul Kiel

Conclusion

Examining what schema designers are actually implementing can indeed reveal a usage profile of XML Schema. It is in this profile of practice that our five-year-old's personality emerges. The clearest message is one of simplicity. The most commonly used constructs involve merely creating reusable types, assembling them into sequences of elements, and augmenting them with enumerations. Many of the more complex features went unused. In addition, the test cases favored explicitness in their schemas, as evidenced by the avoidance of mixed and abstract content and by the qualified element form defaults. Adhering to the design patterns reflected in this usage profile will serve schema designers well.
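To make that profile concrete, here is a minimal sketch of a schema in the favored style: named reusable types, a sequence of elements, an enumeration, and a qualified element form default. The namespace `urn:example:orders` and the names `StatusType`, `OrderType`, and `Order` are hypothetical and do not come from any of the analyzed schemas. The schema is held in a Python string only so that it can be checked for well-formedness; the check does not validate it as a schema.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema (all names are illustrative assumptions).
SAMPLE_XSD = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="urn:example:orders"
            targetNamespace="urn:example:orders"
            elementFormDefault="qualified">

  <!-- A reusable simple type augmented with enumerations -->
  <xsd:simpleType name="StatusType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Open"/>
      <xsd:enumeration value="Closed"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- A reusable complex type assembled as a sequence of elements -->
  <xsd:complexType name="OrderType">
    <xsd:sequence>
      <xsd:element name="OrderId" type="xsd:string"/>
      <xsd:element name="Status" type="StatusType"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:element name="Order" type="OrderType"/>
</xsd:schema>
"""

# Parsing confirms well-formedness only, not schema validity.
root = ET.fromstring(SAMPLE_XSD)

# Collect the names of the top-level reusable types.
named_types = [
    el.get("name")
    for el in root
    if el.tag in (f"{{{XS}}}simpleType", f"{{{XS}}}complexType")
]
print(root.get("elementFormDefault"), named_types)
```

Nothing here uses mixed content, abstract types, or substitution groups, which is the point: the schema stays within the constructs the test cases used most.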

Appendix: The Data

The tables below present the results of my research. The schemas were all downloaded in early September 2006 from their respective websites (many of them are listed here). Figure 1 is a summary, Figure 2 shows how many schemas contain each XML Schema design feature listed, and Figure 3 shows how many times each feature occurred.

Figure 1. Summary of data.

Figure 2. Number of schemas using XML Schema features.

Figure 3. Number of occurrences of XML Schema features.
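
The difference between the two tallies (schemas that use a feature at all, as in Figure 2, versus total occurrences, as in Figure 3) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used for the article: the feature lists are a small hypothetical subset, the function names `count_features` and `profile` are invented here, and the counting is done with a parser rather than by searching the schema text.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical subset of features; the figures cover many more.
ELEMENT_FEATURES = ["complexType", "simpleType", "sequence", "enumeration"]
ATTRIBUTE_FEATURES = ["substitutionGroup"]

def count_features(schema_text):
    """Count how often each tracked feature occurs in one schema."""
    counts = Counter()
    for el in ET.fromstring(schema_text).iter():
        # Features that are XML Schema elements, e.g. <xsd:complexType>
        if el.tag.startswith(XS):
            local = el.tag[len(XS):]
            if local in ELEMENT_FEATURES:
                counts[local] += 1
        # Features that are attributes, e.g. substitutionGroup="..."
        for attr in ATTRIBUTE_FEATURES:
            if attr in el.attrib:
                counts[attr] += 1
    return counts

def profile(schemas):
    """Return (schemas_using, occurrences) for a dict of name -> schema text."""
    schemas_using = Counter()  # Figure 2 style: schemas containing the feature
    occurrences = Counter()    # Figure 3 style: total number of occurrences
    for text in schemas.values():
        counts = count_features(text)
        occurrences.update(counts)
        schemas_using.update(counts.keys())
    return schemas_using, occurrences

# Two tiny made-up schemas to exercise the tallies.
demo = {
    "a.xsd": '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
             '<xs:complexType name="A"><xs:sequence/></xs:complexType>'
             '<xs:complexType name="B"><xs:sequence/></xs:complexType>'
             '</xs:schema>',
    "b.xsd": '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
             '<xs:complexType name="C"/>'
             '<xs:element name="E" substitutionGroup="Base"/>'
             '</xs:schema>',
}
schemas_using, occurrences = profile(demo)
print(schemas_using["complexType"], occurrences["complexType"])
```

In the demo, both schemas use `complexType`, but it occurs three times in total, so the two tallies diverge in the same way the two figures do.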

Figure Notes

A few duplicative schemas were removed from the analysis, such as the schema for schemas (XMLSchema.xsd), which was commonly distributed with many libraries. ACORD also offers no-namespace equivalents of its schemas. For this analysis, the namespaced versions were used. In both the HR-XML and OAGi test files, the developer or "non-standalone" versions of the schemas were analyzed. While there are no substitutionGroups in the OAGi schemas, the global element design is intended to enable substitutions as an extension point. The W3C list of schemas includes MathML.


Comments on this article

  • Profiling XML Schema
    2006-09-26 11:40:16 MarkCrawford [Reply]

    Nice analysis as far as it goes. I would have expected to see some of the heavyweights in the B2B horizontal standards space such as X12 CICA and UN/CEFACT XML NDR. I would also have expected to see the UBL NDR included. And lastly, perhaps something from the public sector such as the Department of the Navy XML NDR. (Disclosure: I have been involved in all of these.) It should be noted that both the UBL NDR and the UN/CEFACT NDR have been the basis for several of the NDRs in this article, including CIDX, ACORD and OAGi. We are also beginning to see a convergence in this space, with the vertical standards development organizations (ACORD, OAGi, CIDX, AIAG, RosettaNet) establishing a path for transition to the UN/CEFACT NDRs, as well as the corresponding Core Components methodology for the underlying data models.

  • A nice start, but...
    2006-09-22 20:17:08 Robin Berjon [Reply]

    ...this article would largely benefit from being expanded into a more thorough analysis. On the one hand I do mean this as a compliment, but on the other I can't help noting that a lot of the interesting data has little or no useful analysis.


    On the more editorial side, listing "problems in the middle" is useful, but just saying "Declaring default values for data in the XML instance. I've blogged about default values before." for one of the items is not. There may or may not be interesting content on the other side of that link, but without at least a one-sentence description of what it may be I don't think many people are going to bother. Likewise, I don't want to know that the author thinks that "the analysis searched for "xsd:attributeGroup" resulting in matches for both declarations and reuse or "@ref""; I would be much more interested in reading the article after that bug in the analysis method has been fixed, since it doesn't feel exactly difficult.


    More importantly, I think an analysis of the type of schemata that were investigated is lacking, that is, a characterisation of the test set. For instance, the lack of mixed content and the poor showing of list and union just scream out that the data set is heavily data oriented, as these features tend to be very hard to avoid for anything document related (for mixed content, even something that is largely data but captures human-readable text at any point very likely either requires mixed content, or is not designed with I18N in mind and will need to add it, perhaps if only for ruby annotations).


    Finally, it has surfaced in studies last year that most XML Schema schemata were invalid. Were all of those in this test set validated? Is there a relationship between the features they use and their validity (perhaps, again, one of simplicity)? I think that would be interesting to know.


    I guess I'm ranting because I haven't seen an interesting article on xml.com in a long while, all in all I think this one raises the bar :)


  • you sure you haven't missed substitution groups
    2006-09-22 11:20:30 craigsalter [Reply]

    Good article. I've seen substitutionGroups used heavily in the OAGIS-8 schema. Is there a chance you missed these or did this project not make it into your data set?

    • you sure you haven't missed substitution groups
      2006-09-22 12:07:17 xmlhelpline [Reply]

      You are correct in that the 8.x version of OAGiS used substitutionGroups. I used the 9.0 version for my analysis, but made note of the fact that sGroups are used for extensibility. And in the data section at the end, I mention that their global element design was in part intended to accommodate later substitution via sGroups.

  • Thank you, Paul
    2006-09-22 02:23:53 WillemF [Reply]

    Reminds me of the state of OOD/OOP in the early '90s... in those days C++, Smalltalk etc. were often viewed as a "silver bullet" that you could easily use to shoot your own foot to pieces...


    Efforts like Paul's are bringing XML Schema into the realm of real, modern, working production solutions.


    Thanks Paul, I will use this as a reference to direct the focus of our CodeXS tool where required.

  • Profiling XML Schema-complexTypes by Restriction
    2006-09-21 19:19:04 Robert Leif [Reply]

    The subtyping of complexTypes by restriction has not been widely used because the original specification does not allow this between schemas with different namespaces. Thus, XML Schema does not allow the creation of reusable generic types (templates). This is clearly needed for object-oriented design. I hope that this will be remedied in the next version of the standard.
    Bob Leif

  • research paper
    2006-09-21 11:20:25 xmlhelpline [Reply]

    An excellent research paper was presented at the XML 2005 conference on this topic.


    http://www.idealliance.org/xmlusa/05/call/xmlpapers/49.1704/.49.html


  • Nice article!
    2006-09-21 07:20:38 mrowell44 [Reply]

    At the end of the day the profile of XML Schema is what is used in industry by standards, tools, and applications.


    Again, nice job Paul!