Converting Between XML and JSON
by Stefan Goessner

Examples

Now let's look at two examples using the insight we've gained thus far. Microformats are well suited because they are an open standard and short enough for a brief discussion.

XOXO, as a simple XHTML-based outline format, is one of several microformats. The slightly modified sample from the Draft Specification reads:

<ol class="xoxo">
  <li>Subject 1
    <ol>
        <li>subpoint a</li>
        <li>subpoint b</li>
    </ol>
  </li>
  <li><span>Subject 2</span>
    <ol compact="compact">
        <li>subpoint c</li>
        <li>subpoint d</li>
    </ol>
  </li>
</ol>

Now we apply the patterns above to convert this XML document fragment to a JSON structure.

  1. The outer list with two list items is converted using pattern 6.
  2. The first list item contains a single piece of text content, "Subject 1", and an inner list element, so it can be treated according to pattern 7.
  3. The first inner list is converted with pattern 6 again.
  4. Pattern 5 is applied to the second item of the outer list.
  5. The second inner list is converted using a combination of patterns 3 and 6.

Here is the resulting JSON structure, which is reversible without losing any information.

"ol": {
  "li": [ 
    {
      "#text": "Subject 1",
      "ol": {
        "li": ["subpoint a", "subpoint b"]
      }
    },
    {
      "span": "Subject 2",
      "ol": {
        "@compact": "compact",
        "li": ["subpoint c", "subpoint d"]
      }
    }
  ]
}
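
As a minimal sketch of how such a structure is consumed in JavaScript, the fragment above can be wrapped in braces to form a complete object and then read with ordinary property access. The variable names are illustrative assumptions; only the keys come from the structure shown above.

```javascript
// The JSON structure from above, wrapped in braces so it is a complete object.
const xoxo = {
  "ol": {
    "li": [
      { "#text": "Subject 1", "ol": { "li": ["subpoint a", "subpoint b"] } },
      {
        "span": "Subject 2",
        "ol": { "@compact": "compact", "li": ["subpoint c", "subpoint d"] }
      }
    ]
  }
};

// Keys such as "#text" and "@compact" are not valid identifiers,
// so they need bracket notation rather than dot notation.
const [first, second] = xoxo.ol.li;
const firstTitle = first["#text"];      // text content of the first <li>
const secondTitle = second.span;        // content of the <span> child
const compact = second.ol["@compact"];  // attribute of the second inner <ol>

// Collect every subpoint from both inner lists, in document order.
const allSubpoints = xoxo.ol.li.flatMap(item => item.ol.li);
```

Note that the two subject titles are reached differently ("#text" versus "span"), which mirrors the difference between the two list items in the XML source.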

hCalendar is another microformat, based on the iCalendar standard. We'll ignore the fact that the iCalendar format itself could be converted to JSON more easily, and look instead at an hCalendar event example. It is also slightly modified, so that it is a structured rather than a mixed, semi-structured document fragment.

<span class="vevent">
  <a class="url" href="http://www.web2con.com/">
    <span class="summary">Web 2.0 Conference</span>
    <abbr class="dtstart" title="2005-10-05">October 5</abbr>
    <abbr class="dtend" title="2005-10-08">7</abbr>
    <span class="location">Argent Hotel, San Francisco, CA</span>
  </a>
</span>

Here, patterns 2, 3, 4, 5 and 6 are used to generate the following JSON structure:

"span": {
  "a": {
    "@class": "url",
    "@href": "http://www.web2con.com/",
    "span": [
      { "@class": "summary", "#text": "Web 2.0 Conference" },
      { "@class": "location", "#text": "Argent Hotel, San Francisco, CA" }
    ],
    "abbr": [
      { "@class": "dtstart", "@title": "2005-10-05", "#text": "October 5" },
      { "@class": "dtend", "@title": "2005-10-08", "#text": "7" }
    ]
  }
}

This example demonstrates a conversion that does not preserve the original element order: the two <span> elements and the two <abbr> elements are each grouped into an array, so their interleaved sequence is lost. Even though this may not change the semantics here, we can respond in one of three ways:

  1. state that a satisfactory conversion isn't possible.
  2. tolerate the result if order doesn't matter.
  3. try to make our XML document more JSON-friendly.

In many cases the last option may not be acceptable, at least when the XML document is based on existing standards. But in other cases, it may be worth the effort to consider some subtle XML changes that make XML and JSON play nicely together. Changing the <abbr> elements to <span> elements in the hCalendar example would be an improvement, because all four child elements would then share one name and end up in a single array that keeps their order.
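
The effect of grouping same-named siblings can be sketched in a few lines. The helper groupChildren below is a hypothetical illustration, not part of xml2json; it assumes each child is given as a name plus a value (here the class attribute stands in for the whole element) and mimics how repeated element names are collected into arrays.

```javascript
// Hypothetical helper (not the article's xml2json): groups child elements
// by name and turns repeated names into arrays.
function groupChildren(children) {
  const grouped = {};
  for (const { name, value } of children) {
    if (!Object.prototype.hasOwnProperty.call(grouped, name)) {
      grouped[name] = value;                   // first occurrence: plain value
    } else if (Array.isArray(grouped[name])) {
      grouped[name].push(value);               // third and later occurrences
    } else {
      grouped[name] = [grouped[name], value];  // second occurrence: start an array
    }
  }
  return grouped;
}

// Children of the <a> element in the hCalendar example, in document order.
const original = [
  { name: "span", value: "summary" },
  { name: "abbr", value: "dtstart" },
  { name: "abbr", value: "dtend" },
  { name: "span", value: "location" }
];

// Order in which a JSON-to-XML step would meet the values again.
const mixedOrder = Object.values(groupChildren(original)).flat();

// Same children, but with every element renamed to <span>.
const allSpans = original.map(child => ({ ...child, name: "span" }));
const spanOrder = Object.values(groupChildren(allSpans)).flat();
```

With the mixed element names, "location" moves ahead of "dtstart" and "dtend"; with a single element name, the sequence comes back unchanged.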

XML is a document-centric format, while JSON is a format for structured data. This fundamental difference may be irrelevant, as XML is also capable of describing structured data. If XML is used to describe highly structured documents, these may play very well together with JSON.

Problems may arise if XML documents do the following:

  • implicitly rely on element order
  • contain a lot of semi-structured data

As proof of this concept, I have implemented two JavaScript functions -- xml2json and json2xml -- based on the six patterns above, which can be used for the following:

  • client-side conversion
    • a parsed XML document via DOM to a JSON structure
    • a JSON structure to a (textual) XML document
  • implementing converters in other, server-side languages
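
To give a feel for the JSON-to-XML direction, here is a minimal sketch. It is not the json2xml function mentioned above; toXml and escapeXml are hypothetical names, and the code assumes only the conventions visible in the examples: keys starting with "@" are attributes, "#text" is text content, and arrays repeat an element.

```javascript
// Escapes the characters that are unsafe in XML text and attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical serializer: turns one named JSON value into XML text.
function toXml(name, value) {
  if (Array.isArray(value)) {
    // An array means the element occurs several times in sequence.
    return value.map(item => toXml(name, item)).join("");
  }
  if (value === null || value === undefined) {
    return `<${name}/>`;
  }
  if (typeof value !== "object") {
    return `<${name}>${escapeXml(value)}</${name}>`;
  }
  let attributes = "";
  let content = "";
  for (const [key, child] of Object.entries(value)) {
    if (key.startsWith("@")) {
      attributes += ` ${key.slice(1)}="${escapeXml(child)}"`;
    } else if (key === "#text") {
      content += escapeXml(child);
    } else {
      content += toXml(key, child);
    }
  }
  // Assumption: elements without content are written in self-closing form.
  return content === ""
    ? `<${name}${attributes}/>`
    : `<${name}${attributes}>${content}</${name}>`;
}

const xml = toXml("ol", {
  "@compact": "compact",
  "li": ["subpoint c", "subpoint d"]
});
```

The sketch writes no indentation and emits child elements in key order, so it inherits the ordering limitation discussed earlier.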

Future XML document design may be influenced by these or similar patterns in order to get the best of both the XML and JSON worlds.


Comments on this Article

  • Improved version
    2008-01-28 07:30:50 MichaelSchoeler [Reply]

    Hello,


    I am currently improving the nice XML/JSON conversion functions from Prof. Stefan Goessner.


    Specifically I have merged the two script files into one, made sure it validates with JSLint, and corrected two flaws in the original implementation with regards to empty value handling of strings and arrays.


    Feel free to look at the work in progress at
    http://michael.hinnerup.net/blog/2008/01/26/converting-json-to-xml-and-xml-to-json/


    Best regards,
    Michael Schøler
    Hinnerup Net ApS
    Denmark

  • Look at XSugar for bi-directional translation
    2007-06-27 02:03:10 robertmue [Reply]

    I've recently come across XSugar (http://www.brics.dk/xsugar/). It seems ideally suited for XML-JSON translation, using a single rule set which can be used in both directions.


    From the web site:
    "An XSugar specification is built around a context-free grammar that unifies the two syntaxes of a language. Given such a specification, the XSugar tool can translate from alternative syntax to XML and vice versa. Moreover, the tool statically checks that the transformations are reversible (i.e. bidirectional) and that all XML documents generated from the alternative syntax are valid according to a given XML schema."


    I have used it for some reasonably serious work, including conversion between the normal algebraic form of an equation and MathML, and it works as advertised.

  • Congratulations! Could you help me with Json-lib?
    2007-05-03 13:34:05 CassiuP [Reply]

    I am having some problems in running the Json-lib, could you please help me?

    I have noted that you could use the following command
    successfully:


    JSONArray json = (JSONArray) XMLSerializer.read(aXMLString);


    However, I am having problems running this line. I get the following error:


    Exception in thread "main" java.lang.NoClassDefFoundError: nu/xom/Serializer
    at jsonclient.JSonTest.init(JSonTest.java:47)
    at jsonclient.JSonTest.<init>(JSonTest.java:42)
    at jsonclient.Main.main(Main.java:34)



    I am in doubt about the lib files (jars) included in the project. I have included the following:

    commons-beanutils.jar
    commons-beanutils-bean-collections.jar
    commons-beanutils-core.jar
    commons-lang-2.3.jar
    commons-lang-2.3-javadoc.jar
    commons-lang-2.3-sources.jar
    ezmorph-1.0.2.jar
    json-lib-1.1-jdk15.jar
    junit.jar
    serializer.jar
    xalan.jar
    xercesImpl.jar
    xml-apis.jar

    Is some jar file missing? Which jar files have you included in your example?


    Thanks in advance!

  • I have a problem about json
    2007-04-27 02:43:29 peter.wang [Reply]

    XML
    <root>
    <dd>11</dd>
    <ff>22</ff>
    <dd>33</dd>
    </root>
    Convert to JSON
    {"root":{"dd":["11","33"],"ff":"22"}}
    Convert to XML again
    <root>
    <dd>11</dd>
    <dd>33</dd>
    <ff>22</ff>
    </root>
    Now the sequence has changed. What is the solution if I don't want the sequence to change?




  • tab is not really optional in xml2json
    2006-11-16 15:39:41 tschaub [Reply]

    Thanks for the great code. The usage on http://goessner.net/download/prj/jsonxml/ suggests that tab is optional for both functions.


    If you omit the tab argument in xml2json, you get "undefined" prepended to your json. Perhaps the return was supposed to be something like:


    return "{\n" + (tab ? tab + json.replace...


    instead of


    return "{\n" + tab + (tab ? json.replace...


    Thanks again.

  • What happens to empty elements?
    2006-10-09 12:24:03 babbage [Reply]

    Hi, great article.


    I tried out this converter with the document here:


    http://www.lisa.org/standards/tmx/tmx.html#AppSample


    It converted to JSON fine, but when I converted that back to XML, I ran into a problem: in the source document, the "header" element is empty, but it contains a closing tag (it's defined as containing optional other elements).


    The json2xml function removes the closing tag, so it's not round-trip. Any thoughts on this?

  • Interesting!
    2006-06-18 10:17:47 ThomasFrank [Reply]

    An interesting article.


    I wrote an XML to JSON converter in JavaScript about a year ago (one that doesn't use the DOM for conversion).


    It can be found here http://www.thomasfrank.se/xml_to_json.html


    I will try to compare it with this one for speed and results shortly.

    • Interesting!
      2006-06-19 10:25:38 stefan@goessner.net [Reply]

      I searched the web to reference all similar solutions. Somehow I must have overlooked your weblog .. sorry and thanks for the link.


      Interestingly you also added an XML parser to your implementation, which is purely based on regular expressions .. does it really work reliably?


      I preferred to rely on available xml parsers.


      If you do some comparisons, I would be interested in the results.


      thanks
      --
      Stefan


  • Javascript?
    2006-06-13 04:33:29 chrisbp [Reply]

    Where is the script function?

    • Javascript?
      2006-06-14 04:20:38 stefan@goessner.net [Reply]

      The JavaScript functions can be found here.
      http://goessner.net/download/prj/jsonxml/

  • #text should always work
    2006-05-31 15:49:30 bkc [Reply]

    I suggest that even for the simple element text case, e["#text"] should also always work.


    This way, references to the element's text will work whether there are attributes or not on that element.


    Programmers who are certain their xml data will *always* have a known format can choose to use the less verbose e.text reference.


    Also, an element named 'text' should take precedence over the text content of a mixed-mode element. That is, e["#text"] would reference the text part, and e.text would reference the child element named 'text'.





    • #text should always work
      2006-06-14 04:18:08 stefan@goessner.net [Reply]

      This is a good point. Unfortunately we cannot have both 'e' *and* 'e["#text"]' resulting in the same textual content.


      I agree that always having to use 'e["#text"]' to access an element's text is generally more consistent.


      Frequently programmers know their xml/json structure and want to use the less verbose 'e' form, which is what you also mentioned.


      A possible solution might be to let the user control behaviour by an additional configuration parameter.


      Please note however, that there is only an 'e.text' when element 'e' has a child element 'text'.