JSON on the Web, or: The Revenge of SML

July 5, 2006

Simon St. Laurent

Back when XML seemed all new and shiny, suggestions that it might in fact be too large, too complicated, or even slightly broken went over rather badly. The xml-dev list rang with battles over whether further simplifications were a good idea (since we'd just lost all that SGML capability), and whether a "Simple Markup Language" could even be useful. In the end, the controversy quieted. Edd Dumbill, then editor of XML.com, wrote at the end of a response piece that "doubtless the acid test for SML will be that of time." Six and a half years later, it may finally be time.

YAML Emerges from SML

The SML project faded into quiet for a while, producing Common XML (yes, I was one of the writers), a set of guidelines for using XML conservatively, while a group of SML folks found common ground with another effort, gave up on XML syntax, and produced YAML (YAML Ain't a Markup Language).

YAML files are pretty simple, using indentation (spaces only!) to identify containment, and using dashes, colons, brackets, and commas much like they are used in scripting languages. The full specification, including 23 pages of introduction and tutorial, is 85 pages long--longer than the XML specification it sought to replace, but incorporating much more information on processing and handling. The YAML developers even offer a quick reference card.

Okay, so YAML came to fruition, even though it fell completely off the radar of the more XML-obsessed. Did it go anywhere?

There are a number of implementations for a variety of languages, and YAML seems to have penetrated into Ruby thought pretty well. There's a YAML Cookbook for Ruby. YAML is built into Ruby on Rails, is used for its configuration files, and drives developers to write cheerful blog posts with titles like "YAML Rules the Config File and Data Serialization Universe."

Even Simpler

YAML clearly isn't XML, but I'm not sure that it was quite what developers had in mind when they started out looking for a Simple Markup Language. It supports a lot of structures beyond the basic, labeled nested-nodes approach described in the early SML presentations, and the 85 pages of the YAML spec include a tremendous amount of data structure functionality.

Was there an uprising among disgruntled SML followers shocked by how large YAML had turned out to be?

No--there didn't need to be one. (There was a brief thread in 2002, but that seemed to work out happily.)

As it turned out, there was a similar effort going on simultaneously, with a similar but less ambitious set of goals: to represent JavaScript data structures. Actually, the goals were a little broader, as the result is used by many languages, but there was a common thread of belief that XML was way too much for what they wanted. It also had the advantage of building right on an existing programming language, JavaScript, and could easily describe itself as JavaScript Object Notation (JSON).

JSON still supports more than the single structure SML had offered--it has two basic data structures: name-value pairs, and lists. It also offers a few data types: strings, numbers, booleans, and null. The JSON project has a few examples to show how much lighter it is than XML. Look at the JSON carefully!
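To make the two structures concrete, here is a minimal sketch in Python (chosen only for illustration; the sample text and its field names are invented, not taken from the JSON examples mentioned above). It parses a small JSON text with the standard `json` module and shows that name-value pairs arrive as a dictionary and a list arrives as a list.

```python
import json

# Hypothetical sample data, invented for this illustration.
text = (
    '{"title": "JSON on the Web", "year": 2006, '
    '"tags": ["json", "yaml", "xml"], "draft": false, "editor": null}'
)

doc = json.loads(text)

# A JSON object (name-value pairs) becomes a dict; a JSON array becomes a list.
print(type(doc).__name__)
print(type(doc["tags"]).__name__)

# Values are reached by name or by position, with no tree API in between.
print(doc["tags"][1])
```

The same text would map onto an object and an array in JavaScript; the Python types here are simply that language's closest equivalents.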

It's a subset of YAML, as it turns out. Well, not a perfect subset at first, but with the departure of JavaScript comments, it's very close. JSON can be a perfect subset of YAML, if colons and commas are followed by spaces. (The current Internet Draft states that "Insignificant white space is allowed before or after any of the six structural characters," but doesn't mandate it.) There are also lots of toolkits for using JSON in a variety of languages, so it isn't limited to JavaScript or YAML environments.
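The white space point is easy to see in practice. The sketch below, again in Python with invented sample data, serializes the same value twice: once with the `json` module's default separators, which put a space after each comma and colon, and once with compact separators. Only the spaced form meets the condition described above; whether any particular YAML parser accepts it is an assumption not checked here.

```python
import json

# Hypothetical sample data, invented for this illustration.
data = {"name": "SML", "versions": [1, 2, 3]}

# Default separators are ", " and ": ", so commas and colons are followed by spaces.
spaced = json.dumps(data)

# Compact separators drop the optional white space entirely.
compact = json.dumps(data, separators=(",", ":"))

print(spaced)
print(compact)

# To a JSON parser the white space is insignificant: both forms yield the same value.
same = json.loads(spaced) == json.loads(compact) == data
print(same)
```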

(And if you'd like a perversely exciting possible combination of technologies, explore this call for "something like RELAX NG-Compact used as a validation schema language for YAML and JSON.")

Takeoff

Both YAML and JSON are finding happy homes with growing numbers of programmers. YAML's use in Ruby on Rails may well catapult it into broader use for configuration files, as the many developers exploring Rails take what they learn there to other projects.

JSON, on the other hand, is heading directly for the territory that XML had marked as its target ten years ago: the Web. While Ajax may mean Asynchronous JavaScript and XML, the XML part doesn't actually have to be XML. Anything that can serve as a neutral data transfer format between the client and the server will work.

Ajax applications have used XML, HTML, and plain text for that communication, but JSON is coming up fast. If a server sends JSON, converting that JSON into JavaScript objects is quick and painless, and the results are easier to navigate and process than XML. JSON has the advantages of speed, of escaping the "thou shalt communicate only with thy original server" rule (with a more formal proposal claiming better security), and of good support in a number of toolkits. You can even (of course) convert back and forth between JSON and XML if you want to.
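A rough way to see the difference in navigation is to pull the same values out of a JSON payload and an XML payload. This sketch uses Python's standard `json` and `xml.etree.ElementTree` modules as stand-ins for what a browser would do in JavaScript; both payloads and all of their names are invented for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads carrying the same invented data.
json_text = '{"user": {"name": "Ada", "langs": ["Ruby", "JavaScript"]}}'
xml_text = (
    "<user><name>Ada</name>"
    "<langs><lang>Ruby</lang><lang>JavaScript</lang></langs></user>"
)

# JSON: one call yields native structures, navigated by key and index.
user = json.loads(json_text)["user"]
json_langs = user["langs"]

# XML: parse into a tree, query for elements, then extract their text.
root = ET.fromstring(xml_text)
xml_langs = [el.text for el in root.findall("./langs/lang")]

print(json_langs)
print(xml_langs)
```

Both routes end at the same list, but the JSON route needs no query step and no decision about whether a value lives in an element, an attribute, or text content.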

It's too soon to tell if JSON will overtake XML for the cases where it's most useful: data structures that don't need a tremendous amount of type information. I'm guessing XML will continue to be useful for documents, and that applications that want huge amounts of enforced structure will stick with the more sophisticated (and complicated) type structures offered by W3C XML schemas. Still, it's good to see XML getting some heavy competition after all of these years, and hopefully it'll reawaken some innovative thinking on the XML side. As Rick Jelliffe concluded in an early article on SML, "it is good to see some creativity and ingenuity at work. There is nothing wrong with that recipe."