XML.com: XML From the Inside Out

Painting by Numbers with SVG

March 15, 2000

Table of Contents

What's Wrong with SVG?
Size Doesn't Matter, Style Does
Attributes versus Elements
Micro-Parsing
The Bigger Picture

Last week XML-Deviant commented on the warm welcome that the Scalable Vector Graphics (SVG) specification was receiving from the developer community. This week we take a closer look at the technical details of SVG.

What's Wrong with SVG?

Don Park, suspecting SVG was not quite baked, lit the fires under the SVG discussion:

Now that we had a nice "In praise of SVG" thread, I think it is now time for a bit of roasting to see if any loose parts fall[s] off. If SVG is so great, I think it deserves some more peer review.

As a starting point, Park picked out the SVG "path" element. The path element defines the shape of an object using instructions similar to those of a pen plotter, e.g., move here, draw, move here, etc. Here's an example taken from the spec:

<path d="M 100 100 L 140 100 L 120 140 z"/>

(See the specification for more details.) Park wondered whether this was a proper use of XML, and invited the Working Group members to address his concerns:

Why on earth would anyone stuff this much data into a single attribute? Why not use child elements instead of M for moveto in a monster attribute? Why bother with [a] short non-intuitive attribute name like 'd' when its size of its value can be so big? How am I supposed to parametize SVG graphics if everything is hardwired?

Jon Ferraiolo, the editor of the SVG specification, posted a detailed response:

There has been some discussion [on XML-DEV] concerning SVG's <path> element and why the SVG working group decided to pack a whole bunch of information into the 'd' attribute rather than have separate elements for each path command, such as a <moveto> element. In the spirit of fostering communication, I'll attempt to address some of [these] questions.

Ferraiolo explained that the overriding concern lay with the potential size of SVG files. A verbose XML syntax not only increases the size of a file, but also imposes additional memory overheads when constructing a DOM tree.
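To make the size argument concrete, here is a minimal Python sketch that compares the compact form from the spec with an element-per-command form. The element names `moveto`, `lineto`, and `closepath` in the verbose version are assumptions made for illustration (only `<moveto>` is mentioned in the discussion, and this is not PGML's actual syntax). The sketch counts bytes and element nodes as a rough proxy for file size and DOM overhead.

```python
import xml.etree.ElementTree as ET

# Compact form, taken from the SVG specification example above.
compact = '<path d="M 100 100 L 140 100 L 120 140 z"/>'

# Hypothetical verbose form: one child element per path command.
# These element names are illustrative only.
verbose = (
    '<path>'
    '<moveto x="100" y="100"/>'
    '<lineto x="140" y="100"/>'
    '<lineto x="120" y="140"/>'
    '<closepath/>'
    '</path>'
)

def element_count(xml_text):
    """Count the element nodes a tree builder creates for this markup."""
    return sum(1 for _ in ET.fromstring(xml_text).iter())

for label, text in (("compact", compact), ("verbose", verbose)):
    size = len(text.encode("utf-8"))
    print(f"{label}: {size} bytes, {element_count(text)} element nodes")
```

For this single triangle the verbose form needs five element nodes where the compact form needs one. Real drawings contain many paths, each with far more commands, so the difference grows with the complexity of the graphic.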

Two existing languages (both available as W3C Notes), PGML and VML, were used as a basis for SVG. Ferraiolo pointed out that PGML uses a verbose syntax, whilst VML is more compact, and that the VML approach was found to be more acceptable:

The SVG working group thus had a couple of existing languages to study and present to users for feedback. Typically, PGML files would be twice as big as the corresponding VML files. Plain and simple, this size increase was determined to be unacceptable. Thus, SVG's approach to path data has turned out to be more like VML than PGML.

Ferraiolo's replies, and the resulting discussion, serve as an interesting data point in the eternal "elements versus attributes" debate.

Size Doesn't Matter, Style Does

Few contributors were convinced that the size savings earned by using attributes over elements were significant. However, "Didier's Labs" confirmed that a verbose syntax did cause overhead in the resulting DOM:

...my conclusion, up to now, is that for sophisticated SVG applications like interactive technical manuals, the size of the DOM is very very important. To have more elements would prevent more sophisticated applications.

Further concerns were raised over the limitations the syntax imposed on XSLT processing. Curt Arnold suggested making the path data a child element:

This form would be much easier to generate from XSLT since xsl:text can be used to append to text content, but there is no current capability to append to an attribute.

David Carlisle suggested that XPath extension functions for XSLT would improve the situation:

So what might be useful for XSL (since XSL seems to have been mentioned quite often in this thread) is a specification of some XPath extension functions for SVG, to break apart an SVG path and produce a node list more easily accessed by XSLT.
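Carlisle's message does not specify such a function, so the following is only a sketch of the idea, written in Python rather than as an XPath extension. The function name `path_to_nodes` and the output element names are hypothetical. It handles only the absolute `M`, `L`, and `z` commands used in the example above, with one coordinate pair per command.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical element vocabulary; no such expansion is defined by the thread.
COMMAND_NAMES = {"M": "moveto", "L": "lineto", "Z": "closepath"}

def path_to_nodes(d):
    """Split simple path data (absolute M, L and z only) into a node list."""
    nodes = []
    # Each match is a command letter followed by its raw argument text.
    for letter, args in re.findall(r"([MLZz])([^MLZz]*)", d):
        node = ET.Element(COMMAND_NAMES[letter.upper()])
        numbers = args.replace(",", " ").split()
        if numbers:
            # Assumes exactly one x/y pair per command.
            node.set("x", numbers[0])
            node.set("y", numbers[1])
        nodes.append(node)
    return nodes

path = ET.fromstring('<path d="M 100 100 L 140 100 L 120 140 z"/>')
for node in path_to_nodes(path.get("d")):
    print(ET.tostring(node, encoding="unicode"))
```

Once the path is exposed as a list of nodes, each command can be selected or rewritten individually, which is the kind of access a stylesheet author would want.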

The two requirements pull in different directions. A relatively lightweight DOM is essential to allow SVG to be manipulated within a browser. Server-side generation of SVG graphics, however, will probably make extensive use of XSLT transformations, which the compact path syntax makes harder to write.

Attributes versus Elements

Following further questions concerning the selection of attributes over elements, Jon Ferraiolo explained that the Working Group had decided that elements should contain only readable text content:

...we decided against a syntax with coordinates as character data because we wanted character data only for things that were textual in nature (such as <text> elements).

In a follow-up message, Ferraiolo commented that this would allow better search engine indexing of SVG files, and facilitate their use by screen readers for the visually impaired. Eric Bohlman thought that, with regard to the SVG syntax, accessibility was irrelevant:

Misguided attempts at accessibility are patronizing as well as wasteful. Accessibility of SVG documents to blind users will come either from accessible interfaces provided by "mainstream" SVG display tools (in which case the internal representation is completely irrelevant) or from specialized SVG-interpreting tools (which can be assumed to fully parse the SVG and probably build a DOM out of it; there's no reason to make SVG hack-parse friendly on their account).

Is this requirement really "an SVG document should display its textual content if loaded into a browser that supports only HTML"?

The last point is important and, if true, may provide the explanation for several SVG design decisions. Path data may have been pushed into attributes to avoid it being displayed in browsers that are not SVG-aware. We could be seeing "legacy" browser implementations affecting the development of future web standards, which is a definite cause for concern.
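The effect is easy to see with a small Python sketch. Here `itertext()` stands in, very roughly, for a consumer that sees only character data, such as a browser that ignores unknown tags or a naive indexer. The second document, with path data as element content, is a hypothetical alternative and not a syntax the Working Group proposed.

```python
import xml.etree.ElementTree as ET

# Path data in an attribute, as in the SVG specification.
attribute_form = (
    '<svg><path d="M 100 100 L 140 100 L 120 140 z"/>'
    '<text>Triangle</text></svg>'
)

# Hypothetical alternative: the same path data as character data.
content_form = (
    '<svg><path>M 100 100 L 140 100 L 120 140 z</path>'
    '<text>Triangle</text></svg>'
)

def visible_text(xml_text):
    """Concatenate all character data, ignoring tags and attributes."""
    return "".join(ET.fromstring(xml_text).itertext())

print(repr(visible_text(attribute_form)))
print(repr(visible_text(content_form)))
```

With the attribute form only the word "Triangle" survives. With the element-content form the coordinates leak into the text a reader, or a search index, would see.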

Micro-Parsing

The syntax for the SVG path data is not XML. This means that an SVG implementation must also include a "path data parser" to extract the relevant information. In essence, the SVG specification defines two languages: XML and "path data."
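A rough Python sketch of this two-stage parse follows. The XML parser hands back the `d` attribute as an opaque string, and a second, hand-written tokenizer is needed to recover commands and numbers. The set of command letters is taken from the later SVG 1.1 Recommendation and is an assumption with respect to the draft under discussion. The function name `parse_path_data` is hypothetical. The tokenizer is deliberately simplified: it silently skips unrecognised characters, does not validate argument counts, and does not handle arc flags written without separators.

```python
import re
import xml.etree.ElementTree as ET

# A token is either a command letter or a number; whitespace and commas
# between tokens are simply skipped by findall.
TOKEN = re.compile(
    r"([MmZzLlHhVvCcSsQqTtAa])"
    r"|([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)"
)

def parse_path_data(d):
    """Return a list of (command_letter, [numbers]) pairs."""
    commands = []
    for letter, number in TOKEN.findall(d):
        if letter:
            commands.append((letter, []))
        elif commands:
            commands[-1][1].append(float(number))
        else:
            raise ValueError("path data must start with a command letter")
    return commands

# Stage 1: the XML parser sees only a string-valued attribute.
d = ET.fromstring('<path d="M 100 100 L 140 100 L 120 140 z"/>').get("d")

# Stage 2: the micro-parser recovers the structure inside that string.
for command, numbers in parse_path_data(d):
    print(command, numbers)
```

Note that nothing in stage 2 benefits from XML tooling: validation, error reporting, and tree access for the path commands all have to be provided separately.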

The SVG Working Group recognised the need for additional utility methods to handle the path data. The specification includes custom DOM interfaces, with utility methods to manipulate information stored in the path element. Jon Ferraiolo invited developers to make additional suggestions for utility methods:

We have attempted to put a few utility methods into the SVG DOM in order to make it more convenient for scriptwriters to do their job. The ones that are in the spec now are the ones people in the working group could think of in anticipation and in advance of actually writing useful scripts. If anyone has suggestions for utility methods in the SVG DOM, please send them in.

However, Ferraiolo qualified the invitation, observing that:

...the utility methods most likely to go in are ones where the underlying algorithms are already required in order to implement features in the spec... As people have been pointing out, there is already plenty to implement.

There appears to be an interesting trade-off here. To reduce file sizes to an acceptable level, path data in SVG is expressed in a custom syntax (which in reality could be either attribute or element based). The downside is that additional implementation work is required to support this syntax. Don Park commented on the trend towards "micro-parsing":

While I can certainly live with SVG as it stands now, I am concerned over what I think is an increasing trend toward micro-parsing, which might be [a] necessary evil in some XML applications, but detracts from [the] usefulness of XML, DOM, and SAX. I think it started with CSS, which infected XML via HTML, and lately SVG.

The Bigger Picture

Stepping back from the details of SVG, it's interesting to observe this discussion in light of recent criticism of the W3C. Here, we've seen the SVG specification editor making frank statements about key design decisions within a public forum. The response was excellent. While not everyone agreed with the decisions made, it at least became clear why they were made. There are benefits to be gained from this. Len Bullard commented on the practical value of open forums:

...the XML-DEV archive and list very often serves as a source for practical design examples to groups not privileged to share the contents of the closed W3C sources. Support for XML-DEV is one means by which OASIS contributes to open forums and open software development among ALL XML developers.

While XML-DEV is far from representing the entire XML community, it's clearly an important resource, utilized by more than just its immediate membership.