W3C smiles on multimedia

December 20, 1997

Seybold Report on Internet Publishing
Vol 2, No 4

Proposes spec for synchronizing time-based media within Web pages

Last month a working group within the W3C issued its first draft of a method for synchronizing multimedia objects within HTML documents. The proposed language, the Synchronized Multimedia Integration Language (SMIL, pronounced "smile"), aims to create a straightforward way for authors to control time-based media in the context of HTML documents.

The SMIL draft can be found at

SMIL is not an attempt to specify the data formats of time-based media, such as audio, video, animations or presentations. What it does is provide a way for authors to schedule the playing of time-based media within their documents. For example, authors can indicate where a clip should begin playing and where it should stop, where it should appear in the layout, and whether it should be launched in a separate window or just replace the current window. Objects can be played in sequence or in parallel with one another.
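To make the sequence-versus-parallel idea concrete, here is a minimal sketch in Python of how a player might turn such a schedule into start and stop times. It is an illustration only, not SMIL syntax and not anything taken from the draft: the class names (Clip, Seq, Par), the schedule function, the file names and the durations are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Clip:
    """A media object referenced by URL, with a playing time in seconds (hypothetical model)."""
    url: str
    duration: float


@dataclass
class Seq:
    """Children play one after another."""
    children: List["Node"]


@dataclass
class Par:
    """Children all begin at the same moment."""
    children: List["Node"]


Node = Union[Clip, Seq, Par]


def schedule(node: Node, start: float = 0.0) -> Tuple[List[Tuple[str, float, float]], float]:
    """Return ([(url, begin, end), ...], end_time) for the tree rooted at node."""
    if isinstance(node, Clip):
        end = start + node.duration
        return [(node.url, start, end)], end

    timeline: List[Tuple[str, float, float]] = []
    if isinstance(node, Seq):
        # Each child begins where the previous one ended.
        cursor = start
        for child in node.children:
            entries, cursor = schedule(child, cursor)
            timeline.extend(entries)
        return timeline, cursor

    # Par: every child begins at `start`; the group ends with its longest child.
    end = start
    for child in node.children:
        entries, child_end = schedule(child, start)
        timeline.extend(entries)
        end = max(end, child_end)
    return timeline, end


# Hypothetical presentation: an intro sound, then slides and narration together.
presentation = Seq([
    Clip("intro.wav", 5.0),
    Par([Clip("slides.mov", 30.0), Clip("narration.wav", 25.0)]),
])
timeline, total = schedule(presentation)
for url, begin, end in timeline:
    print(f"{url}: {begin:.1f}s to {end:.1f}s")
```

In this sketch the intro occupies the first five seconds, after which the slides and the narration start together; the parallel group lasts as long as its longest member. A real player would also have to handle layout, user controls and clips whose duration is not known in advance, none of which is modeled here.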

There are also provisions for users to control the presentation by means of control buttons, such as the stop, fast-forward and rewind buttons of a video recorder. Objects can be started at specific points (e.g., fast-forward to a given frame, or open at slide five), and the playback can be controlled from within the HTML document (e.g., play in slow motion).

The multimedia objects, which are referenced by simple URL hypertext links, may be of any data type. Whether an object will play depends on the browser, plug-in or helper application.

Although the draft is likely to change, the W3C is encouraging developers to implement prototypes based on the working specification.

Both XML and BNF notations are given for the language, another signal that XML is gaining ground as an abstract notation for the structure of Web documents.

Philipp Hoschka, chair of the Synchronized Multimedia working group and editor of the SMIL specification, noted that SMIL was purposely designed to make it easy for authors to implement it with current Web authoring tools. The use of tags to describe scheduling means that embedding and controlling multimedia in Web pages can be accomplished with ordinary text editors.

Implications. The Web has already become a medium for presentations, and SMIL lays the groundwork for a new generation of hypertext-based presentation programs that combine time-based media with text and graphics without requiring scripting in Lingo.

More important, though, is SMIL’s relationship to the broadcast and entertainment industries. Set-top boxes and cable modems are still far from ubiquitous, so there is time to establish a vendor-neutral method of linking time-based media and static media together. SMIL is a milestone in this regard, establishing a framework for publishers to link audio and video to their HTML Web pages.

Of course, the SGML community proposed such a language years ago (HyTime), but once again it failed to work the Web connection to its advantage. SMIL is much less robust than HyTime, but it has representatives from Microsoft and Netscape as authors, and so is much more likely to be implemented in products that millions of people use. HyTime, for all its strengths, has proved to be so abstract that most developers have ignored it.

Also in contrast with HyTime, the W3C has made sure to invite participants from relevant industries—CD-ROM, interactive television, the Web and multimedia.

Our take. It used to be that the W3C didn’t really set standards; it merely codified a baseline from what Microsoft and Netscape had already set in their browsers. Today, that’s no longer the case: The W3C is now ahead of both Netscape and Microsoft in both its HTML and style sheet specifications. SMIL is but the latest sign that the W3C is becoming an important standard-setter. The Web is beginning to emerge as a potential rival to television and radio, and it’s encouraging to see the W3C out in front, setting up a document framework that anyone can implement, even with BBEdit.

Mark Walter