Tim Berners-Lee on the W3C's Semantic Web Activity
March 21, 2001
The World Wide Web Consortium has recently embarked on a program of development on the Semantic Web, Director Tim Berners-Lee's vision of a machine processable Web. I spoke with Berners-Lee to find out the reasons behind the new Semantic Web Activity, and how he saw it relating to the rest of the XML world.
Edd Dumbill: Why has the W3C started the Semantic Web Activity?
Tim Berners-Lee: The W3C operates at the cutting edge, where relatively new results of research become the foundations for products. Therefore, when it comes to interoperability these results need to become standards faster than in other areas. The W3C made the decision to take the lead -- and leading-edge -- in web architecture development.
We've had the Semantic Web roadmap for a long time. As the bottom layer becomes stronger, there's at the same time a large amount falling in from above. Projects from the areas of knowledge representation and ontologies are coming together. The time feels right for W3C to be the place where the lower levels meet with the higher levels: the research results meeting with the industrial needs.
ED: Before a W3C Activity can start, the members must vote for it. Why did they vote for the Semantic Web Activity?
TBL: There's always a danger when explaining why something as broad as this is important -- it's easy to pick an example which understates the case and then undermines the value. The generality is what is devastatingly valuable and excites people.
A lot of people see [the Semantic Web] as a generic solution to application integration. Those people who can remember pre-Web documentation systems saw the Web as a tool for integrating those documentation systems -- the same people see the [Semantic] Web as an integration platform for their diverse information applications, solving the N-squared problem.
The recent RDF Interest Group meeting was very exciting, because there was a strong feeling that things were coming together. The number of people solving problems with RDF application tools is increasing. Take calendaring for example; there were five people in the room working on such systems based on RDF.
There are also a lot of Members who have a serious need for ontologies. There's a clearly understood need for ontologies in a large number of industries, and a ripe need for standardization, with things like OIL and DARPA's DAML effort. We're expecting ontology work to come into W3C as a Working Group quite soon.
ED: RDF, one of the core Semantic Web technologies, has had a bad image in the past. How will you get round this?
TBL: The XML syntax has been designed to make it look like something somebody might write: this looks odd to the Knowledge Representation folks. The RDF model itself is simpler than the XML model, but the syntax which maps between them is more complex than either.
My sense from the DAML work is that people who use RDF for knowledge representation are quite happy to use angle brackets. It's a myth that RDF is more complicated, coming from the fact that the XML syntax has more than one option in an effort to make it something that an XML designer would have done.
The other thing the myth comes from is that some things were included in the RDF spec, such as the containers, which a lot of people don't need. The concepts of RDF properties and RDF Schema classes have become the basic requirements for learning. There's a possibility of reorganizing the spec to present these first.
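To illustrate the remark that the RDF model is simple and that properties and RDF Schema classes are the core concepts, an RDF graph can be treated as nothing more than a set of subject-property-value triples. The following is a minimal Python sketch under stated assumptions, not code from the interview or from any RDF toolkit: the `ex:` identifiers and the helpers `superclasses` and `instances_of` are made up for illustration, and only `rdf:type` and `rdfs:subClassOf` mirror real RDF and RDF Schema terms.

```python
# A graph is just a set of (subject, property, value) triples.
# The ex: names are hypothetical; rdf:type and rdfs:subClassOf mirror real terms.
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

graph = {
    ("ex:Meeting", RDFS_SUBCLASS_OF, "ex:Event"),
    ("ex:igMeeting", RDF_TYPE, "ex:Meeting"),
    ("ex:igMeeting", "ex:topic", "calendaring"),
}


def superclasses(graph, cls):
    """Return cls together with every class reachable via rdfs:subClassOf."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == RDFS_SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def instances_of(graph, cls):
    """Return subjects typed as cls or as any subclass of cls."""
    return {
        s for s, p, o in graph
        if p == RDF_TYPE and cls in superclasses(graph, o)
    }
```

Asking for `instances_of(graph, "ex:Event")` finds `ex:igMeeting` even though it was only declared to be an `ex:Meeting`, which is the kind of inference a class hierarchy makes possible.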
ED: But what about Perl hackers, HTML authors, etc.? How will they get to grips with RDF?
TBL: I think there will come a time when the prevalence of graph manipulation tools will be more alluring than the equivalent at the XML level. Command line tools for RDF are starting to appear now, and APIs and so on... The test is "if I decide to use RDF, what do I have to do?". There are tools now where you just write down an ontology and you can use RDF tools. And there are lots of APIs coming on.
One by one, individual people are being won over to RDF. I believe that will only continue. There were a huge number of Gopher sites. One by one people realized they could do more with the Web, as it's more powerful and generic. They moved from the tree model to a web model. Similarly, moving from XML to RDF is moving from a tree model to a web model.
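The tree-versus-web contrast can be sketched in a few lines of Python. This is an illustrative assumption, not anything described in the interview: the names in `tree`, `links_a`, and `links_b` are invented, and `parents_in_tree` and `incoming` are hypothetical helpers. In the tree, each item sits under exactly one parent; in the web model, links are independent triples, so separately written sets of links merge with a plain set union and one item can be reached from many places.

```python
# Tree model: every item lives under exactly one parent,
# like an entry in a Gopher menu or an element in an XML document.
tree = {"w3c": {"activities": {"semantic-web": {}}, "people": {"timbl": {}}}}


def parents_in_tree(tree, name, parent=None):
    """Return the parent of every occurrence of name in the nested dict."""
    found = []
    for key, children in tree.items():
        if key == name:
            found.append(parent)
        found.extend(parents_in_tree(children, name, key))
    return found


# Web model: links are (source, label, target) triples.
# Two independently authored link sets combine with a set union.
links_a = {("w3c", "hasActivity", "semantic-web"), ("timbl", "directs", "w3c")}
links_b = {("timbl", "leads", "semantic-web"), ("rdf-ig", "feedsInto", "semantic-web")}
web = links_a | links_b


def incoming(web, name):
    """Return every node that links to name."""
    return {source for source, _, target in web if target == name}
```

In the tree, `semantic-web` has the single parent `activities`; in the merged web, three different nodes point at it.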
There may be specific areas in which an open source project decides to use RDF to represent information. It's clearly starting to pick up now, and nobody thinks it's about to stop.
ED: Several of the areas the Semantic Web addresses seem to overlap with areas the W3C is pursuing such as XML Schema and XML Protocol. What's the relationship of the new Activity to these existing ones?
TBL: XML Schema is an interesting example... One of the things that XML Schema does is provide a formal model both of the schema and of the XML document. Therefore, if you have a process which takes in a document and represents it in terms of that model, you could then write a schema rule in any Semantic Web rules language.
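The idea of representing a document in terms of a model and then writing a rule over that model can be sketched roughly as follows. This is a loose illustration under several assumptions: the triples produced by the hypothetical `to_triples` are an ad hoc encoding, not the formal model XML Schema defines; the "rule" `every_event_has_date` is an ordinary Python function standing in for a rules language; and the sample document is invented.

```python
import xml.etree.ElementTree as ET

# Invented sample document, for illustration only.
doc = "<event><title>Example meeting</title><date>2001-01-01</date></event>"


def to_triples(xml_text):
    """Represent an XML document as (node, property, value) triples."""
    root = ET.fromstring(xml_text)
    triples = set()

    def walk(elem, node_id):
        triples.add((node_id, "tag", elem.tag))
        if elem.text and elem.text.strip():
            triples.add((node_id, "text", elem.text.strip()))
        for i, child in enumerate(elem):
            child_id = f"{node_id}/{i}"
            triples.add((node_id, "child", child_id))
            walk(child, child_id)

    walk(root, "n0")
    return triples


def every_event_has_date(triples):
    """A schema-like rule: each 'event' node must have a 'date' child."""
    tag = {s: o for s, p, o in triples if p == "tag"}
    children = {}
    for s, p, o in triples:
        if p == "child":
            children.setdefault(s, []).append(o)
    return all(
        any(tag[c] == "date" for c in children.get(node, []))
        for node, t in tag.items()
        if t == "event"
    )
```

Calling `every_event_has_date(to_triples(doc))` checks the constraint against the model rather than against the XML text, so the same rule would apply to any document that has been mapped into the same triples.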
XML Protocol -- I'll tell you about the way I think this'll fit together. Last year at WWW9, we heard a number of presentations on how SOAP can relate to RDF. As the XML Protocol Working Group puts together their spec, I hope they'll be able to see the opportunity for convergence. Obviously I hope they will, so that there'll be an RDF graph for every XML Protocol message. The Semantic Web can provide an underpinning for the protocols world.
ED: But aren't there areas in Working Groups where they've ignored RDF, when they could have made good use of it?
TBL: There needs to be more coordination there. It's a shame when people almost do RDF and don't quite. An example would have been WSDL. Uche Ogbuji's paper on this is excellent.
ED: I'd like to ask about the Advanced Development side of the new Activity, which aims to involve non-W3C members in the development of the Semantic Web. Isn't this openness unusual?
TBL: We always design the Activity to suit the needs of the community at the time. Examples of infrastructural work in which we did this are the HTTP, URI, and XML Signature work. We wanted the attention of the community experts, and things required wide review. More of our Activities and working groups are moving toward a more public model; XML Protocol is a perfect example.
The Semantic Web needs to be really open, as many resources for its growth are from the academic world. We need people who may at some point want to give the group the benefit of their experience, without having a permanent relationship with the Consortium.
It's not particularly novel. It's combining the RDF Interest Group with W3C internal development stuff. We need to find what the Knowledge Representation community has got that's ripe for standardization, what it hasn't, and so on. Coordination will be very important.