Social Meaning and the Cult of Tim

July 23, 2003

In a previous column, "The Social Meaning of RDF", I described a debate about the relation of formal and social meanings of RDF assertions, particularly as related to the Semantic Web. This debate about the "social meaning of RDF" is complex and encompasses a wide range of thorny issues.

Not only does the debate have many conceptual parts, but it also has many modalities. Last time I discussed it in this column I was focusing on an upcoming technical plenary session at the W3C, where the social meaning issues were on the agenda. The issue before that tech plenary was whether to strike or edit section 4 of the RDF Concepts and Abstract Syntax specification. This way of carrying on the social meaning debate was unlikely to lead to a satisfactory resolution, since it was possible to strike the problematic language without solving or addressing the substantive issues which animate the debate in the first place. Some of those issues include the following:

how or whether the meaning of URIs (when used, for example, as RDF predicates) is defined and defined authoritatively;
how one asserts, or refrains from asserting, RDF statements;
how one specifies the meaning of an RDF graph, which presupposes some position on the relation of an RDF graph's formal meaning, social meaning, and social meaning of its "formal entailments";
whether the social meaning of RDF assertions is a function of the intention of the speaker ("speaker meaning") or is a function of the meaning of the assertion itself ("sentence meaning");
whether "publishing" RDF is sufficient to obligate someone (anyone?) to its formal or social meanings; if so, who does it obligate and what sort of obligation is it; and, further, what acts or omissions constitute a party as the "publisher" or as the "asserter" of some RDF;
what relation there is between RDF's meaning, whether social or formal, and legal contexts and considerations.

In other words, all the really easy stuff....

The participants of the plenary session reached a broad consensus, which consisted of four points, two of which are especially relevant here: first, that section 4 of the Concepts document would be struck; second, that the Semantic Web Coordination Group (SWCG) would "prioritize work on this issue, coordinated with the TAG over URI denotation". As has become clear, however, there seems to have been some ambiguity about the substance of this consensus, particularly related to the role of the SWCG, about which more below.

Apparently in response to the tech plenary and to SWCG discussions, Tim Berners-Lee recently proposed the social meaning cluster as a new TAG issue. Dan Connolly says that the SWCG kicked the issue up to the TAG because it "didn't see a way to specify how this works for RDF without specifying how it works for the rest of the Web at the same time".

There have been at least two different kinds of objections to Berners-Lee's request that the TAG take on the social meaning issue (though, to be fair, I'm not entirely sure what precisely Berners-Lee intended the TAG to consider, given that his message is confusing and seems to have been written haphazardly). The first kind is procedural: that Berners-Lee's actions are not fully responsive to the tech plenary consensus. The second kind is substantive: that Berners-Lee's statement of the social meaning problem in his message to the TAG is simply wrong and misleading.

Substantive Questions

Pat Hayes -- who, it must be pointed out, is an important player not only in the Semantic Web effort but also in the recent history of knowledge representation and artificial intelligence research -- voiced the most sustained substantive criticism of Berners-Lee's presentation of the social meaning issue. In Hayes's estimation, that presentation contained several falsehoods: "To make authoritative assertions of propositions which are clearly or provably false", Hayes claimed, "does not make them true; it only destroys public trust in the source making the silly assertions".

What are some of these falsehoods, according to Hayes? First, Berners-Lee says that "each URI identify [sic] one thing ("Resource": concept, etc)". But if, as Hayes points out, "identify" means reference or denotation, then "it is simply untenable to claim that all names identify one thing."

Berners-Lee also claims that his understanding of the Semantic Web architecture "allows information to be published so that the recipient of an RDF statement 'S P O' [i.e., a subject, predicate, object RDF triple] can, by dereferencing P, get information about the relation being asserted." Hayes's response is worth quoting in full:

Wrong. First, there is no particular semantic importance attached to the P part of the triple. Properties have no special status in RDF. Second, the relation is not being asserted: the triple is. Third, there is no particular reason why dereferencing P will get you to the information you might require in order to draw the appropriate conclusions; and indeed most RDF applications would not work if this were an architectural requirement. Finally, this conclusion does not follow from the architectural points made previously.

Hayes makes other substantive criticisms of Berners-Lee's position, and I suggest that interested parties study his message (and the entire thread in which it occurs) carefully. Hayes finished his criticism of Berners-Lee by striking a note to which I will return below, one about the clash of professional competencies, expectations, and standards:

Maybe, if I could make the suggestion without seeming to commit lese-majesty, it would be a good strategy for the W3C, rather than trying to render nonsense "in terms that the ontology community will understand," to ask if it might possibly learn something from actually listening to the ontology community; or at any rate, to anyone with a grasp of basic 20th-century results in linguistic semantics.

In response to Hayes's overarching claim about substantive falsehoods, Berners-Lee said that he was "using English, not model theory. You [i.e., Hayes] use model theory words, but it may be that model theory can't express the English talking about for example the real world." In response to Hayes's claim that there is ambiguity (and falsity on all possible readings of that ambiguity) in the claim that "each URI identify [sic] one thing", Berners-Lee said that "... I am describing, if you like, a perfect platonic design, to which we can aspire, though social and engineering factors limit our ability to implement it perfectly ... One deals with deviations from the perfect in a form of perturbation theory." And in response to Hayes's claim that the predicate element of an RDF triple is not semantically special, Berners-Lee retorts that "p is associated with the relation. I had understood that the semantics of s p o were the relation R(s,o) where R is identified by p. Did I get it the wrong way around?"
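The reading Berners-Lee describes in that last quotation can be made concrete with a small sketch. The Python below is illustrative only and comes from neither party's messages nor from any RDF specification: the example URIs, the Triple type, and the relation_extension and holds helpers are all hypothetical names chosen here. It models a graph as a set of triples and treats the predicate URI as naming a binary relation R, so that "s p o" is read as R(s, o).

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object (plain strings here)."""
    s: str
    p: str
    o: str


# Hypothetical example data; these URIs are made up for illustration.
KNOWS = "http://example.org/knows"
ALICE = "http://example.org/alice"
BOB = "http://example.org/bob"
CAROL = "http://example.org/carol"

GRAPH = {
    Triple(ALICE, KNOWS, BOB),
    Triple(BOB, KNOWS, CAROL),
}


def relation_extension(graph, p):
    """Return the set of (s, o) pairs the graph asserts under predicate p."""
    return {(t.s, t.o) for t in graph if t.p == p}


def holds(graph, p, s, o):
    """Read the triple 's p o' as R(s, o), where R is the relation named by p."""
    return (s, o) in relation_extension(graph, p)
```

Under these assumptions, holds(GRAPH, KNOWS, ALICE, BOB) is true and holds(GRAPH, KNOWS, BOB, ALICE) is false. The sketch deliberately says nothing about what dereferencing the predicate URI would return, nor about whether a URI picks out exactly one thing; those are the points in dispute.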

In general Berners-Lee's responses suggest, when they do not outright claim, that Hayes is simply unfamiliar with the merits of engineering fixes to the conceptual problems he identifies. Berners-Lee suggests this in a series of statements, which to my ear have an unhelpfully arrogant tone, in his response to Hayes:

We will always be a challenge for those of you who make these precise theories.

It is maybe from working with these [operators like multiplication and addition], and with the well-known and quite non-URI-like properties of natural language words, that you may have become blind to the advantages of an architecture where we say "This system is different from natural language: we design it such that each URI identifes (doenotes?) [sic] one and only one concrete thing in the real world or one and only one globally shared concept".

...we are not analyzing a world, we are building it. We are not experimental philosophers, we are philosophical engineers. We declare "this is the protocol". When people break the protocol, we lament, sue, and so on. But they tend to stick to it because we show that the system has very interesting and useful properties (emphasis added).

The architecture...defines an "authoritative" or "definitive" meaning, to which "meaning" in wittgensteinian sense and "intended menaing" in [an] ethical or legal sense generally approach as closely as they can, and close enough for the system to work and be unbelievably useful to millions of people.

We are building a new system. We can design it differently from existing linguistic systems. Toto, we are not in Kansas any more.

Hayes's response to Berners-Lee's response clarifies some of the disputed issues. As to Berners-Lee's equivocation that URIs "uniquely identifying one thing" is an ideal which can only be imperfectly approximated in actual systems, Hayes's response is very blunt:

I'm not saying that the "unique identification" condition is an unattainable ideal: I'm saying that it doesn't make sense, that it isn't true, and that it could not possibly be true. I'm saying that it is crazy.

Procedural Questions

In addition to these and other substantive issues, there are procedural problems floating around. I have concluded, after studying the relevant documents and speaking with participants, that the consensus reached at the tech plenary was underspecified. There are at least three procedural issues: first, does Berners-Lee's making a request to the TAG to consider the social meaning issue accurately reflect the tech plenary consensus (this is distinct from the substantive issue of whether the content of Berners-Lee's message is either adequate or fair to those who do not share his position); second, is the TAG really the appropriate setting in which to resolve this issue; third, is there something broken about the W3C's institutional or moral authority?

The Tech Plenary Consensus

Part of the tech plenary's consensus was to ask the Semantic Web Coordination Group to do something about the social meaning issue. It isn't clear what, precisely, the SWCG was supposed to do. Some of the interested parties think that the SWCG was supposed to determine the priority of this issue; others, that it was supposed to work on the issue itself, giving it a high priority. Yet others think that the SWCG was supposed to give a high priority to forming some kind of group to determine whether anything could be said about the issue that would command consensus and, if so, what.

Berners-Lee's request to the TAG is that "a draft finding be written which pulls this together [i.e., presumably, his view of what needs to be done about the social meaning issue], with elaborations pointing into the various specs. Members of the SWCG have volunteered and some members of the SWCG have been volunteered to read early versions." Note that he doesn't say who or what might author this draft finding. And, as we have seen, using Berners-Lee's disputed reading of these issues as the basis for a draft finding is problematic.

Is the TAG Appropriate?

I'm not sure the TAG is the right venue or context within which to decide this issue. There is, however, at least one reason for the TAG having some hand in the resolution, namely, that the SWCG claims that the problems of URIs and resources are not specific to RDF, but touch the existing Web. That sort of large architectural issue -- though it's not clear that this is an architectural issue as much as it is a problem of formal specification -- is certainly within the TAG's purview.

There are at least two reasons why the TAG may not be the ideal place to resolve the social meaning debate. First, most, if not all, of the TAG members are software engineers of one kind or another. The debate about social meaning is in some ways a debate between at least two distinct kinds of computer professional: the software engineers (who make a legitimate claim to have built the Web in the first place) and the knowledge representation theorists (who make a legitimate claim to have built many knowledge representation systems of a kind analogous to the one which the Semantic Web is meant to become). Not only do these groups have different methods, backgrounds, and modes of argument and discourse, but they also have divergent expectations and standards about formalisms, formal systems, and the like.

Giving the issue over to the TAG to resolve entirely seems like a bit of procedural game-rigging, whether intentional or not. Of course someone will object at this point that the TAG is open to input from people other than its members. That's true, but that fact may not be enough to ensure that the process of resolving this debate is substantially open and fair and, as importantly, is seen to be substantially open and fair.

Sandro Hawke makes something like this procedural point when he says,

There has been a process question here, which we might express as a continuum. At one end, the W3C could convene a series of Semantic Web Design Workshops to try to come up a viable design for the Semantic Web. At the other end, it could go with [Berners-Lee's] intuition that the Semantic Web is deeply coherent with the existing Web, and that URIs have always been logical constant symbols. What [Berners-Lee is] asking the TAG to do here, I think, is to use his sketch as the initial straw man. Some of the motivation here is surely the fear that the other end of the spectrum is a very, very slow road. More a bog, really.

In this regard, Hawke makes an able stand-in for the concerns of the software engineers, namely, that KR theorists can formalize the process to death. But surely at the other end of this continuum there is also a danger? The KR theorists are likely to respond that, given the deep flaws in Berners-Lee's "initial straw man" proposal, the process of resolving the issue will be poisoned from the start. Or, put another way, since the Semantic Web is really meant to be a distributed, decentralized knowledge representation system, built atop the existing Web -- itself a distributed, decentralized hypermedia system -- it simply is the case that conceptual muddles, of the sort KR theorists like Pat Hayes claim Berners-Lee has fallen into, are serious and practical obstacles to realizing the Semantic Web.

There is an internally powerful bit of W3C institutional lore which is relevant here and surely serves as motivation, not only for Berners-Lee but for those who defend his "intuition." This is that hypertext and hypermedia theorists were so hung up on building perfect systems that they couldn't get something like "the Web" as we know it today built; the flip side of this legendary tale is that it took a practical, hard-minded software engineer like Berners-Lee to come along and build a practical, workable system like the Web. Hawke refers directly to this story as the "web's founding mythos," one which serves as a powerful internal motivation for W3C members and proponents.

The Cult of Tim

Weaving throughout this tangle of competing claims and understandings is the social fact of Berners-Lee's institutional and moral authority. For the record, I do not dispute that Berners-Lee deserves some measure of moral authority, owing to his role in bringing the Web to life. The question, however, is how far that moral authority should be allowed to extend and whether it implies any other special abilities or capacities.

Perhaps the most disturbing aspect of the entire social meaning debate is the degree to which people uncritically defer to Berners-Lee's "intuition" and "vision", that is, to his admittedly incompletely expressed idea about the Semantic Web. Few people think that Berners-Lee's ideas about the Semantic Web are perfectly or completely formed. Everyone, including Berners-Lee himself, agrees that they are intuitions, which implies the idea that he can see further than he can say, that he can reach further than he can grasp, at least for now.

One obvious point to make is that there are a lot of people trying to help Berners-Lee realize his intuited vision and that he wields more influence and authority over this complex process than any other single person. Perhaps that is perfectly appropriate. However, the problem arises when other people, who have less moral authority, disagree with Berners-Lee. I have heard it said several times, although few people seem willing to commit to this view publicly, that Berners-Lee should be exempt from public criticism because the realizability of the Semantic Web rests upon Berners-Lee's reputation more than upon any other single factor.

That viewpoint not only cedes far, far too much ground to legitimate moral authority, but it also threatens the credibility of the entire Semantic Web effort. No one should be exempt from public criticism. Let me say that again, clearly: neither Tim Berners-Lee nor anyone else should be exempted from fair, balanced, charitable public criticism. Having created the Web does not in itself mean that Berners-Lee is right about every issue, as he would freely admit. Nor does it mean that people who disagree should "defer to the vision" of Berners-Lee. Too many W3C groupies, hangers-on, associates, employees, and peripheral figures act as if Berners-Lee's "vision" is infallible or incorrigible. I have heard W3C people react harshly to criticism of Berners-Lee on precisely these terms. "After all", they suggest, "he did invent the Web". Playing on the aforementioned bit of institutional lore, the idea seems to be that "Tim did it once, only Tim can do it again".

I'm not going to spend much time here suggesting why that idea is flawed. I will suggest, however, that the existing Web is an exceedingly complex entity and that its realization required the work of hundreds of very smart people. Nothing as complex as the Web is ever the result of one person's work alone. The Semantic Web will be, if it can be achieved at all, significantly more complex than the existing Web. It will no more be the result of one person's efforts than the Web is. We do ourselves, the Semantic Web effort, and Tim Berners-Lee a real disservice insofar as we contravene these basic facts.