
Building a Worldwide Lexicon

May 10, 2002

Brian Jepson

This article describes a new system for networking dictionaries and translation services on the Web. Think of this as GNUtella for language services. While the system described in this article may appear to be a huge undertaking, it will be built from many relatively simple components that talk to each other via a common client/server interface based on SOAP (Simple Object Access Protocol).

This article is also a call to action. The ultimate goal of the Worldwide Lexicon (WWL) project is to improve multilingual communication, and to make language services easily accessible to a wide range of Internet applications. The potential uses for the system are extensive.

Why should you care? Simply turn on the television or read the news. These are dangerous times, in part because of poor communication between cultures, and language barriers are among the most stubborn obstacles to that communication. The Worldwide Lexicon project has the potential to reduce these barriers somewhat, by enabling people to communicate more effectively in other languages using a variety of tools.

For WWL to succeed, it will require the talents of many people, both to retrofit existing applications (such as Web dictionary servers or IM/chat clients) and to build new services based on the WWL protocol. This is a challenging project, but also an interesting and potentially valuable one. If, after reading this article, you would like to take time out to help, visit the WWL Web site at http://www.worldwidelexicon.org to learn more.

Introduction

The Worldwide Lexicon (www.worldwidelexicon.org) is an initiative to create a peer-to-peer system that allows programs and their users to automatically locate and communicate with dictionaries, encyclopedias, translation servers, and semantic networks throughout the Web.

The Worldwide Lexicon is also inspired by distributed computing projects such as SETI@home. It will allow participating dictionaries and encyclopedias to open their systems to user submissions (more on this in a moment), and to poll a large number of Internet users for definitions, translation scores, and more.

SETI@home taps idle CPUs to crunch numbers; WWL servers will tap idle Internet users to provide information, review work from other users, and so forth. WWL asks computers to do what they excel at (manage large volumes of information), and asks humans to do what they excel at (infer meaning, describe things, etc.).

The foundation of the Worldwide Lexicon is a simple protocol based on SOAP. The Worldwide Lexicon protocol defines a small set of SOAP methods that creates a common interface to build client and server applications. The protocol provides three basic services:

  • Allows client applications to automatically discover WWL servers by invoking a single SOAP method on a supernode (e.g. find WWL servers that host English-Urdu translations). This enables WWL clients to automatically locate WWL servers based on the desired language and services required.
  • Allows clients to submit queries to WWL-compliant dictionaries, encyclopedias, and semantic network servers (e.g. find synonyms for the English word "orb," find Spanish translations for the word "beach," etc.).
  • Allows clients to poll WWL servers to fetch requests for user contributions, translations, or peer review. This is one of the most interesting facets of the Worldwide Lexicon, and will be discussed later in this article.

GNUtella for Dictionaries

There are hundreds, perhaps thousands, of dictionaries, encyclopedias and translation servers scattered throughout the Web. They all perform the same basic functions. The problem is that these are mostly homegrown systems. Each dictionary has a slightly different front-end or CGI script. Consequently, all this information is fragmented and bottled up behind proprietary front-ends.

The Worldwide Lexicon solves this problem by creating a simple and easily implemented server-discovery mechanism. One of the methods defined in the WWL protocol, WWLFindServers, allows a client application to contact a supernode to request a list of currently active WWL servers for a specific language or language pair. WWL supernodes, like GNUtella directory servers, simply maintain a list of active sites (which in turn may keep lists of their peers) and the services they provide.

Implementing this in client software is easy, and requires only a few lines of code, as illustrated by the following example (written in Visual Basic using the PocketSOAP tool, which I highly recommend for SOAP novices).


' Connect to a WWL supernode (WWLsupernode holds the supernode's address)
set pf = CreateObject("pocketsoap.Factory")
set wwl = pf.CreateProxy(WWLsupernode)

' Ask for dictionary servers that can translate English teen slang into standard English
serverlist = wwl.WWLFindServers("english.adolescent","english","","dict")

That's pretty simple. These three generic lines of code allow you to locate WWL servers on the fly (for example, you could use this to build a Web browser plug-in that performs generic dictionary and encyclopedia queries). The call returns a serverlist object that contains the servers matching the search criteria, their proxy addresses, and so on.

Once you've located a WWL server, the next step is to send a query. This also requires just a few lines of code.

' Connect to the first server on the list via its WSDL address
set wwl_server = pf.CreateProxy(serverlist(1).wsdl)

' Ask it to translate the teen-slang word "rad" into standard English
results = wwl_server.WWLTranslate("english.adolescent","english","rad")

This example returns a results object which contains an array of possible translations for the search term.

Congratulations, you can now proceed to decode your teenage daughter's utterances. Or you could just as easily say WWLTranslate("english.british","english.us","cheeky"). Or WWLTranslate("english","espanol","beach"). Or, you could submit a sentence or paragraph to a full-text translation server.

Regardless of whether you want to look up a definition within a language (e.g. look up an encyclopedia entry for The Beach Boys), or translate a word or phrase between languages, or submit a full text to a machine translation server, the procedure is exactly the same. You use WWLFindServers() to hunt for a Worldwide Lexicon server that can handle your request. Then you submit a query using one of three simple functions (WWLSearchText, WWLTranslate, or WWLQuery).

Of course, you'll want to add some extra code to trap errors and to process the returned results. For example, more sophisticated WWL servers will recognize WWLQuery, which allows clients to submit SQL-like queries (for example, to search for a synonym for a word that can only be used as a noun). Others will recognize only the simpler WWLSearchText and WWLTranslate methods. But even after adding these few extra features, it is still a very simple interface.
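
In case it helps to see that fallback logic spelled out, here is a rough sketch in Python using the SOAPpy toolkit (the SOAP.py mentioned later in this article). The endpoint URL and the WWLQuery syntax are placeholders; the protocol specification defines the real ones.

from SOAPpy import SOAPProxy

wwl_server = SOAPProxy("http://dict.example.org/soap")   # placeholder endpoint

try:
    # SQL-like query: synonyms for "orb", restricted to noun senses.
    results = wwl_server.WWLQuery("english", "english",
                                  "synonyms FOR 'orb' WHERE pos = 'noun'")
except Exception:
    # Simpler servers only recognize WWLSearchText and WWLTranslate.
    results = wwl_server.WWLTranslate("english", "english", "orb")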

Building a Worldwide Dictionary

So how do you go about building a worldwide dictionary/encyclopedia without doing a colossal amount of busy work? A lot of this information is already on the Web. The problem is that each dictionary has its own front-end. So if you can convince many of these dictionary servers to support WWL, you can build a worldwide network of dictionaries without reinventing the wheel.

Upgrading a dictionary server to support the Worldwide Lexicon is nearly as simple as the previous examples. Most dictionary and encyclopedia servers allow users to submit queries via a Web form that in turn invokes a script.

Most programming and scripting languages now support SOAP via easy-to-use toolkits (e.g. SOAP.py for Python, SOAP::Lite for Perl, .NET). So upgrading your dictionary or encyclopedia server to participate in WWL is easy. All you need to do is write a new script that responds to SOAP messages instead of a Web CGI interface. You've already done the dirty work of building a database and code to query it. All you need to do is add a fancy wrapper.
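
To make the "fancy wrapper" idea concrete, here is a rough sketch in Python using SOAPpy. The lookup_translations() function is a placeholder for whatever database query code your dictionary already has; only the WWLTranslate method name comes from the WWL protocol.

from SOAPpy import SOAPServer

def lookup_translations(source_lang, target_lang, term):
    # Placeholder for the database query code your dictionary already has.
    return ["translation one", "translation two"]

def WWLTranslate(source_lang, target_lang, term):
    # The same lookup your CGI front end performs, returned as a SOAP
    # response instead of an HTML page.
    return lookup_translations(source_lang, target_lang, term)

server = SOAPServer(("0.0.0.0", 8080))
server.registerFunction(WWLTranslate)
server.serve_forever()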

Simple Read-Only WWL Servers

If you only want to provide read-only access to your server, the job is truly easy. You'll need to write two simple scripts.

  • One script calls out to one or several WWL supernodes, and uses the WWLRegister() and WWLServerStatus() methods to announce its availability. The WWLRegister() method is used to declare which language domains you are hosting. You call this once when you start your server. The WWLServerStatus() method is invoked to notify other servers when you start up, shut down, experience congestion, etc.
  • The second script responds to client requests. This script implements a handful of WWL methods to respond to queries, and replies via SOAP instead of generating an HTML document.

Since you already know how to write CGI scripts, this should be an easy afternoon project for most sites. A basic WWL server only requires five of the methods defined in the Worldwide Lexicon protocol specification.
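
The first of those two scripts (the announcement script) might look something like the following Python sketch; the second is essentially the SOAP wrapper sketched earlier. The supernode addresses and the argument lists for WWLRegister() and WWLServerStatus() are placeholders here; see the protocol specification for the real signatures.

from SOAPpy import SOAPProxy

SUPERNODES = ["http://supernode.example.org/soap"]     # placeholder addresses
MY_ENDPOINT = "http://dict.example.org/soap"           # this server's address

def announce(status="up"):
    for url in SUPERNODES:
        node = SOAPProxy(url)
        # Declare the language domains this read-only dictionary hosts...
        node.WWLRegister(MY_ENDPOINT, "english", "espanol", "dict")
        # ...and report our current status (up, down, congested, and so on).
        node.WWLServerStatus(MY_ENDPOINT, status)

if __name__ == "__main__":
    announce()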

See Also:

For more information about the WWL protocol specification, visit our Web site at www.worldwidelexicon.org. In addition to the protocol documentation, we'll also post code from dictionaries and encyclopedia servers that have implemented WWL in various languages.

And for the Truly Shiftless...

Some dictionaries and translation servers will be able to join the Worldwide Lexicon by doing absolutely nothing, simply by attracting the attention of people building WWL gateway servers. These gateways present a WWL interface to client applications and supernodes, and translate their requests into other protocols (such as the DICT TCP protocol, proprietary HTTP CGI interfaces, etc.). These gateways will even be able to relay requests to live volunteers via instant messaging (more on this in a moment).

Gateways can also be used to convert static wordlists into systems that can be queried. For example, a gateway might cache an English-Pashto wordlist published as an HTML table, parse the table into columns, and then answer WWL search requests against it. Essentially, it treats the table or tab-delimited text as a multi-column database. This trick will be useful for adding rare languages to the WWL system with minimal effort.
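
As a rough sketch of the wordlist trick, the gateway's lookup logic might be as simple as the following Python. The file name and two-column format are assumptions, and the SOAP plumbing (registering WWLTranslate with a SOAP server) is the same as in the earlier sketch.

WORDLIST = {}

def load_wordlist(path="english-pashto.txt"):
    # Treat the cached tab-delimited wordlist as a two-column database.
    for line in open(path):
        columns = line.rstrip("\n").split("\t")
        if len(columns) >= 2:
            WORDLIST[columns[0].strip().lower()] = columns[1].strip()

def WWLTranslate(source_lang, target_lang, term):
    # Answer a WWL lookup from the in-memory table; reply with an empty
    # list when the word is not in the list.
    hit = WORDLIST.get(term.strip().lower())
    return [hit] if hit else []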

See Also:

Gateway Servers (WWL and Other Dictionary Protocols)

Distributed Human Computing and the Worldwide Lexicon

Just creating a standard procedure for locating and communicating with existing dictionaries is a big improvement in its own right: applications gain access to dictionaries and semantic networks through a single, standard interface, and the process is nearly identical regardless of the language involved.

Now here is where things become really interesting. The second component of the Worldwide Lexicon is a distributed computing experiment (actually distributed human computing is a better term). Instead of tapping idle PCs to crunch numbers, this system will tap idle Internet users to contribute to the dictionaries participating in this network.

It sounds complicated, and you're probably thinking that this project will redefine the term bloatware. But one of the basic design guidelines for the Worldwide Lexicon is to keep things as simple as possible. Developers like to build useful applications, not endlessly debug code. So this, too, is a straightforward feature to implement.

Lexicon@Home Client App

First, let's look at what you need to do to build a Lexicon@Home client application. This program is actually very simple. It does three things:

  • It passively monitors your activity to sense when you appear to be casually surfing the Web (e.g. moving your mouse occasionally but not typing a great deal).
  • When it senses you are idle, and subject to your preferences (e.g. prompt me up to four times per hour), it polls one or more WWL servers to ask if there are jobs enqueued. You decide which WWL servers you want to contribute to and set these preferences in the config screen for the app.
  • If the WWL server has a job enqueued, it replies with a CGI URL. All the client app needs to do is point your browser (or an embedded mini-browser) at this URL. You fill in a short Web form, and you're done.

Sounds easy enough; let's look at a quick example.


Public Function OnClientIdle()

	' Connect to the WWL server the user has volunteered to contribute to
	set pf = CreateObject("pocketsoap.Factory")
	set wwl = pf.CreateProxy(WWLserver_uri)

	' Ask the server whether it has a job queued for us
	job = wwl.WWLRequest()

	if job.id > 0 then
		' Show the job description and let the user accept or decline it
		action = JobPendingDialog(job.message)

		if action = "ok" then
			' Fetch the job's data entry URL and open it in the mini-browser
			job_handle = wwl.WWLFetch(job.id)
			LaunchMiniBrowser(job_handle.url)
		else
			' Tell the server to put the job back in its queue
			wwl.WWLReject(job.id)
		end if

	end if

End Function

As you can see, this is pretty simple. Of course, you'd probably want to add some other bells and whistles, like the ability to poll more than one WWL server for pending jobs (maybe you're trilingual and glad to translate English-German or English-Arabic terms).

Even with embellishments, this is still a pretty straightforward application to build since its behavior is simple, and it does not require a complicated user interface (data entry to the WWL server is done through a Web form served by the target WWL site).

This can also be embedded in or bundled with another widely deployed piece of software. Instant messaging clients are a good example. Millions of people use them on a daily basis. Users typically run IM software whenever they are online. These programs also include presence awareness features (Yahoo's IM client, for example, senses when you are probably away from your PC and notifies other users of this).

If hooks to WWL were embedded in widely deployed client programs such as IM clients, smart cursors, etc., the system could reach a large user population relatively easily; we're talking millions of part-time users. (Hint to any readers who work for companies that produce such software).

So now we've demonstrated that building a Lexicon@Home client app is not a big deal. The server side of this equation must be a nightmare, or there has to be a catch somewhere, right?

Nope. If you can write CGI scripts that read and write to a database (you already did that when you built your dictionary server, right?), you know how to update your WWL server so that it can accept user contributions. There are several ways to do this. Next, we'll consider a couple of examples.

Public WWL Dictionary With Editorial Review

Let's suppose you like the idea of allowing the general public to add to your dictionary, yet you want to retain control over new submissions. Implementing this capability is also fairly easy to do.

To do this you'll need to write one SOAP script, and two conventional CGI/ASP scripts.

The SOAP script responds to the WWLRequest() and WWLFetch() methods. When a client invokes the WWLRequest() method, you reply with a message that includes:

  • A transaction number unique to that job
  • A short text message (e.g. "translate: english->spanish : bounce")

The client invokes the WWLFetch() method if the user clicks the OK button in the popup dialog box prompting him to process this request. The SOAP script responds to WWLFetch() with a CGI URL that points to the CGI script (e.g. http://www.yourserver.com/cgi-bin/wwl-post.pl?jobid=90212&randomkey=4450192353).
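
Here is a rough Python sketch of the server side of those two methods. The in-memory job queue and the exact shape of the replies are placeholders for whatever your database and SOAP toolkit provide.

import random

# An in-memory queue for illustration; a real server would use its database.
JOB_QUEUE = [
    {"id": 90212, "message": "translate: english->spanish : bounce"},
]

def WWLRequest():
    # Reply with the next pending job, or with id 0 when nothing is queued.
    if not JOB_QUEUE:
        return {"id": 0, "message": ""}
    job = JOB_QUEUE[0]
    return {"id": job["id"], "message": job["message"]}

def WWLFetch(job_id):
    # Hand back the data entry URL, keyed with a random token so the posting
    # script can match the submission to the job it came from.
    key = random.randint(1000000000, 9999999999)
    url = "http://www.yourserver.com/cgi-bin/wwl-post.pl?jobid=%d&randomkey=%d" % (job_id, key)
    return {"url": url}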

The first CGI script generates a data entry form that prompts the user to provide the requested information. Remember the Lexicon@Home client app is a dumb program. All it does is fetch jobs from your server, and point a browser to a target URL.

You decide what data entry fields to present. So this form may ask the user to fully conjugate a verb, or it may simply ask for a single text entry. You decide what works best for your user community, and what works within the constraints of your existing internal database. This CGI script stores the posted data in your database, but flags newly added records so that they do not appear in your live system.

The second CGI script is a private script that allows your editors or trusted users to pull up a list of recent contributions, and to accept or reject contributions (e.g. display 20 entries per page, with accept/reject checkboxes next to each).
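
The first of those CGI scripts might look something like this Python sketch. The field names and table layout are placeholders; the point is simply that new submissions are stored with a "pending" flag so they stay out of the live dictionary until someone approves them.

#!/usr/bin/env python
import cgi, sqlite3

form = cgi.FieldStorage()
db = sqlite3.connect("wwl.db")

# Store the contribution, flagged as pending so it stays out of the live
# dictionary until an editor (or the peer review process) approves it.
db.execute(
    "INSERT INTO entries (jobid, term, translation, status) VALUES (?, ?, ?, ?)",
    (form.getfirst("jobid"), form.getfirst("term"),
     form.getfirst("translation"), "pending"))
db.commit()

print("Content-Type: text/html\n")
print("<html><body>Thanks! Your contribution is awaiting review.</body></html>")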

So once again, this is not a colossal task. Of course, you'll need to consider some additional issues, including:

  • Figuring out the best way to incorporate user submissions into your database.
  • Assigning a time limit to jobs so that if a client doesn't respond to a job, the system adds it back to the queue.
  • Blocking or ignoring repetitive WWLRequest() calls, either due to bad client software or malicious users.
  • Blocking or ignoring submissions from IP addresses known for low quality or bogus contributions.
  • Providing private CGI scripts that allow editors to access internal data, consolidate entries (i.e. link entries for different forms of the same word), etc.
  • Allowing users to flag entries for case by case editorial review (optional).
  • Replacing a Boolean accept/reject option with a more flexible scoring system (optional, if you have many editors).

So that's easy enough to accomplish, and for most dictionaries this is probably all you need to do. Now for the really fun stuff.

Automated WWL Dictionary With User Peer Review

Now let's consider the task of building a Worldwide Lexicon server that not only allows users to contribute new listings, but also automates the process of screening and ranking new submissions. Let's imagine for the sake of this example that you are hosting a multilingual dictionary for slang and sexual terminology (perfect for use in chat applications). Human editors would be overwhelmed by the onslaught of naughty words, so you need to create an automated peer review process to reduce their workload.

The process of building this type of server is very similar to the editor-controlled example we just described. The primary difference is that this system will also prompt end users to score contributions from other users. The behavior of the Lexicon@Home client app is identical. It uses WWLRequest() and WWLFetch() to poll your server for job information and target CGI URLs that point to data entry forms.

The server application that processes the WWLRequest() and WWLFetch() methods behaves slightly differently. It will ask some users to enter data (e.g. "translate english-spanish 'woody'"), and it will ask others to score recent submissions.

Thus some users will see a Web data entry form in their client application, and others will see a form that prompts them to score or comment on a recent entry. Ideally, you will collect numerous votes for each entry so that the resulting average score is a reliable indicator of definition or translation quality.

(NOTE: It is important to dispatch requests to score entries to randomly chosen users -- this will make it harder for hostile contributors to play games with the scoring system).

This process does not need to be entirely automated; human editors can still intervene, focusing their time on entries that have ambiguous scores. For example, an entry like "'soccer' ~ 'i will not accept this tobacconist, it is broken'" will probably receive a very low score, and can be automatically filtered.

Likewise, an obviously accurate translation will receive high scores, and can be accepted without being held for editorial review. You can design your private editor's CGI script to display entries that are neither good nor bad. This will make optimal use of your limited time, without delaying most user contributions.
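
The triage logic itself can be a few lines of code. Here is a Python sketch; the 1-to-5 scale and the thresholds are placeholders you would tune for your own community.

def triage(scores, reject_below=2.0, accept_above=4.0):
    # Average the peer review scores (say, on a 1-to-5 scale) and decide
    # what happens to the entry.
    if not scores:
        return "pending"           # not enough votes yet
    average = sum(scores) / float(len(scores))
    if average < reject_below:
        return "rejected"          # e.g. the broken-tobacconist translation
    if average > accept_above:
        return "accepted"          # obviously good; no editor needed
    return "editor_review"         # ambiguous; queue it for a human editor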

WWL Dictionary With Real-Time Human Assistance (Jabber and SMS)

Let's suppose a user requests a definition for a word that is not in your WWL dictionary. Instead of replying with a record Not Found error, you can relay this query to human volunteers who are logged into a Jabber server (which in turn is linked to a Jabber-WWL gateway server).

These volunteers receive an instant message from the gateway server. If you were monitoring the chat conversation or IRC channel, you might see something like this:


wwlsrv9102: trans de-->en schadenfreude

wilson: ?

robert: ?

brian: dnd

vera: secret malicious pleasure

The gateway server listens for replies to its outgoing message. The volunteers simply reply to the message like another instant message. The gateway server uses the replies to generate a response which is sent back to the WWL client via the SOAP interface.

What's especially nifty is that the WWL client application does not need to do anything special. It simply invokes a WWLTranslate() method. In many cases, it may not know the response came from a human volunteer who was conferenced in via Jabber to provide a quick translation or definition (the only difference between this and a purely automated query is that there may be a time delay in the response).
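
For the curious, here is a rough Python sketch of the gateway's brokering logic. The lookup_local(), send_to_chat(), wait_for_reply(), and store_for_review() functions are hypothetical stand-ins for the Jabber and database plumbing; only the overall flow is the point.

# Hypothetical stand-ins for the Jabber and database plumbing.
def lookup_local(source_lang, target_lang, term):
    return []

def send_to_chat(message):
    print(message)

def wait_for_reply(timeout):
    return None

def store_for_review(source_lang, target_lang, term, reply):
    pass

def WWLTranslate(source_lang, target_lang, term):
    # Try the local dictionary first.
    hits = lookup_local(source_lang, target_lang, term)
    if hits:
        return hits
    # Otherwise relay the query to the volunteers in the chat room and wait
    # briefly for someone to answer.
    send_to_chat("trans %s-->%s %s" % (source_lang, target_lang, term))
    reply = wait_for_reply(timeout=30)
    if reply:
        # Remember the answer (subject to review) so the dictionary grows.
        store_for_review(source_lang, target_lang, term, reply)
        return [reply]
    return []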

This capability enables WWL server owners to do some really interesting things. First, it allows the dictionaries to adapt to the needs of real users, since they learn definitions and translations for queries submitted by WWL clients. Second, it creates the illusion that the dictionary is larger than it actually is. As long as there are volunteers logged in to accept real-time queries, the system will be able to broker on-the-fly translations and definitions for queries that have not yet been catalogued (or at least try to find an answer).

This will become especially interesting if developers create Jabber clients that are aware of the WWL system, and that can automatically locate WWL-Jabber gateways. For example, a multilingual chat client, upon startup, might ask you if you are willing to provide real-time translations for other users. If you answer yes, the client uses WWLFindServers() to locate a real-time WWL server for your language(s), and signs into it. If such a feature became common, this would create a large pool of human volunteers who could provide on-demand word, phrase, and full-text translations.

NOTE: This will also be easy to implement via SMS (short messaging service), although the roundtrip time may not be fast enough for real-time queries (depending on message delivery times). SMS is interesting because it will enable WWL servers to tap not just PC users, but also millions of cellular phone users worldwide, for definitions. This will be especially useful in Asia and Europe, where wireless communication is more prevalent than landline communication.

Worldwide Lexicon Applications

One of the most obvious applications for the Worldwide Lexicon is a Web browser or text editor plug-in that allows users to fetch definitions for words and phrases on the fly. This isn't really a new application; these types of plug-ins have been available for a while. The problem is that most dictionaries do not use a standard format, so each plug-in is usually tied to a specific dictionary.

The browser or text editor plug-in is useful, but it's also pretty simple to build (see earlier example), and not particularly challenging. What is interesting about WWL, especially the distributed human computing aspect of it, is that it can be used to create and maintain dictionaries with continually evolving vocabularies, and even to process documents. It also creates a dictionary and translation API that can be used in a wide range of applications (with minimal cost and effort).

Some of the most interesting applications for the Worldwide Lexicon blur the line between human and machine. The WWL system uses computers where they excel (memorizing and searching large amounts of information), and people where they excel (inferring meaning, understanding metaphors, etc.).

The WWL distributed computing project is, in a sense, a cyborg project. The computers enable WWL clients to locate and query a WWL server in a fraction of a second, yet they also create an efficient human-machine interface that taps human capabilities when they are needed.

News and Document Translation

Imagine being able to read good quality translations for popular news sources, magazines, and short stories. These translations would not be produced by an automated program, but rather by bilingual contributors throughout the world.

The Lexicon@Home client described earlier can just as easily be used to prompt users to translate full sentences, paragraphs, or short documents. Users could volunteer to translate not only words, but also chunks of text from news sites, online magazines, and other sources.

Each volunteer translates a small block of text, perhaps just a few sentences. This process, called segmentation, is often used in document translation. The new twist here is the use of a very large and open user community. With enough participants to translate and cross-check each other's work, a network of volunteer translators could process a prodigious stream of articles.

As with the earlier example, the key to making such a system work is to take a large task (in this case translating a document, perhaps an essay from a magazine), and divide it into manageable pieces that are then parceled out to many contributors in a convenient and non-obtrusive way. Each contributor is asked to translate a small block of text, something a proficient bilingual or multilingual user will be able to do with minimal effort.

These block translations are stitched together to form complete documents, and are then available to any Internet user (most of whom will not be aware the Worldwide Lexicon is involved in their translation). This information can be used to produce a directory of human translated news stories, essays, magazine articles, and other recent publications. Any Web document perceived to have value could be flagged for processing.
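
A rough Python sketch of the segmentation and stitching steps is shown below; the block size and the data structures are placeholders.

def segment(text, sentences_per_block=3):
    # Split an article into small blocks, each a comfortable amount of work
    # for one volunteer.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    blocks = []
    for i in range(0, len(sentences), sentences_per_block):
        blocks.append(". ".join(sentences[i:i + sentences_per_block]) + ".")
    return blocks

def stitch(translated_blocks):
    # translated_blocks maps block number to the translated text posted by a
    # volunteer; reassemble the pieces in order once every block has come back.
    return " ".join(translated_blocks[i] for i in sorted(translated_blocks))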

Such a system would not translate every document on the Web, nor would it guarantee perfect translations (though they would be superior to automatic translations). It would focus on Web sites or documents that have high current value, such as news stories, important essays, and other sources flagged by users. A user-driven system like Slashdot might play a role in ranking today's most interesting URLs.

Of particular importance is the fact that the output from this application is simple HTML, and will be accessible to any Web browser. Readers will not need to know anything about the Worldwide Lexicon to benefit from this service. They'll simply bookmark a site that serves as a portal to these translated articles and sites.

In addition, this application could catalog a large amount of data, specifically word and phrase translations, that can be indexed and shared with other WWL dictionary servers. So not only will a text server process full-texts, it can also catalog translations for words, phrases, and sentence fragments that can then be replicated to other WWL servers.

So you're probably thinking that this is a pretty complicated project, yet it's also easy to build, primarily because the most difficult computation (translation) is handed off to a general purpose saltwater computer (person). A system like this can be built with a collection of text processing and data entry scripts that each perform specific tasks (and communicate with each other via a shared internal database).

Twext (http://twext.cc) offers a glimpse of what such a system might look like. Twext is a prototype for a system that would translate lyrics for songs via a similar segmentation process.

Multilingual Chat

Multilingual chat is another interesting (and challenging) application for the Worldwide Lexicon. Imagine being able to use a chat program that assists you in composing messages in another language.

The chat program could do this in several ways: 1) it could use WWL to locate and query full-text translation servers (to request automatic machine translations for your incoming or outgoing texts); 2) it could use dictionary tools to assist you in composing messages in another language; and/or 3) it could use WWL-IM gateway servers to request other live users to translate a word or text.

Machine Translation

Thanks to WWL, adding machine translation to a chat program is very easy. The chat program simply invokes WWLFindServers() to fetch a list of active MT servers for a language pair, and then invokes WWLTranslate() to ask one of these servers to translate the message text.

The chat program does not need to know anything about how the MT server processes and translates text. (For more info see Using WWL To Talk To Machine Translation Servers.)
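
As a rough sketch, the chat client's translation hook might look like the following Python. The supernode address, the "mt" service tag, and the shape of the returned server list are all assumptions.

from SOAPpy import SOAPProxy

def find_mt_server(source_lang, target_lang):
    # Ask a supernode for machine translation servers covering the pair.
    supernode = SOAPProxy("http://supernode.example.org/soap")   # placeholder
    servers = supernode.WWLFindServers(source_lang, target_lang, "", "mt")
    if not servers:
        return None
    return SOAPProxy(servers[0]["wsdl"])    # assumed shape of the server list

def translate_outgoing(mt_server, text, source_lang, target_lang):
    # Run the message through the MT server before handing it to the IM layer.
    return mt_server.WWLTranslate(source_lang, target_lang, text)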

This will be very easy to implement. The only caveat is that machine translation often misinterprets words, especially slang words and metaphors. This approach should be complemented by dictionary queries, and, if needed, by queries to live agents.

NOTE: Even if automatic translations occasionally produce inaccurate results, chat is an interactive medium. Users can simply try sending the message again with different words.

Inline Dictionaries and Translation Aids

A user who understands a foreign language, but has a weak vocabulary, may use the chat client in a different mode. Instead of asking the chat client to relay messages to a full-text translation server, the program uses word/phrase dictionaries to guide the user in composing messages in another language.

What it would do is monitor you as you type. Each time you type a word or phrase that it does not know, the chat client would query a WWL server to look up possible translations for the entry. If there was a direct translation, the chat client would insert this automatically. If the word had many uses or meanings, it would force you to clarify or disambiguate your statement via a dialog box, extra keystroke, etc.
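
A rough Python sketch of that lookup loop follows. The choose_meaning() dialog is a hypothetical stand-in for the chat client's disambiguation prompt, and the cache is just a local dictionary; the only WWL call involved is WWLTranslate().

CACHE = {}

def choose_meaning(word, candidates):
    # Hypothetical stand-in for the disambiguation dialog the chat client
    # would pop up; here we just take the first candidate.
    return candidates[0]

def process_word(word, wwl_server, source_lang, target_lang):
    # Look the word up once and cache the result for the rest of the session.
    if word not in CACHE:
        CACHE[word] = wwl_server.WWLTranslate(source_lang, target_lang, word)
    candidates = CACHE[word]
    if not candidates:
        return word                    # unknown word: pass it through as-is
    if len(candidates) == 1:
        return candidates[0]           # one direct translation: substitute it
    return choose_meaning(word, candidates)   # several meanings: ask the user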

While this would be very tedious for document translation, chat is a real-time medium. Users have already adapted to chat systems by creating their own grammar and vocabularies. A cleverly designed WWL chat client would require the user to do some extra work, but not much. Users would learn to use words that have specific meanings, and to use simple word order.

While such a system would not produce perfect translations, it might suffice for informal real-time communication between users who speak different languages. It would certainly be a useful tool for people who know a language, but have a poor vocabulary. For example, I studied Latin and Spanish in school, so I have a general understanding of several European languages. My vocabulary, however, is terrible. A tool like this would allow me to communicate more effectively, and also help me to improve my vocabulary.

Other Applications

These are just two examples. What is interesting about WWL is that it will enable developers to embed dictionary and semantic net features in any program that can invoke SOAP methods.

Other ideas -- programs that automatically generate glossaries for Web sites (usually a tedious chore for Webmasters); smarter email filtering software; or even a perverted version of Microsoft's infernal paper clip that informs its user of the sexual or scatological connotations of seemingly innocent words.

In short, any application that could benefit by being able to query a dictionary or semantic net could use WWL.

This Is All Bullshit, It Will Never Work

When people first learn of this system, the initial response is usually TMMP (too many moving parts). At first glance, it does seem like a complicated system, one that is perhaps destined to collapse under its own weight.

While the system as a whole can be used to accomplish some nifty tricks, it is composed of a collection of simple elements. These elements each perform a specific task, are easy to code, and do not need to know a great deal about other components of the system. This enables developers or information providers to focus on a specific aspect of the system without worrying about what everybody else is up to. Some examples:

  • Supernodes (directory servers): These are the servers your client app talks to when it needs to find an english.adolescent --> english.oldfart dictionary. All they do is match clients up with active WWL servers that can field their query. They don't process dictionary searches. They don't process user contributions. They just say, "You're looking for an English-Urdu server? Here you go. Now move along."
  • Read-only dictionary servers: These WWL servers allow clients to perform queries, but do not accept contributions from users. They implement some methods defined in the WWL protocol, but not others. Because they do not accept user contributions, they don't need to know anything about the post procedures, the Lexicon@Home client, etc. They just accept lookup queries, and reply with WWL compliant results.
  • Gateway servers: These servers simply translate incoming SOAP/WWL requests into other protocols (e.g. DICT, Jabber, proprietary HTTP CGI), and then report the results back via the SOAP/WWL interface.
  • Lexicon@Home client: All this application does is sense when the user is apparently available to do a small amount of work, and invoke two methods on a WWL server that the user has volunteered to contribute to. If the WWL server has some work for the user, it replies with a job ID and a Web URL. The client points the user's Web browser (or its own mini-browser) at this URL. It doesn't know anything about the internal details of how the WWL server assigns jobs to users. It also doesn't know anything about a particular WWL server's data entry procedure. It just points the browser to the URL -- the code for the data entry form is served by the WWL server handling the request.
  • Read/write WWL servers with editor- or user-controlled submissions: Each Worldwide Lexicon server that allows public submissions will probably approach this slightly differently. Some will ask their users to provide detailed information (for example, to fully conjugate verbs), while others will collect simpler entries. The decisions about what data to request, and about how it is stored internally are left to each WWL server owner.
  • Client applications: Most client applications that use the Worldwide Lexicon system do not need to know how to do anything besides locate WWL servers (using the WWLFindServers method) and submit queries to them. For example, a Web browser or text editor plug-in that allows its user to fetch definitions for words or phrases doesn't need to know anything about how to post new entries to the system. A fancier client program might choose to implement these features.

WWL will work. Whether it will succeed or not is the question. Other open source and peer-to-peer projects that started as grassroots efforts have demonstrated that you don't need the backing of a corporate titan, just a community of users committed to a project.

In order to succeed, the project will require the talent and goodwill of many people. Such a system will offer some compelling benefits (especially the full-text translation applications). If you would like to learn more or contribute to the system, visit our site at www.worldwidelexicon.org.

Recommended Reading