XML.com: XML From the Inside Out

Q: I need to translate text in an XML document from English to Dutch. The XML document in question is used as a language file in vBulletin forum software. The problem is that I have to get the text to be translated out of the XML code and, after translation, back into the code in the right place.

A: This is an interesting question -- although it may seem much more interesting at first glance than it really is.

First, for the uninitiated, vBulletin is an extremely popular package for managing bulletin boards -- forums -- on web sites. (At this writing, the vBulletin site lists over 4,800 vBulletin-powered forums. And those are just the sites that have explicitly requested a listing!) Written in the likewise-popular scripting language PHP, vBulletin uses the open source MySQL database as a storage-and-retrieval back end.

This question is interesting because it asserts that the questioner really does need XML to translate from one language (English) to another (Dutch). It's roughly possible to do this, and I'll discuss how in a moment. The question may be less interesting to you if you know that vBulletin comes with a language translation feature built in; what salvages the question for our purposes, I think, is that this translation feature is XML-based.

vBulletin's Built-In Language Translation Feature

The XML in vBulletin's language management is normally hidden from view, behind a vBulletin "Administrator Control Panel" interface. (It's accessed by way of the Control Panel's Languages & Phrases -> Language Manager menu.) This simplifies the task of making language changes.

Note: Tinkering with vBulletin's language translation features is fraught with potential complication. What follows in this column enormously simplifies the process. For full information, consult the vBulletin manual's section on translation and, especially, the vBulletin.org forum for administrators and "hackers" of vBulletin forums.

One thing to understand from the start is this: what's being translated is not forum user messages, replies, and the like -- the "content" of a forum. What vBulletin can translate is the forum user interface. For instance, a typical forum interface (vBulletin's or otherwise) includes text like the following:

  • First
  • 1 day ago
  • Advanced search
  • Delete thread
  • Edit profile
  • Contact us

Such English-language words and phrases are what vBulletin can translate into other languages, such as Dutch.

The principal mechanism for translation, what's called a "language pack," is an XML document whose structure looks something like this:

<language name="English (US)"
          vbversion="[vBulletin version]" type="custom">
  <settings>
    [various options]
  </settings>
  <phrasetype name="GLOBAL">
    <phrase name="[vBulletin phrase]">[translated value]</phrase>
    [etc. - more phrases of this type]
  </phrasetype>
  <phrasetype name="BB Code Tools">
    <phrase name="[vBulletin phrase]">[translated value]</phrase>
    [etc. - more phrases of this type]
  </phrasetype>
  <phrasetype name="vBulletin Settings">
    <phrase name="[vBulletin phrase]">[translated value]</phrase>
    [etc. - more phrases of this type]
  </phrasetype>
  <phrasetype name="FAQ Title">
    <phrase name="[vBulletin phrase]">[translated value]</phrase>
    [etc. - more phrases of this type]
  </phrasetype>
  <phrasetype name="FAQ Text">
    <phrase name="[vBulletin phrase]">[translated value]</phrase>
    [etc. - more phrases of this type]
  </phrasetype>
  <phrasetype name="Control Panel Global">
    <phrase name="[vBulletin phrase]">[translated value]</phrase>
    [etc. - more phrases of this type]
  </phrasetype>
  <phrasetype name="Permissions">
    <phrase name="[vBulletin phrase]">[translated value]</phrase>
    [etc. - more phrases of this type]
  </phrasetype>
  [etc. - many more phrase types!]
</language>

Among the options covered by the settings element are the character set (specified by a charset element) and the "thousands separator" for numbers (the thousandsep element). The settings element does not, however, drive any actual translation. All of that power resides in the phrasetype elements and their phrase children.
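
To make the role of the settings element concrete, here is a minimal Python sketch that reads its children and applies the thousands separator to a number. The sample document, its values, and the helper names read_settings and format_number are assumptions made for illustration; only the charset and thousandsep element names come from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only the element names charset and thousandsep
# are taken from the article; the values are made up for illustration.
SAMPLE = """\
<language name="Example" type="custom">
  <settings>
    <charset>ISO-8859-1</charset>
    <thousandsep>.</thousandsep>
  </settings>
</language>
"""


def read_settings(root):
    """Return the children of <settings> as a {tag: text} dictionary."""
    settings = root.find("settings")
    if settings is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in settings}


def format_number(n, thousandsep):
    """Group an integer's digits in threes using the given separator."""
    return f"{n:,}".replace(",", thousandsep)


settings = read_settings(ET.fromstring(SAMPLE))
formatted = format_number(4800, settings["thousandsep"])
```

Nothing here translates anything, which is the point: the settings only adjust presentation details.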

Each phrasetype element identifies some general component of the user interface; the phrase elements -- specifically, their name attributes -- identify specific English-language phrases to be translated as indicated by the corresponding "translated values." Each phrase's name attribute is an English-language phrase, all lowercase, with underscore characters in place of spaces. For example, one phrasetype element, whose name is "Posting," has among its child phrase elements such names as "additional_options", "after_you_submit_your_message", "attach_files", "close_this_thread", and "delete_message". The English phrase "Attach Files" (which a forum user might see when composing a message for posting) is thus translated using the text content of the phrase element whose name attribute has a value of attach_files. If this text content were, for instance, the string "ABCD", then the phrase "Attach Files" would be replaced with the string "ABCD" wherever it appears in the vBulletin interface in the context of posting a message.
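
For the questioner's original problem (getting the text out, then putting the translations back in the right place), the pair of phrasetype name and phrase name can serve as a stable key. What follows is a minimal Python sketch under that assumption. The sample pack, its vbversion value, and the function names extract_phrases and apply_translations are hypothetical, not part of vBulletin, and the "translation" is a placeholder like the "ABCD" string above.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature language pack following the structure shown earlier.
SAMPLE_PACK = """\
<language name="English (US)" vbversion="3.0.0" type="custom">
  <phrasetype name="Posting">
    <phrase name="attach_files">Attach Files</phrase>
    <phrase name="delete_message">Delete Message</phrase>
  </phrasetype>
</language>
"""


def extract_phrases(root):
    """Return {(phrasetype name, phrase name): text} for every phrase."""
    phrases = {}
    for ptype in root.iter("phrasetype"):
        for phrase in ptype.findall("phrase"):
            key = (ptype.get("name"), phrase.get("name"))
            phrases[key] = (phrase.text or "").strip()
    return phrases


def apply_translations(root, translations):
    """Overwrite phrase text in place; return how many phrases changed."""
    changed = 0
    for ptype in root.iter("phrasetype"):
        for phrase in ptype.findall("phrase"):
            key = (ptype.get("name"), phrase.get("name"))
            if key in translations:
                phrase.text = translations[key]
                changed += 1
    return changed


root = ET.fromstring(SAMPLE_PACK)
english = extract_phrases(root)  # hand this mapping to a translator

# Placeholder translation; real text would come from a human translator.
translated = {("Posting", "attach_files"): "ABCD"}
count = apply_translations(root, translated)
result_xml = ET.tostring(root, encoding="unicode")
```

Phrases with no entry in the translations mapping are left untouched, so a partially translated pack stays well-formed. If a real pack wraps phrase text in CDATA sections, be aware that ElementTree reads them as plain text and writes them back escaped rather than as CDATA.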

All of which sounds rather straightforward. So what's the big deal about translating the user interface via vBulletin's built-in feature set?

The big deal is in the number of phrases, the sheer breadth of text to be translated. I did see a post about one translation effort under way, which referred to "5000+ phrases." Whatever the exact number of words and phrases to be translated, a typical vBulletin translation must be the work of numerous (mostly volunteer) individuals, fluent in both English and whatever the target language might be, spread over the course of months of work.

There aren't that many language packs available so far. (The language-pack feature has been available only with vBulletin version 3+, which has been out for less than a year. Earlier versions required significant hacking of source code and/or vBulletin templates.) That said, I did find an English-to-Dutch effort already underway, begun in October, 2003; in May, 2004, they released version 1.0 of their Dutch language pack. And that, almost certainly, will be the questioner's best bet.

The Pure-XML Option

I can think of a few cases where the above solution won't help (there may be others, of course):

  • Your version of vBulletin doesn't support the language-pack feature.
  • Your version of vBulletin supports the language-pack feature, but you don't want to (or can't) use it.
  • What you want to translate isn't the user-interface text, but the actual forum message content -- messages and replies.

Note: For information on English-to-Dutch translation in vBulletin versions earlier than 3.0, you might check this thread on the vBulletin.org community forum. The messages on the thread are themselves in Dutch, which I can't read, but the thread was pointed out to someone else who asked a similar question in November, 2001.

In any of these cases, your task is exponentially more complex. In theory, it's possible to do a word-for-word translation from one language to another; in practice, it's almost impossible: how do you handle things like irregular English verbs and other non-standard word forms? If you're translating forum content, how will you deal with misspellings? Do you need to consider "translating" in-word punctuation, such as hyphens and apostrophes, to full words in the target language? And so on.

The better approach -- the only one rooted in sanity -- is to do what vBulletin itself does: translate entire phrases. You might undertake to do so with an enormous XSLT stylesheet, which matches all possible phrases with their translations. (This is somewhat simpler if you're "just" translating user-interface text: the number of phrases may be quite large, but at least it's finite.)
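
The whole-phrase idea can be sketched in a few lines. This is Python rather than the XSLT stylesheet mentioned above, and everything in it is an assumption for illustration: the table, the placeholder strings standing in for real Dutch text, and the function names translate_phrase and translate_all. It matches complete phrases only and reports whatever it cannot match instead of guessing word by word.

```python
# Hypothetical phrase table. The placeholder values stand in for real
# translations, which would have to come from a human translator.
PHRASE_TABLE = {
    "Advanced search": "PHRASE-1",
    "Delete thread": "PHRASE-2",
    "Edit profile": "PHRASE-3",
}


def translate_phrase(text, table):
    """Return (translation, True) on a whole-phrase match, else (text, False)."""
    key = " ".join(text.split())  # collapse runs of whitespace
    if key in table:
        return table[key], True
    return text, False


def translate_all(texts, table):
    """Translate each phrase; collect the ones with no table entry."""
    translated, missing = [], []
    for text in texts:
        result, found = translate_phrase(text, table)
        translated.append(result)
        if not found:
            missing.append(text)
    return translated, missing


translated, missing = translate_all(
    ["Delete  thread", "Contact us"], PHRASE_TABLE
)
```

The list of missing phrases is the useful by-product: it tells you exactly which entries the table still needs.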

Or you might instead leverage existing work in translating an XML document from one language to another. Many of the issues in doing so are covered in Andrzej Zydron's article here on XML.com, from January, 2004. Pay special attention to his discussion (in part two of the series) of XLIFF (the XML Localization Interchange File Format). Note, for example, the correspondence between XLIFF's source and target elements and the phrase element's name attribute and its text content in vBulletin. Also read and heed what Zydron says about "fuzzy" translation.
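
To illustrate that correspondence, here is a rough Python sketch that copies each phrase into an XLIFF-style trans-unit, with the name attribute as the source and the text content as the target. It assumes the XLIFF 1.2 namespace and file/body/trans-unit layout; the sample pack, the "original" file name, the id scheme, and the function name phrases_to_xliff are hypothetical. Treat it as a starting point, not a validated XLIFF exporter.

```python
import xml.etree.ElementTree as ET

# Assumption: XLIFF 1.2 namespace; other XLIFF versions use different URIs.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", XLIFF_NS)  # serialize without an ns0: prefix

# Hypothetical miniature language pack with a placeholder translation.
SAMPLE_PACK = """\
<language name="Example" type="custom">
  <phrasetype name="Posting">
    <phrase name="attach_files">ABCD</phrase>
  </phrasetype>
</language>
"""


def q(tag):
    """Qualify a tag name with the XLIFF namespace."""
    return f"{{{XLIFF_NS}}}{tag}"


def phrases_to_xliff(pack_root, source_lang="en", target_lang="nl"):
    """Build an XLIFF element tree with one trans-unit per phrase."""
    xliff = ET.Element(q("xliff"), {"version": "1.2"})
    file_el = ET.SubElement(xliff, q("file"), {
        "original": "language-pack.xml",  # hypothetical file name
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(file_el, q("body"))
    for ptype in pack_root.iter("phrasetype"):
        for phrase in ptype.findall("phrase"):
            unit_id = f"{ptype.get('name')}/{phrase.get('name')}"
            unit = ET.SubElement(body, q("trans-unit"), {"id": unit_id})
            # source mirrors the name attribute, target the text content
            ET.SubElement(unit, q("source")).text = phrase.get("name")
            ET.SubElement(unit, q("target")).text = (phrase.text or "").strip()
    return xliff


xliff = phrases_to_xliff(ET.fromstring(SAMPLE_PACK))
xliff_text = ET.tostring(xliff, encoding="unicode")
```

Using the phrase name as the source keeps the mapping reversible: the trans-unit id is enough to find the original phrase element again when the translated file comes back.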

In any case, if you can't use vBulletin's built-in translation feature, don't expect that someone will already have a solution prepared for you. Ready yourself for many months of work!

On to Better Things

I've been writing XML.com's "XML Q&A" column for exactly four years now. In that time, the number of XML-relevant newsgroups, web sites, and other resources has exploded. No longer, as a rule, do newcomers to XML want to understand the basics of the standards; they've already got their answers online, or from any of a hundred books. The upshot is that a Q-and-A column has what might be (at best) only limited continued relevance.

Next month I'll be moving, like a hermit crab whose current quarters are starting to feel a bit tight, to the shelter of a new column focused on XML applications.

Little Back Corners

Hardcore XML veterans will infer from the word "application" a particular meaning: an XML application in this sense is also known, more commonly and informally, as a vocabulary, a dialect, a flavor -- a conceptual model that might be formalized in a DTD or XML schema. If you're new to XML, on the other hand, when you hear the word "application" you might think simply of software: parsers, editors, XSLT processors, and the like.

In the new column, I will cover both types of application: "vocabularies" -- such as vBulletin's XML-based language-translation described above -- and software that consumes or generates or otherwise uses XML in an interesting or novel way. In either case, my focus will not be on the applications you've probably already heard or read about. Don't expect me to devote much space to XML schema, say, or XSLT, or RSS, or SOAP; nor should you expect to find capsule reviews of packages such as XML Spy, Saxon, RenderX's Xep, or Macromedia's SVG Viewer. Rather, I hope to cast some light into XML's little back corners -- niches whose existence most of you may not have even suspected.

In the meantime, thanks to XML.com's editors for their support for "XML Q&A" (as well as the new column). Thanks, especially, to the people who posted their questions -- sometimes anguished, sometimes bemused -- in the newsgroups I monitored every month. And thanks, finally, to my readers, who every month gave me valuable feedback and the motivation to improve. "Q&A" wouldn't have lasted this long without any of you; I look forward to making your acquaintance all over again in the new column.



  4. XML power
    2006-07-21 10:37:02 
    I am a recent convert to the power of XML, but despite my trust in its ability to make "IT" happen, my own skills and knowledge are failing the power of XML.
    I would like to find a method of importing an XML file into Flash, but with a language option available to the user.
    Like I said, I'm failing the ability of XML, as my knowledge is failing it enormously. Sorry!