
Article: From Wiki to XML, through SGML
Subject: Please, more XML-Wiki articles
Date: 2004-03-08 08:53:27
From: Brian Ewins
Response to: Please, more XML-Wiki articles

Reading that back, it's not clear what I meant. We treated XHTML as looking like:


[markup unit...]
[text unit...]
[markup unit...]
[text unit...]
(etc)


Each text unit is a DOM "DocumentFragment" which contains at least one non-blank "Text" node at the top level, and whose leading and trailing nodes are non-blank "Text", or "Element"s that contain a non-blank "Text" node at some depth. Markup units do not contain any non-blank "Text" nodes.
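
To make those two definitions concrete, here is a minimal sketch of them as predicates over a list of sibling DOM nodes. It is not the code we actually used and it does not do the segmentation itself; it uses Python's standard xml.dom.minidom, and the names has_text, is_text_unit, is_markup_unit and children are made up for illustration.

```python
from xml.dom import minidom
from xml.dom.minidom import Node


def is_nonblank_text(node):
    """True if node is a Text node with some non-whitespace content."""
    return node.nodeType == Node.TEXT_NODE and bool(node.data.strip())


def has_text(node):
    """True if node is, or contains at any depth, a non-blank Text node."""
    if is_nonblank_text(node):
        return True
    return any(has_text(child) for child in node.childNodes)


def is_text_unit(nodes):
    """Check a run of sibling nodes against the text-unit definition."""
    if not nodes:
        return False

    def valid_edge(node):
        # Leading/trailing node: non-blank Text, or an Element holding text.
        return is_nonblank_text(node) or (
            node.nodeType == Node.ELEMENT_NODE and has_text(node)
        )

    has_top_level_text = any(is_nonblank_text(n) for n in nodes)
    return has_top_level_text and valid_edge(nodes[0]) and valid_edge(nodes[-1])


def is_markup_unit(nodes):
    """A markup unit holds no non-blank Text node at any depth."""
    return not any(has_text(n) for n in nodes)


def children(xml):
    """Hypothetical helper: top-level child nodes of a small XML snippet."""
    return list(minidom.parseString(xml).documentElement.childNodes)


if __name__ == "__main__":
    print(is_text_unit(children("<p>Hello <b>world</b>!</p>")))
    print(is_markup_unit(children("<div><br/> <hr/></div>")))
    print(is_text_unit(children("<p><b>bold</b></p>")))
```

Note the last case: the children of a p that holds only a b are neither a text unit (no top-level text) nor a markup unit (there is text further down), so the text unit has to be found one level deeper, inside the b.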


The algorithm in the previous message produces this segmentation. It captures all the text in the XHTML with as little markup as possible, and it needs no special handling for block or inline elements.

