Article:
 Non-Extractive Parsing for XML
Subject: Wrong approach
Date: 2004-05-28 16:00:36
From: Michael Maron

One simple question: why would anybody want to fight the basic ideas of XML in the first place? As far as C++, and especially C, are concerned, the answer is obvious: working with XML in the C/C++ world is not particularly convenient, so one looks for simpler alternatives to standard parsers, like the one considered in this article.


On the other hand, regular expressions can actually provide a reasonable alternative to full-scale parsing. Unfortunately, regexps are not common in C/C++ either, so this is hardly a way out.


But in Perl and Java, regexps are certainly very useful for simple XML manipulations. For example, some XML converters produce invalid XML documents, which can be transformed into valid ones using Perl or Java regexps.


Another serious alternative to XML DOM and SAX is JDBC in Java.


  • Wrong approach
    2004-05-29 11:43:18 jimmy_z

    Thanks for the posting.


    The concept of non-extractive parsing is meant to be a general-purpose alternative to extractive parsing; in other words, it sits one layer below regular expressions and is not tied to any specific language.

    • C regexps and XML binaries
      2004-05-29 21:28:26 Michael Maron

      I just did a Google search on C regexps. Apparently, there are some packages available (for example, http://www.ncbi.nlm.nih.gov/IEB/ToolBox/C_DOC/lxr/source/regexp/). Basically, this resolves the problem of simple regexp-based XML parsing.
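
      As a rough illustration of what simple regexp-based extraction in C might look like, here is a minimal sketch. It uses the POSIX <regex.h> interface (regcomp, regexec, regfree), not the package linked above, so it assumes a POSIX system. The helper name extract_element_text is hypothetical, and the sketch assumes the element has no attributes, no nested markup, and a tag name free of regexp metacharacters.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from any library): copies the text of the
 * first <tag>...</tag> element found in xml into out.
 * Returns 0 on success, -1 on no match, error, or too-small buffer.
 * Assumes: no attributes on the tag, no nested markup in the content,
 * and a tag name without regexp metacharacters. */
int extract_element_text(const char *xml, const char *tag,
                         char *out, size_t out_size)
{
    char pattern[128];
    regex_t re;
    regmatch_t m[2];   /* m[0] = whole match, m[1] = captured text */
    int n, rc = -1;

    if (out_size == 0)
        return -1;

    /* Build an extended regexp such as "<title>([^<]*)</title>". */
    n = snprintf(pattern, sizeof pattern, "<%s>([^<]*)</%s>", tag, tag);
    if (n < 0 || (size_t)n >= sizeof pattern)
        return -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, xml, 2, m, 0) == 0) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len < out_size) {
            memcpy(out, xml + m[1].rm_so, len);
            out[len] = '\0';
            rc = 0;
        }
    }

    regfree(&re);
    return rc;
}
```

      Calling extract_element_text(doc, "title", buf, sizeof buf) on a small document fills buf with the text of the first title element. Anything beyond this simple case (attributes, CDATA, nesting, entities) is where a regexp stops being a substitute for a real parser.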


      Now, as far as binaries are concerned, it is really hard to see any reasonable justification for bypassing the regular XML approach.


      The first way to deal with binaries is to keep them as regular files and refer to them by hyperlinks. Secondly, images can be kept as BLOBs in an SQL database and extracted on request.


      Best, Michael

  • C++ / .NET
    2004-05-29 06:39:35 Michael Maron

    Sure, one can use C++ to work with XML - in the .NET environment.


    Another point is that C++ and even C can certainly be used to generate new XML documents from scratch, for example with a simple sprintf(). As for extraction, no: using string manipulation functions is a really miserable solution.
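
    To make the generation side concrete, here is a minimal sketch of building a small XML fragment with printf-style formatting. It uses snprintf() rather than plain sprintf() so the output buffer cannot overflow, and it escapes the five XML special characters first; both choices are additions for safety, not something stated above. The helper names xml_escape and format_book and the book/title element layout are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dst, replacing the five XML
 * special characters with their predefined entities.
 * Returns 0 on success, -1 if dst is too small. */
int xml_escape(const char *src, char *dst, size_t dst_size)
{
    size_t used = 0;

    for (; *src; ++src) {
        char one[2] = { *src, '\0' };
        const char *rep;
        size_t len;

        switch (*src) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&apos;"; break;
        default:   rep = one;      break;
        }

        len = strlen(rep);
        if (used + len >= dst_size)   /* keep room for the terminator */
            return -1;
        memcpy(dst + used, rep, len);
        used += len;
    }

    if (used >= dst_size)
        return -1;
    dst[used] = '\0';
    return 0;
}

/* Hypothetical example document shape: one <book> with a year
 * attribute and a <title> child. Returns 0 on success, -1 if the
 * title or the result does not fit. */
int format_book(const char *title, int year, char *out, size_t out_size)
{
    char safe[256];
    int n;

    if (xml_escape(title, safe, sizeof safe) != 0)
        return -1;

    n = snprintf(out, out_size,
                 "<book year=\"%d\"><title>%s</title></book>", year, safe);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

    Without the escaping step, a title containing & or < would produce a document that is not well-formed, which is the main trap when generating XML by string formatting.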

