
Article:
 Introducing PyRXP
Subject: PyRXP XML non-conformance
Date: 2004-06-24 15:30:16
From: PaulMayer
Response to: PyRXP XML non-conformance

I use PyRXP extensively. My problem set does not contain ANY Unicode,
hence "conformance" is a non-issue. Memory footprint and speed, however,
DO matter in this case, as I am processing roughly 50,000 files with
on-disk sizes of up to 450 MB for the largest file, which contains
roughly 900,000 tags.



  • PyRXP XML non-conformance
    2004-06-29 14:47:16 Uche Ogbuji [Reply]

    As I pointed out in the article, if you are not using XML, then you don't need to deal with all the mess of angle brackets in the first place.


    I can easily process 450MB and even 900MB files 10 times faster than PyRXP can, usually by writing a dozen lines of Python. When I want to actually use XML (which is defined by the W3C spec whether you like it or not), things become vastly more complex, and thus much slower.


    If all you're using is plain text, then just use plain text. If, however, you're using XML, which *is* Unicode (it is meaningless to say "there is no Unicode in my XML data"), then use an XML tool. My main point in this article is that PyRXP is not an XML tool. Full stop.
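
    To make the "XML *is* Unicode" point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module (not PyRXP). The sample document and the variable names are made up for illustration. Every byte in the file is plain ASCII, yet a conforming parser has to hand back a Unicode string, because a character reference such as &#233; names a Unicode code point:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: every byte here is 7-bit ASCII.
doc = b'<?xml version="1.0" encoding="US-ASCII"?><name>Caf&#233;</name>'

# Sanity check that the input really contains no non-ASCII bytes.
assert all(byte < 128 for byte in doc)

root = ET.fromstring(doc)
text = root.text

# The character reference &#233; is resolved to U+00E9, so the parsed
# text is a Unicode string containing a non-ASCII character.
print(repr(text))
```

    The input being ASCII-only therefore says nothing about whether the parsed content stays within ASCII.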


    This is really not that subtle a point.


    Why people insist on mucking around with pointy brackets and attributes when all they need is a plain text CSV or INI-like format is beyond me, nor is it a question that interests me.


    --Uche


