
Article:
 Parsing RSS At All Costs
Subject: Agreed. So what's your solution?
Date: 2003-01-22 20:27:27
From: Dare Obasanjo
Response to: Agreed. So what's your solution?

It depends on what you consider to be the problem. From my perspective, the problem is websites that provide non-standards-compliant XML in their RSS feeds, while from yours it is how to consume this XML even when it does not comply with the W3C XML 1.0 Recommendation.


The solutions, from my point of view, are to pressure the sites and tools that produce invalid RSS feeds into correcting them, and to create tools like the RSS validator produced by yourself and Sam Ruby (which is an excellent contribution to the community).


The temporary benefit of being able to read ill-formed RSS feeds is outweighed by the harm caused to XML and the Web by fostering the idea that it is OK to produce and consume XML that does not conform to W3C standards. XML has been successful thus far because of the fairly strict adherence to standards by vendors, producers and consumers of XML documents. It is unfortunate that your article is attempting to undermine this even though your intentions are good.




  • Benefits and harms are not evenly distributed
    2003-01-22 21:24:36 Mark Pilgrim

    re: "The temporary benefit of being able to read ill-formed RSS feeds is outweighed by the harm caused to XML and the Web"


    The problem is that the benefit accrues to the software vendor, and is direct and immediate, but the harm falls on everyone equally, and is long-term and abstract. Direct and immediate wins every time.

  • Agreed. So what's your solution?
    2003-01-22 20:47:33 Max Daymon

    Build in functionality to report back to feeds providing garbage data. Make it easy to report to sites that their feeds are causing a problem.


    The path of silently dealing with garbage data leads to excessive amounts of development time being spent on a problem which should take virtually no time. Further, it reflects poorly on the aggregator when it does run into a feed it can't deal with. Instead of blaming the feed, users now blame the tool for not handling it.


    If I can't reasonably rely on RSS being well-formed and complying with an industry standard specification, I'm more inclined to simply remove the functionality than to enter an endless back-and-forth battle of regular expressions and garbage data.


    Put a fence at the top of the cliff, not an ambulance force at the bottom. Tools which generate problems will eventually fall from favor. All things considered, 10% failure for such a technology seems promising. There was a time when it was hardly possible to find ANY well-formed web pages.
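
The "report back" idea above could look roughly like the following. This is only a minimal sketch, assuming Python's standard-library XML parser as the strict check; the function name `check_feed`, the sample feeds, and the example URL are hypothetical and not taken from any real aggregator. It rejects a feed that is not well-formed and produces a message that could be shown to the user or sent to the feed's maintainer, rather than silently repairing the data.

```python
import xml.etree.ElementTree as ET


def check_feed(xml_text, feed_url):
    """Return None if xml_text is well-formed XML, else a short report string.

    Hypothetical helper: only well-formedness is checked here, not
    conformance to any particular RSS version.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple from the parser.
        line, column = err.position
        return (f"{feed_url}: not well-formed XML at line {line}, "
                f"column {column}: {err}")
    return None


# Illustrative feeds: the second contains an unescaped ampersand.
good = "<rss version='0.91'><channel><title>Example</title></channel></rss>"
bad = "<rss version='0.91'><channel><title>Fish & Chips</title></channel></rss>"

for name, text in (("good", good), ("bad", bad)):
    report = check_feed(text, "http://example.com/index.rss")
    print(name, "->", report or "well-formed")
```

How the report reaches the site (an email link, a form, a message in the aggregator's UI) is left open here; the point is only that the failure is surfaced and attributed to the feed instead of being hidden.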


