Last night I stated my position on well-formed Atom feeds, but I'm not sure I did a good job of explaining myself.
The point here isn't that I'm trying to save myself time by requiring well-formed Atom feeds - I've already written a forgiving RSS parser which could easily be adapted to handle Atom, so this will actually take me more development time. And while I agree 100% with the position that users don't care about well-formed XML, the point is that parsing malformed feeds will require users to care later on.
With both HomeSite and TopStyle, I've seen first-hand the problems that customers deal with due to the way different browsers handle HTML and CSS. If you're a web author, no doubt you've wasted countless hours trying to get your pages to look right in multiple browsers. Instead of pulling your hair out over cross-browser inconsistencies, wouldn't you rather have spent your time creating a great site?
So it is with Atom. If you have a blog (and if you don't, chances are you will in the next year or two), your focus is on writing interesting content. Now, what would you do if your newsfeed didn't work right in different aggregators? Perhaps one aggregator shows the right content, but your spacing is all screwed up. Or another gets your spacing right, but has HTML entities such as &amp; and &lt; spewed all over the place. And another aggregator shows your post just fine, but the link to your site doesn't work. This sort of thing already happens with RSS, and it's frustrating to blog readers and writers alike.
Would you really want to devote time to tracking down these problems? Web authors have to deal with this situation today, and it's a royal pain (in fact, I created TopStyle to help web authors solve these problems). Wouldn't it be nice if bloggers didn't have to face these same issues down the road?
Now, keep in mind that the vast majority of Atom feeds will be produced by blogging tools rather than hand-coded. By requiring well-formed Atom feeds, FeedDemon and NetNewsWire give the blogging tool creators a very strong incentive to create well-formed feeds - they don't want to deal with customer complaints about their feeds not working in aggregators. By taking this position now - when Atom is in its infancy - we can avoid the problems that have plagued web authoring, and enable both content creators and end users to focus on great writing.
Update: I've compromised on the above position.
I agree with you. XML should be well-formed. You had to do what you had to do with RSS, but somewhere it has to stop.
My question is, is there an RSS XML validation program?
Also, I have a number of clients that send out newsletters. Are there tools to create and validate RSS and/or Atom?
BTW, I really like FeedDemon. Of course I use TopStyle and HomeSite/CF Studio.
What's Next ??
Posted by: Bartee Lamar | Tuesday, January 13, 2004 at 08:03 AM
How about not being so forgiving about RSS?
I was trying to tighten up RSS, and everyone got mad at me, somehow I was interfering with their right to be expressive.
So now most people see it the same way, let's start getting more picky about what we are willing to accept as RSS.
I'll work with you on this, if you like. There is a way to do it.
Posted by: Dave Winer | Tuesday, January 13, 2004 at 08:23 AM
Thanks for commenting here, Dave. It would be hard for any aggregator to fail to parse feeds that it previously handled just fine, but perhaps there's another way to handle this? Bill Kearney once suggested that FeedDemon show a different icon for invalid feeds, and this sounds like a good way to do it. This way users could still read the content, but they (and, perhaps more importantly, the feed authors) would have a visual cue that the feed has problems. Perhaps this could provide just enough incentive for feed authors to create valid feeds without making it a problem for end users?
Posted by: Nick Bradbury | Tuesday, January 13, 2004 at 10:07 AM
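The visual-cue idea above can be sketched roughly as follows. This is only an illustration in Python, not how FeedDemon works: `FeedStatus`, `check_feed`, `icon_for`, and the icon names are all hypothetical, and the check covers XML well-formedness only, not validity against any feed specification.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedStatus:
    """Result of a well-formedness check (hypothetical type)."""
    well_formed: bool
    error: Optional[str] = None


def check_feed(xml_text: str) -> FeedStatus:
    """Try a strict XML parse and report where it failed, if it did."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        line, column = exc.position
        return FeedStatus(False, f"line {line}, column {column}: {exc}")
    return FeedStatus(True)


def icon_for(status: FeedStatus) -> str:
    """Pick an icon name for the feed list (names are made up)."""
    return "feed-ok" if status.well_formed else "feed-warning"
```

The error string carries the line and column, so a feed author who sees the warning icon has something concrete to fix, while the reader is still free to open the feed.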
http://feedvalidator.org/ will validate RSS and Atom feeds. It is a free service, and the code is open source.
+1 on visual cue of invalid feeds. Maybe an icon, click for more info, with a link to validate the feed at the feed validator, and maybe contact info culled from the feed itself, if any. I suggested this a year and a half ago: http://diveintomark.org/archives/2002/08/20/how_liberal_is_too_liberal
Now, if you could only apply that same standard to all feeds, instead of hobbling Atom right out of the gate, then we would be in agreement.
Posted by: Mark | Tuesday, January 13, 2004 at 11:42 AM
Visual cues: excellent idea!
I think keeping Atom strictly interpreted XML and not tag soup is an excellent idea. The biggest problem I foresee is all the programmers of blog apps wishing to supply Atom feeds having to learn their programming language's XML libraries to produce valid, well-formed Atom. It's not that hard. In the end, it's not really up to users - it's up to the programmers of the blog apps and how lazy they want to be. Note: I am a programmer and understand that lesser-skilled programmers will definitely flake off and generate their XML using other techniques, such as echo "", or some kind of template language, which is, IMHO, just as bad as hand-writing your (invalid) XML every time. Yes, you can write valid XML using these techniques, but you've gotta know all the rules - whereas with a good XML library, you can be sure that the XML that the library generates is well-formed.
Posted by: Brett Taylor (Glutnix) | Tuesday, January 13, 2004 at 03:55 PM
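To make the library-versus-string-pasting point concrete, here is a minimal sketch in Python using the standard `xml.etree.ElementTree` module. The function names are hypothetical, the element names are only illustrative, and the feed namespace is left out to keep the example short, so this is not a complete Atom entry.

```python
import xml.etree.ElementTree as ET


def build_entry_by_hand(title: str, link: str) -> str:
    """String pasting: nothing escapes '&' or '<' in the values."""
    return f'<entry><title>{title}</title><link href="{link}"/></entry>'


def build_entry_with_library(title: str, link: str) -> str:
    """Library serialization: special characters are escaped on output."""
    entry = ET.Element("entry")
    ET.SubElement(entry, "title").text = title
    ET.SubElement(entry, "link", {"href": link})
    return ET.tostring(entry, encoding="unicode")


def is_well_formed(xml_text: str) -> bool:
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

A post title containing an ampersand, or a link with a query string such as `?a=1&b=2`, is enough to break the hand-built version, while the library version round-trips the same values unchanged.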
I'm of two minds about this.
As a standards developer (simulations of communications stuff), I agree with wanting to encourage well formed XML. I know that RSS/Atom standards exist for a reason--to allow everyone to communicate *information*, not just data.
As a reader and feeddemon customer, I'm not happy about this. If you want to encourage well formed feeds, alert the user ("The entered feed does not conform with xml, click here to generate a message to the feed owner") and then allow the *user* to make the choice if they want to continue. I'm not "lazy", but don't penalize me by not *allowing* me to read a feed if they don't comply. Bug the feed owners/developers, *not*, I repeat, *not* the readers/users.
If product a only supports a subset of feeds, and product b supports a superset, which product am I, the user going to invest my $$ in? It has *nothing* to do with laziness on the part of the users.
Posted by: Adin | Tuesday, January 13, 2004 at 04:34 PM
Nick,
As a paying FeedDemon customer I fully support your intention to only support well-formed Atom feeds. Tightening up RSS support would be good too; however, I fear the horse may have left the gate here. The visual cue is a good idea; there's some browser on the Mac which does the same thing with invalid HTML.
Ideally, if there's an error parsing an Atom feed, FeedDemon would display a sufficiently detailed error message that the content producers could use it to fix their feeds.
Posted by: Koz | Tuesday, January 13, 2004 at 04:48 PM
re: "Ideally, if there's an error parsing an Atom feed, FeedDemon would display a sufficiently detailed error message that the content producers could use it to fix their feeds."
No, ideally, if there's an error parsing an Atom feed, FeedDemon would display a sufficiently detailed error message that the content producers could use it to fix their feeds... and then FeedDemon would display as much of the feed as possible because 99.99% of your paying customers don't give a rat's ass about this nerdy XML thing y'all keep bickering about, and they probably don't appreciate being treated as guinea pigs in this political ratfight.
Posted by: Mark | Tuesday, January 13, 2004 at 10:22 PM
Hmmm... We made our aggregator refuse to read invalid XML feeds several months ago, in the bright hope that we could change the world and make it happy. But this raised so many complaints from our users. People kept saying something like "What the heck? xxxx aggregator reads this feed without problems! Why is yours giving me an error?". So I think the developers of very-ultra-liberal RSS parsers should be held responsible for all the confusion we have in RSS feeds now.
Posted by: Andrew | Wednesday, January 14, 2004 at 08:42 AM
It's me again ;) I'm back just to say that making the feed of this blog valid could be the first move in the direction of the well-formed world ;) For now this feed does not validate (http://feedvalidator.org/check?url=http://nick.typepad.com/blog/index.rss).
Posted by: Andrew | Wednesday, January 14, 2004 at 08:51 AM
While making FeedDemon not accept invalid XML feeds is a good moral stance, I think it will make FeedDemon uncompetitive in the feed reader market.
To give an example, I tried using Opera as a web browser for a while. I quite liked it, however there are a very small number of sites that don’t work with it very well. As a result I had to keep Internet Explorer available. Seeing no point in using two browsers in parallel I uninstalled Opera.
The same will happen to FeedDemon over time. If someone else produces a more tolerant reader then that will prevail. Ultimately the Reader that reads the most Feeds will become the market leader.
Most developers will only test against one reader, just because they are either lazy, on a deadline, ignorant, etc. As a result, you will never be able to get _all_ feeds to produce well formed XML unless _all_ readers require it.
Why don't you just stick a message up saying something like "The feed you are trying to read has errors in it. Do you want FeedDemon to make a guess at what it meant? Yes / No"
Posted by: Martin Brown | Wednesday, January 14, 2004 at 09:51 AM
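The "ask before guessing" flow suggested above might look something like this. It is a sketch under loud assumptions: `load_feed` and the `ask_user` callback are hypothetical, and the only "guess" it makes is escaping bare ampersands, which is one common cause of broken feeds and nowhere near a full forgiving parser.

```python
import re
import xml.etree.ElementTree as ET
from typing import Callable, Optional

# An '&' that does not start something shaped like an entity reference.
_BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def load_feed(xml_text: str,
              ask_user: Callable[[str], bool]) -> Optional[ET.Element]:
    """Parse strictly first; guess only if the user agrees."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError as exc:
        prompt = f"The feed has errors ({exc}). Make a guess at what it meant?"
        if not ask_user(prompt):
            return None
    # The guess: escape bare ampersands, then try the strict parser again.
    repaired = _BARE_AMP.sub("&amp;", xml_text)
    try:
        return ET.fromstring(repaired)
    except ET.ParseError:
        return None
```

Well-formed feeds never trigger the prompt, so the extra step only shows up for feeds that are actually broken.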
>>Ultimately the Reader that reads the most Feeds will become the market leader
If you can make a parser that can read proper xml and decipher my grandmother's squiggles on her postcards (she's 85), what do I care how you worked your magic? Sounds like a good product. All I want to do is read.
Posted by: stylo~ | Wednesday, January 14, 2004 at 11:10 AM
Nicely put Nick.
Martin - this would remove much of the incentive for producers to generate quality feeds. Ok, a lot of tools may support tag soup but that doesn't matter, as long as one or two of the leaders, errm, take a lead.
The big selling points of RSS 2.0 have been (selective) backwards-compatibility and simplicity. Unfortunately I believe the confusion and laxity that those have encouraged have made RSS 2.0 the nail in the coffin for well-formed RSS.
Posted by: Danny | Wednesday, January 14, 2004 at 11:18 AM
Sorry, "...the last nail...".
(Note to self - use preview ;-)
Posted by: Danny | Wednesday, January 14, 2004 at 11:19 AM
Even though the web works differently today, I can't really see much danger or newfangledness in making a program more rigid about a file format. In fact, I think the current situation is quite ridiculous, and it's about time that we do something about it. We can't let producers on the web get away with sloppy content anymore.
FLAs, PDFs, PNGs are always valid - I know these are machine-generated, binary yadayada formats - but why should HTML be any different? Running a validator over hand-made markup pre-publish ought to be an integral part of web content production, IMHO. And I'm sure application developers will find good ways to do this, as soon as they realize it is the only way.
I'm hoping for Atom and later XHTML 2 to make this clear.
Posted by: Eric Wahlforss | Wednesday, January 14, 2004 at 11:34 AM
re: "PDFs are always valid." Bwahahahahaha. It's stunning how much ignorance there is in this discussion. Suffice to say that Acrobat Reader is an ultra-liberal PDF parser. The fact that you have never noticed this means that *they're doing exactly what clients are supposed to do* -- present you with the information you asked for, without bothering you with technical details of the underlying format that you don't care about anyway.
Posted by: Mark | Wednesday, January 14, 2004 at 12:32 PM
Ok, Mark, you got me. I admit I'm not an expert on file formats - I do know a bit of HTML/XML though. And I still think that the web would benefit in the long run by enforcing more rigidness - I'm not sure how much more though.
I don't see how a browser supporting nice things like inline rdf/svg/mathml/xlinks etc. will ever come to life if we don't start making our documents at least well-formed soon. The tag soup we see everywhere today could have been avoided if browsers were not as liberal, don't you agree? They could be liberal, but not as liberal as they are now. As I mentioned earlier, I think XHTML2 is a good way of restarting after the somewhat chaotic evolution of HTML/XML thus far. We've learned a great deal, and I still think future browsers should have a render mode for XHTML2 which is stricter than those of today.
Posted by: Eric Wahlforss | Thursday, January 15, 2004 at 07:53 PM
I strongly favour validation - any guesswork should be resolved by the publisher, not by the consumer.
But I can't find any XML schema for Atom?
The anarcho-syntaxists will hate Atom because they have to code well-formed XML, while validation ayatollahs like me feel insecure because we can't validate inside our XML Spy editors.
Is there any downloadable validator for Atom?
Posted by: Jan Egil Kristiansen | Friday, January 23, 2004 at 09:48 AM
I'm wondering, Nick, has your thinking evolved on this? FeedDemon 1.10 beta 2a happily accepted at least one Atom feed that was not valid according to Mark's Feed Validator. I also saw your Jan 28 comment about HTML: "I consider standards-compliance a goal, not a requirement", which seems apropos to Atom as well.
As a brand new FeedDemon customer I hate the idea that I won't be allowed to read somebody's feed because of an error I have no ability to correct. Flag the feed as invalid, shame the developer, whatever. But please don't punish the reader!
Posted by: Nelson | Tuesday, February 17, 2004 at 05:50 AM