Tuesday, January 13, 2004

Comments

I agree with you. XML should be well-formed. You had to do what you had to do with RSS, but somewhere it has to stop.

My question is, is there an RSS XML validation program?

Also, I have a number of clients who send out newsletters. Are there tools to create and validate RSS and/or Atom?

BTW, I really like FeedDemon. Of course I use TopStyle and HomeSite/CF Studio.

What's Next ??

How about not being so forgiving about RSS?

I was trying to tighten up RSS, and everyone got mad at me; somehow I was interfering with their right to be expressive.

So now most people see it the same way, let's start getting more picky about what we are willing to accept as RSS.

I'll work with you on this, if you like. There is a way to do it.

Thanks for commenting here, Dave. It would be hard for any aggregator to fail to parse feeds that it previously handled just fine, but perhaps there's another way to handle this? Bill Kearney once suggested that FeedDemon show a different icon for invalid feeds, and this sounds like a good way to do it. This way users could still read the content, but they (and, perhaps more importantly, the feed authors) would have a visual cue that the feed has problems. Perhaps this could provide just enough incentive for feed authors to create valid feeds without making it a problem for end users?

https://feedvalidator.org/ will validate RSS and Atom feeds. It is a free service, and the code is open source.

+1 on visual cue of invalid feeds. Maybe an icon, click for more info, with a link to validate the feed at the feed validator, and maybe contact info culled from the feed itself, if any. I suggested this a year and a half ago: https://diveintomark.org/archives/2002/08/20/how_liberal_is_too_liberal

Now, if you could only apply that same standard to all feeds, instead of hobbling Atom right out of the gate, then we would be in agreement.

Visual cues: excellent idea!
I think keeping Atom strictly interpreted XML and not tag soup is an excellent idea. The biggest problem I foresee is that all the programmers of blog apps who want to supply Atom feeds will have to learn their programming language's XML libraries to produce valid, well-formed Atom. It's not that hard. In the end, it's not really up to users - it's up to the programmers of the blog apps and how lazy they want to be. Note: I am a programmer and understand that lesser-skilled programmers will definitely flake off and generate their XML using other techniques, such as echo "" or some kind of template language, which is, IMHO, just as bad as hand-writing your (invalid) XML every time. Yes, you can write valid XML using these techniques, but you've gotta know all the rules - whereas with a good XML library, you can be sure that the XML the library generates is well-formed.
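The contrast drawn above between an XML library and string concatenation can be sketched in Python. This is an illustration only, not code from anyone in this thread: the function names, the sample values, and the example.org URL are hypothetical, and the namespace shown is the Atom 1.0 one, which was standardized after this discussion took place. The entry is also not a complete Atom entry; it only demonstrates escaping.

```python
import xml.etree.ElementTree as ET

# Assumption: the Atom 1.0 namespace, standardized after this thread was written.
ATOM_NS = "http://www.w3.org/2005/Atom"


def build_entry_with_library(title, link, summary):
    """Build an Atom-style <entry>; the library escapes &, <, and > for us."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", {"href": link})
    ET.SubElement(entry, f"{{{ATOM_NS}}}summary").text = summary
    return ET.tostring(entry, encoding="unicode")


def build_entry_with_strings(title, link, summary):
    """The echo/template approach: nothing is escaped unless you remember to."""
    return (
        f'<entry xmlns="{ATOM_NS}"><title>{title}</title>'
        f'<link href="{link}"/><summary>{summary}</summary></entry>'
    )


def is_well_formed(xml_text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample data containing characters that must be escaped in XML.
TITLE = "Q&A: is <b> allowed?"
LINK = "https://example.org/post?id=1&lang=en"
SUMMARY = "Fish & chips"

library_xml = build_entry_with_library(TITLE, LINK, SUMMARY)
string_xml = build_entry_with_strings(TITLE, LINK, SUMMARY)

print("library output well-formed:", is_well_formed(library_xml))
print("string output well-formed:", is_well_formed(string_xml))
```

The string version can of course be made correct by escaping every value by hand, which is the "you've gotta know all the rules" part; the library version gets the escaping right without the author having to think about it.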

I'm of two minds about this.

As a standards developer (simulations of communications stuff), I agree with wanting to encourage well-formed XML. I know that RSS/Atom standards exist for a reason--to allow everyone to communicate *information*, not just data.

As a reader and FeedDemon customer, I'm not happy about this. If you want to encourage well-formed feeds, alert the user ("The entered feed does not conform to XML, click here to generate a message to the feed owner") and then allow the *user* to choose whether to continue. I'm not "lazy", so don't penalize me by not *allowing* me to read a feed whose authors don't comply. Bug the feed owners/developers, *not*, I repeat, *not* the readers/users.

If product A only supports a subset of feeds, and product B supports a superset, which product am I, the user, going to invest my $$ in? It has *nothing* to do with laziness on the part of the users.

Nick,

As a paying FeedDemon customer I fully support your intention to only support well-formed Atom feeds. Tightening up RSS support would be good too; however, I fear the horse may have left the gate here. The visual cue is a good idea; there's some browser on the Mac that does the same thing with invalid HTML.

Ideally, if there's an error parsing an Atom feed, FeedDemon would display a sufficiently detailed error message that the content producers could use it to fix their feeds.
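A rough idea of what such an error report could look like, sketched in Python. This is not how FeedDemon works; the `FeedCheck` class and `check_feed` function are hypothetical names, and the check covers only XML well-formedness, not validity against the Atom format itself.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedCheck:
    """Result of a well-formedness check (hypothetical structure)."""
    well_formed: bool
    message: str
    line: Optional[int] = None
    column: Optional[int] = None


def check_feed(feed_text: str) -> FeedCheck:
    """Parse strictly and report where the first XML error occurs."""
    try:
        ET.fromstring(feed_text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple for the error location.
        line, column = err.position
        return FeedCheck(False, f"Feed is not well-formed XML: {err}", line, column)
    return FeedCheck(True, "Feed is well-formed XML.")


# Hypothetical broken feed: the ampersand on the second line is not escaped.
BROKEN_FEED = "<feed>\n  <title>News & views</title>\n</feed>"

result = check_feed(BROKEN_FEED)
print(result.message)
```

The line and column in the result are the kind of detail a feed author would need to find the problem, and the same result could drive the "invalid feed" icon suggested earlier in the thread.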

re: "Ideally, if there's an error parsing an Atom feed, FeedDemon would display a sufficiently detailed error message that the content producers could use it to fix their feeds."

No, ideally, if there's an error parsing an Atom feed, FeedDemon would display a sufficiently detailed error message that the content producers could use it to fix their feeds... and then FeedDemon would display as much of the feed as possible because 99.99% of your paying customers don't give a rat's ass about this nerdy XML thing y'all keep bickering about, and they probably don't appreciate being treated as guinea pigs in this political ratfight.

Hmmm... We made our aggregator refuse to read invalid XML feeds several months ago, in the bright hope that we could change the world and make it happy. But this raised so many complaints from our users. They were all saying something like "What the heck? The xxxx aggregator reads this feed without problems! Why is yours giving me an error?" So I think that the developers of ultra-liberal RSS parsers should be held responsible for all the confusion we have in RSS feeds now.

It's me again ;) I'm back just to say that making this blog's feed valid could be the first move toward a well-formed world ;) For now this feed does not validate (https://feedvalidator.org/check?url=https://nick.typepad.com/blog/index.rss).

While making FeedDemon not accept invalid XML feeds is a good moral stance, I think it will make FeedDemon uncompetitive in the feed reader market.

To give an example, I tried using Opera as a web browser for a while. I quite liked it; however, there are a very small number of sites that don’t work with it very well. As a result I had to keep Internet Explorer available. Seeing no point in using two browsers in parallel, I uninstalled Opera.

The same will happen to FeedDemon over time. If someone else produces a more tolerant reader, then that will prevail. Ultimately the reader that reads the most feeds will become the market leader.

Most developers will only test against one reader, just because they are either lazy, on a deadline, ignorant, etc. As a result, you will never be able to get _all_ feeds to produce well-formed XML unless _all_ readers require it.

Why don't you just stick a message up saying something like "The feed you are trying to read has errors in it. Do you want FeedDemon to make a guess at what it meant? Yes / No"
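The "strict first, guess only if the user says yes" flow suggested above might look roughly like this in Python. It is a sketch under stated assumptions: `read_titles` and `local_name` are hypothetical names, the `allow_guess` flag stands in for the user's Yes/No answer, and the regular-expression fallback is a deliberately crude stand-in for a real liberal parser.

```python
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def read_titles(feed_text, allow_guess):
    """Return (titles, was_well_formed).

    Tries a strict XML parse first. If that fails and allow_guess is False,
    the parse error propagates; if allow_guess is True, titles are extracted
    with a crude pattern match instead.
    """
    try:
        root = ET.fromstring(feed_text)
    except ET.ParseError:
        if not allow_guess:
            raise
        guessed = re.findall(r"<title[^>]*>(.*?)</title>", feed_text, flags=re.DOTALL)
        return [text.strip() for text in guessed], False
    titles = [
        (el.text or "").strip()
        for el in root.iter()
        if local_name(el.tag) == "title"
    ]
    return titles, True


# Hypothetical feeds: the second has an unescaped '&' and a missing </feed>.
GOOD_FEED = "<feed><title>Feed</title><entry><title>One</title></entry></feed>"
BROKEN_FEED = "<feed><title>Feed</title><entry><title>Fish & chips</title></entry>"

print(read_titles(GOOD_FEED, allow_guess=False))
print(read_titles(BROKEN_FEED, allow_guess=True))
```

Returning the `was_well_formed` flag alongside the content means the reader still gets the feed while the application keeps enough information to flag it as broken.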

>>Ultimately the Reader that reads the most Feeds will become the market leader

If you can make a parser that can read proper xml and decipher my grandmother's squiggles on her postcards (she's 85), what do I care how you worked your magic? Sounds like a good product. All I want to do is read.

Nicely put Nick.

Martin - this would remove much of the incentive for producers to generate quality feeds. Ok, a lot of tools may support tag soup but that doesn't matter, as long as one or two of the leaders, errm, take a lead.

The big selling points of RSS 2.0 have been (selective) backwards-compatibility and simplicity. Unfortunately I believe the confusion and laxity that those have encouraged have made RSS 2.0 the nail in the coffin for well-formed RSS.

Sorry, "...the last nail...".

(Note to self - use preview ;-)

Even though the web works differently today, I can't really see much danger or newfangledness in making a program more rigid about a file format. In fact, I think the current situation is quite ridiculous, and it's about time that we do something about it. We can't let producers on the web get away with sloppy content anymore.
FLAs, PDFs, PNGs are always valid. I know these are machine-generated, binary yadayada formats, but why should HTML be any different? Running a validator over handmade markup before publishing ought to be an integral part of web content production, IMHO. And I'm sure application developers will find good ways to do this, as soon as they realize it is the only way.
I'm hoping for Atom and later XHTML 2 to make this clear.

re: "PDFs are always valid." Bwahahahahaha. It's stunning how much ignorance there is in this discussion. Suffice to say that Acrobat Reader is an ultra-liberal PDF parser. The fact that you have never noticed this means that *they're doing exactly what clients are supposed to do* -- present you with the information you asked for, without bothering you with technical details of the underlying format that you don't care about anyway.

Ok, Mark, you got me. I admit I'm not an expert on file formats - I do know a bit of HTML/XML though. And I still think that the web would benefit in the long run by enforcing more rigidness - I'm not sure how much more though.
I don't see how a browser supporting nice things like inline rdf/svg/mathml/xlinks etc. will ever come to life if we don't start making our documents at least well-formed anytime soon. The tag soup we see everywhere today could have been avoided if browsers were not as liberal, don't you agree? They could be liberal, but not as liberal as they are now. As I mentioned earlier, I think XHTML2 is a good way of restarting after the somewhat chaotic evolution of HTML/XML thus far. We've learned a great deal, and I still think future browsers should have a render mode for XHTML2 which is stricter than those of today.

I strongly favour validation - any guesswork should be resolved by the publisher, not by the consumer.

But I can't find any XML schema for Atom?

The anarcho-syntaxists will hate Atom because they have to code well-formed XML. Meanwhile, validation ayatollahs like me feel insecure because we can't validate inside our XML Spy editors.

Is there any downloadable validator for Atom?

I'm wondering, Nick, has your thinking evolved on this? FeedDemon 1.10 beta 2a happily accepted at least one Atom feed that was not valid according to Mark's Feed Validator. I also saw your Jan 28 comment about HTML: "I consider standards-compliance a goal, not a requirement", which seems apropos to Atom as well.

As a brand new FeedDemon customer I hate the idea that I won't be allowed to read somebody's feed because of an error I have no ability to correct. Flag the feed as invalid, shame the developer, whatever. But please don't punish the reader!