A few months back I wrote about RSS bandwidth consumption, and the subject is in the news again following Chad Dickerson's recent InfoWorld column about his love/hate relationship with RSS. Dickerson notes that desktop RSS readers which hit a feed too frequently, and then download the feed even when it hasn't changed, are creating a huge server load.
However, as Dare Obasanjo points out, many of those complaining about RSS bandwidth consumption fail to configure their own servers to address the problem. Dare shows that InfoWorld's feed supports neither gzip encoding nor conditional HTTP GET, both of which would dramatically decrease RSS bandwidth consumption. The latest RSS reader stats show that all the major readers support these techniques, so make sure your server (and/or the feed itself) supports them too. If you have a static feed, chances are your server handles this for you, but if you have a dynamic feed (e.g., one created on-the-fly with PHP or ASP), you may need to make some changes.
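For a dynamic feed, the two techniques might look roughly like the sketch below. It is written in Python rather than PHP or ASP, and the function name, its parameters, and the choice of a SHA-256 hash for the ETag are all assumptions made for illustration, not anything InfoWorld or a particular server actually uses. It answers a repeat request with an empty 304 when the feed hasn't changed, and gzips the body when the reader says it accepts gzip.

```python
import gzip
import hashlib
from email.utils import formatdate, parsedate_to_datetime


def respond_to_feed_request(request_headers, feed_bytes, last_modified_ts):
    """Return (status, headers, body) for a dynamically generated feed.

    request_headers: dict mapping lower-cased request header names to values.
    feed_bytes: the freshly generated feed document.
    last_modified_ts: Unix timestamp of the newest item (hypothetical input).
    """
    # Weak ETag, because the same tag is sent for gzipped and plain bodies.
    etag = 'W/"' + hashlib.sha256(feed_bytes).hexdigest()[:32] + '"'
    headers = {
        "ETag": etag,
        "Last-Modified": formatdate(last_modified_ts, usegmt=True),
        "Vary": "Accept-Encoding",
    }

    # Conditional GET: If-None-Match wins over If-Modified-Since if both exist.
    if_none_match = request_headers.get("if-none-match")
    if_modified_since = request_headers.get("if-modified-since")
    not_modified = False
    if if_none_match is not None:
        # Simplification: exact string match against each listed tag.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        not_modified = etag in candidates or "*" in candidates
    elif if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since).timestamp()
            not_modified = int(last_modified_ts) <= since
        except (TypeError, ValueError):
            not_modified = False  # unparseable date: send the full feed

    if not_modified:
        return 304, headers, b""

    # gzip encoding, only when the reader advertises support for it.
    if "gzip" in request_headers.get("accept-encoding", "").lower():
        body = gzip.compress(feed_bytes)
        headers["Content-Encoding"] = "gzip"
    else:
        body = feed_bytes
    headers["Content-Length"] = str(len(body))
    return 200, headers, body
```

A reader that stores the ETag and Last-Modified values from its first response and sends them back as If-None-Match and If-Modified-Since would get the tiny 304 on every later check until the feed actually changes.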
In the past, raising this topic has been followed by naive calls to stop using desktop RSS readers in favor of web-based applications, since web-based aggregators consume less bandwidth. I'm far too biased to argue about desktop vs. web aggregators, but the argument is moot since many people find the UI and feature set of web-based apps too limiting for their needs and will always want a desktop application (witness Outlook vs. HotMail). Arguing for either type of application is pointless, since each will be around for a long time.
BTW, I'm glad to see that Sam Ruby is proposing updating the Atom spec and the feed validator to support HTTP conditional GET. My guess is that a lot of bandwidth will be saved once the feed validator warns about feeds that don't take advantage of the If-Modified-Since and If-None-Match HTTP headers.
Oh, and since I mentioned RSS reader stats, I have to get this off my chest: server stats are not an accurate representation of the popularity of individual RSS readers. A number of RSS readers default to checking for updates every hour, whereas FeedDemon defaults to checking every three hours. So, three times as many people would need to use FeedDemon for it to be ranked equally with these other apps.
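To put rough numbers on that, here is a small sketch of how raw hit counts could be normalized by polling interval. The hit counts and the "HourlyReader" name are made up for illustration, and the function assumes every copy of a reader runs around the clock at its default interval, which real usage won't match.

```python
def estimated_users(hits_per_day, poll_interval_hours):
    """Estimate how many users produced a given number of daily feed hits.

    Assumes each user's reader polls all day at a fixed interval.
    """
    polls_per_user_per_day = 24 / poll_interval_hours
    return hits_per_day / polls_per_user_per_day


# Illustrative (invented) server stats: (hits per day, default interval in hours)
stats = {"HourlyReader": (2400, 1), "FeedDemon": (800, 3)}

for reader, (hits, interval) in stats.items():
    print(reader, estimated_users(hits, interval))
```

Under those assumptions both readers come out at the same estimated number of users, even though one of them shows three times as many hits in the server log.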
Nick, from the looks of it, there are 2 Apache compression modules, the official mod_deflate in 2.x and the third-party mod_gzip. Which do you recommend? My host's server actually runs Apache 1.x.
Posted by: Joost Schuur | Wednesday, July 21, 2004 at 01:30 PM
I think JD put it best: would you rather have my 538 KB index.html page or my 10 KB RSS feed hit every hour?
Dar dar dar...
Posted by: JesterXL | Wednesday, July 21, 2004 at 01:49 PM
Nick, totally agree with you, with one exception:
>> but the argument is moot since many people find the UI
>> and feature set of web-based apps too limiting for
>> their needs and will always want a desktop application
>> (witness Outlook vs. HotMail).
If you look at GMail or OutPost, you will see that it is quite possible for web applications to deliver a rich (and agile) user experience almost on par with desktop apps.
The big disadvantage with web apps (and the major reason I am sticking with FeedDemon) is that they still don't support offline work in a reasonable way. In all other aspects I am very excited about the potential of the new generation of web apps.
On the other hand, I don't think web-based aggregators would make that much difference with regard to server load. It mostly depends on the update interval and an intelligent updating algorithm (which you described perfectly).
Posted by: Markus Breuer | Wednesday, July 21, 2004 at 02:04 PM
Joost, I'm afraid I'm out of my depth when it comes to Apache modules, so a Google search would probably turn up information that's more accurate than what I could say!
Posted by: Nick Bradbury | Wednesday, July 21, 2004 at 03:28 PM
Joost,
mod_gzip is for Apache 1.x
mod_deflate is for Apache 2.x
So if your host is running 1.x then mod_gzip is your friend.
Posted by: Darryl | Friday, July 23, 2004 at 08:29 AM
Because computers today are all syncing to an internet clock, some randomness is needed to avoid on-the-hour surges. RSS clients generally have a period in which to check a feed (e.g., once an hour between x o'clock and y o'clock, which ends up meaning that at x it goes to check). I was thinking when looking at Firefox's new Livemarks setup that such programs should automatically take a random number and use that to set the checkup time within the given period.
Would that help?
Posted by: stylo | Tuesday, August 03, 2004 at 11:41 PM
Is that how they work or do RSS readers check the feeds every "X" minutes from the last time checked? That's how I would do it, so I have to admit this is how I've assumed it was done. How does FeedDemon determine when to check a feed?
Posted by: Bill Curnow | Wednesday, August 04, 2004 at 12:23 AM
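The two scheduling ideas raised in the last two comments can be sketched side by side. This is only an illustration of the difference between them; the function names are hypothetical, times are plain seconds, and nothing here is confirmed as how FeedDemon or any other reader actually schedules its checks.

```python
import random


def next_check_fixed_interval(last_check, interval_seconds):
    """Check again a fixed interval after the previous check.

    Checks drift with whenever the reader was started, so they only
    line up on the hour by coincidence.
    """
    return last_check + interval_seconds


def next_check_with_jitter(period_start, period_seconds, rng=random):
    """Pick a random moment inside the allowed period.

    Spreads readers that share the same period (for example, hourly
    on the hour) across the whole period instead of its first second.
    """
    return period_start + rng.uniform(0, period_seconds)


# Example: a reader allowed to check once during the hour starting at t=3600.
print(next_check_fixed_interval(last_check=1000, interval_seconds=3600))
print(next_check_with_jitter(period_start=3600, period_seconds=3600))
```

If readers already use the fixed-interval approach, most of the spreading happens for free; the random offset matters mainly for clients that schedule against the wall clock.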