Jason Fried points out an interesting idea for an RSS reader: make it group items that link to the same story. This sounded like a simple task for FeedDemon's XSLT-based newspaper styling - I figured, just use the Muenchian Method to group items that have the same <link>. Problem solved!
Then I remembered that the RSS <link> element is the URL of the item itself rather than the URL of the story being discussed. Oops.
One possible solution is to group items by title, since items that talk about the same thing often have the same title. However, because this isn't always true, grouping items by title isn't entirely reliable. (If you're viewing this in FeedDemon, you can see what I mean by clicking here to apply a newspaper style that groups by title).
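For reference, title-based grouping with the Muenchian Method might look roughly like the sketch below. It assumes plain RSS 2.0 input with no namespaces, where each item sits inside its channel; the key name items-by-title is made up for illustration, and the XML that FeedDemon actually hands to a newspaper style may be shaped differently.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- Index every item by its whitespace-normalized title (hypothetical key name). -->
  <xsl:key name="items-by-title" match="item" use="normalize-space(title)"/>

  <xsl:template match="/">
    <html>
      <body>
        <!-- Muenchian test: keep only the first item of each title group. -->
        <xsl:for-each select="//item[generate-id() =
            generate-id(key('items-by-title', normalize-space(title))[1])]">
          <h2><xsl:value-of select="title"/></h2>
          <ul>
            <!-- List every item that shares this title, labeled with its channel title. -->
            <xsl:for-each select="key('items-by-title', normalize-space(title))">
              <li><a href="{link}"><xsl:value-of select="../title"/></a></li>
            </xsl:for-each>
          </ul>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Because the key matches on the exact title text, two posts about the same story with slightly different titles still land in separate groups, which is the reliability problem described above.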
What's really needed is a way to group by the links within each item's description, but my limited XSLT experience has left me scratching my head over how to do this. So...are there any XSLT gurus out there who want to take a crack at this?
You should be able to just use an XPath expression like item/description/a/@href, but it's likely to be unreliable since a description may include a lot of links... I'll think some more about it.
Posted by: Manuzhai | Wednesday, May 05, 2004 at 05:19 PM
Wouldn't this be difficult or even impossible for escaped descriptions? After all, the description element isn't actually XML; it's a big chunk of CDATA that just happens to be HTML when extracted. However, feeds that use XHTML in a namespace should be doable.
Given that FeedDemon seems to store the channels 'escaped' rather than 'namespaced', this might not actually be possible?
Posted by: Koz | Wednesday, May 05, 2004 at 05:34 PM
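An escaped description can't be navigated with XPath steps, but the XPath 1.0 string functions still work on its text. The sketch below is one rough way to pull out a link, assuming plain RSS 2.0 input with no namespaces and links written with a double-quoted, lowercase href attribute. It only finds the first link in each description.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//item">
      <!-- The escaped markup is plain text to XPath, so slice it as a string:
           take what follows the first href=" and cut at the next double quote.
           The result is an empty string when no such attribute is present. -->
      <xsl:variable name="url"
          select="substring-before(substring-after(description, 'href=&quot;'), '&quot;')"/>
      <xsl:value-of select="title"/>
      <xsl:text>: </xsl:text>
      <xsl:choose>
        <xsl:when test="$url != ''"><xsl:value-of select="$url"/></xsl:when>
        <xsl:otherwise>(no link found)</xsl:otherwise>
      </xsl:choose>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Single-quoted or unquoted href values, and uppercase HREF, slip past this; handling those in XSLT 1.0 takes more string slicing.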
SharpReader has been doing this for about a year now, I believe.
Posted by: Jeremy Zawodny | Wednesday, May 05, 2004 at 07:34 PM
My XSLT is a bit rusty, but could you use the contains() function to look through the descriptions?
Something like:
<xsl:for-each select="item[contains(description,'http://www.somesite.com/link/url.html')]">
...
</xsl:for-each>
Then again, that would mean you'd have to know a link beforehand to try and match against. I'll post this anyway in case it sparks another solution...
Posted by: Dan Cederholm | Thursday, May 06, 2004 at 12:40 AM
I have to agree with Dan. It seems to me like a job for a regular expression: find a URL and then check the other descriptions for it. You'd need XSLT version 2 or possibly a JavaScript extension.
The question is whether it's necessary to do this inside XSLT. It could be easier to do it in a general-purpose programming language.
Posted by: Jan Havrda | Thursday, May 06, 2004 at 05:41 AM
Yes, this could certainly be done within FeedDemon itself. I was just hopeful it could be done in XSLT :)
Posted by: Nick Bradbury | Thursday, May 06, 2004 at 07:17 AM
Try this: http://www.write.cz/rssgroup/index.xml
It goes through every description and looks for other descriptions containing the same URL (if any). Maybe it helps. Works under Internet Explorer only.
Here are the XML and XSL for download: http://www.write.cz/rssgroup/rssgroup.zip
Posted by: Jan Havrda | Thursday, May 06, 2004 at 09:26 AM
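The general idea of matching items by a URL found in their descriptions can be sketched in plain XSLT 1.0 as shown below. This is not the code from the download above, just an illustration under some assumptions: plain RSS 2.0 input with no namespaces, escaped descriptions, and only the first double-quoted href in each description being considered.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <xsl:template match="/">
    <html>
      <body>
        <xsl:for-each select="//item">
          <!-- Remember the current item so it can be excluded from its own matches. -->
          <xsl:variable name="self" select="generate-id()"/>
          <!-- First double-quoted href in the description, or '' if there is none. -->
          <xsl:variable name="url"
              select="substring-before(substring-after(description, 'href=&quot;'), '&quot;')"/>
          <h2><xsl:value-of select="title"/></h2>
          <!-- Skip the search when $url is empty: contains(x, '') is always
               true, so every other item would otherwise count as related. -->
          <xsl:if test="$url != ''">
            <ul>
              <xsl:for-each select="//item[generate-id() != $self]
                                          [contains(description, $url)]">
                <li><a href="{link}"><xsl:value-of select="title"/></a></li>
              </xsl:for-each>
            </ul>
          </xsl:if>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Since contains() is a substring test, a short URL such as a site's home page also matches any longer URL that begins with it, so results can be noisier than an exact comparison would give.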
Nick,
Do you know what the link is in the first place? For example, will there be an XSLT parameter or XML node that specifies the link that's being matched against? Your real problem is keeping track of what links have previously been matched against [I think. There may be a solution to that problem as well].
Posted by: Randy Peterman | Thursday, May 06, 2004 at 10:21 AM
Jan -
Very cool, although while testing, http://www.example.com/ and http://www.example.com/index.htm are picked up as different URLs.
Posted by: David Seguin | Thursday, May 06, 2004 at 01:19 PM
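One possible hack for this case is to trim a trailing index.htm or index.html before comparing URLs. The sketch below assumes plain RSS 2.0 input with escaped descriptions; the template name normalize-url is made up for illustration. Treating index.htm as the default page of a directory is only a common server convention, so this is a heuristic and not a guarantee that two URLs are the same page.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: drop a trailing index.html or index.htm, keeping the slash. -->
  <xsl:template name="normalize-url">
    <xsl:param name="url"/>
    <xsl:variable name="len" select="string-length($url)"/>
    <xsl:choose>
      <!-- XPath 1.0 has no ends-with(), so compare the tail of the string by position. -->
      <xsl:when test="substring($url, $len - 10) = '/index.html'">
        <xsl:value-of select="substring($url, 1, $len - 10)"/>
      </xsl:when>
      <xsl:when test="substring($url, $len - 9) = '/index.htm'">
        <xsl:value-of select="substring($url, 1, $len - 9)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$url"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="/">
    <xsl:for-each select="//item">
      <!-- Print the normalized form of the first double-quoted href, one per line. -->
      <xsl:call-template name="normalize-url">
        <xsl:with-param name="url"
            select="substring-before(substring-after(description, 'href=&quot;'), '&quot;')"/>
      </xsl:call-template>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With this, http://www.example.com/index.htm comes out as http://www.example.com/, matching the bare form. Other variations (a missing trailing slash, different letter case in the host name, query strings) would each need their own rule.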
David, you're surely right. I just wanted to show a possible way to do it; it's going to need some hacks.
BTW, I've cleaned up that code a bit and made a non-JavaScript version (index2.xls). Download at http://www.write.cz/rssgroup/rssgroup.zip
Posted by: Jan Havrda | Friday, May 07, 2004 at 02:51 AM
Jan, this is great - thanks! I've taken your example and made it into a FeedDemon style, which can be downloaded from http://www.bradsoft.com/feeddemon/getstyles/1.0/Related.fdxsl
Posted by: Nick Bradbury | Friday, May 07, 2004 at 10:14 AM
Nick, looks nice. Just one thing: the select at line 62 should include a $url!='' condition; otherwise it marks all of the news items without links as related.
Posted by: Jan Havrda | Sunday, May 09, 2004 at 01:23 PM
Thanks for the correction, Jan - I've uploaded the changed FDXSL to the same location.
Posted by: Nick Bradbury | Tuesday, May 11, 2004 at 07:08 AM