When a website has to syndicate a frequently updated list of items, it can publish an XML document, conforming to certain standards, where these items are kept (usually only the latest ones, and with a shortened description in place of the full content to keep the document lean). The typical formats are Atom and RSS, and since the main feed aggregators support both, they are somewhat equivalent.
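To make the idea concrete, here is a minimal sketch of what such an Atom document looks like and how a client can read it; the feed content and urls are made up, and the parsing uses only Python's standard library:

```python
import xml.etree.ElementTree as ET

# A minimal, hypothetical Atom document as a site might publish it:
# only the latest items, with a shortened summary instead of the full content.
ATOM_FEED = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <updated>2010-01-15T12:00:00Z</updated>
  <entry>
    <title>First post</title>
    <link href="http://example.com/posts/1"/>
    <id>http://example.com/posts/1</id>
    <updated>2010-01-15T12:00:00Z</updated>
    <summary>A shortened description...</summary>
  </entry>
</feed>"""

ns = {"atom": "http://www.w3.org/2005/Atom"}
root = ET.fromstring(ATOM_FEED)
for entry in root.findall("atom:entry", ns):
    title = entry.find("atom:title", ns).text
    link = entry.find("atom:link", ns).get("href")
    print(title, "->", link)
```

An RSS 2.0 document carries the same information with different element names (channel, item, description), which is why aggregators can treat the two formats as interchangeable.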
The list of items can be anything: this blog, Invisible to the eye, publishes a feed for new posts and one for comments; Wikipedia publishes the list of changes to a page; Twitter, the list of a user's updates.
Feeds are normally generated by a server-side script, in my case the same one that manages blog posts; it derives the feed fields (title, text, url) from the posts table in the database. When a feed is in place, the tedious process of opening the main page to check whether new articles have been published is standardized, and a program or web application can do it for us automatically.
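The generation step can be sketched in a few lines; the field names below stand in for whatever columns the posts table actually has, and the blog title and urls are invented:

```python
import xml.etree.ElementTree as ET

# Hypothetical rows fetched from the posts table.
posts = [
    {"title": "Hello world", "url": "http://example.com/posts/1",
     "summary": "The first post."},
    {"title": "Second post", "url": "http://example.com/posts/2",
     "summary": "More content."},
]

def build_rss(posts):
    """Build an RSS 2.0 document from the latest posts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example Blog"
    ET.SubElement(channel, "link").text = "http://example.com/"
    ET.SubElement(channel, "description").text = "Latest posts"
    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["url"]
        # Shortened description, not the full content, keeps the feed lean.
        ET.SubElement(item, "description").text = post["summary"]
        # The guid is the unique identifier aggregators use to spot new items.
        ET.SubElement(item, "guid").text = post["url"]
    return ET.tostring(rss, encoding="unicode")

print(build_rss(posts))
```

A real script would also limit the query to the last N posts and set the response's Content-Type accordingly.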
The last step is to take advantage of autodiscovery: a special link tag placed in the head section of an HTML document can point to an XML feed, which browsers will then present as available for subscription. In Firefox, for instance, a small icon appears in the location bar when a feed exists as an alternate version of a page's content.
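This is roughly what a browser or aggregator does when it scans a page for that tag; the sample page is fabricated, and the scan uses only the standard library's html.parser:

```python
from html.parser import HTMLParser

# A made-up page carrying the autodiscovery link tag in its head section.
PAGE = """<html><head>
<title>Example Blog</title>
<link rel="alternate" type="application/atom+xml"
      title="Example Blog feed" href="http://example.com/feed.atom"/>
</head><body>...</body></html>"""

class FeedLinkFinder(HTMLParser):
    """Collect the feed urls a browser would offer for subscription."""
    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if (tag == "link" and a.get("rel") == "alternate"
                and a.get("type") in ("application/atom+xml",
                                      "application/rss+xml")):
            self.feeds.append(a.get("href"))

finder = FeedLinkFinder()
finder.feed(PAGE)
print(finder.feeds)
```

The rel="alternate" and type attributes are what distinguish a feed link from an ordinary stylesheet or icon link in the same head section.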
On the other side, feed aggregators manage the list of feeds you have subscribed to. When you discover a feed, you can put its url in a program or web application that will check the feed periodically, every few hours, for you.
Every time the aggregator gathers data from your feeds, it lists the new items you have not already read and organizes them as you like. When you want to check your 100 favorite blogs for new content, the only action needed on your side is to open Google Reader and scan the list of new articles. Every post has a globally unique url which acts as a primary key.
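The primary-key role of that unique id is what makes deduplication trivial; a minimal sketch, with illustrative item data in place of real fetches:

```python
# Each item's unique id (typically its url) lets the aggregator tell
# new entries apart from ones it has already shown.
def new_items(fetched, seen_ids):
    """Return items not yet read, marking them as seen."""
    fresh = [item for item in fetched if item["id"] not in seen_ids]
    seen_ids.update(item["id"] for item in fresh)
    return fresh

seen = set()
first_fetch = [{"id": "http://example.com/posts/1", "title": "First"}]
second_fetch = first_fetch + [{"id": "http://example.com/posts/2",
                               "title": "Second"}]

print([i["title"] for i in new_items(first_fetch, seen)])   # ['First']
print([i["title"] for i in new_items(second_fetch, seen)])  # ['Second']
```

On the second poll the feed still contains the first item, but it is filtered out because its id was already seen.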
I prefer web applications for aggregating data, since they maintain the list of currently unread items across every machine I view them on. Ideally, they can also make a single ping per feed even if thousands of users are subscribed: they are the Blackboard of this pattern.
In a real publish/subscribe system the websites would notify the aggregator of new items, but HTTP is limited, so the inverse process happens to emulate a push style: the aggregator polls. However, extensions to this mechanism are in development.
Wrapping and mashups
Once a standardized way to let content flow, such as XML feeds, has been developed, new ideas come up; XML is a pillar of Web 2.0:
- FeedBurner is a wrapping service that wraps a feed and gives you a nice, short url: you can then link the FeedBurner version in your pages and gain automatic statistics, optional reformatting, and embellishment of the syndicated data when it is viewed in-browser (with links and buttons to pass the feed to the most famous web-based aggregators).
- A mashup is an application that combines external sources of content and provides a unified view of them. For instance, if you are a fan of php, you can subscribe to the feed of a php mashup which publishes the best php articles it finds by scanning thousands of feeds around the web, filtering out the poor ones.
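A toy version of such a mashup can be sketched by merging items from several feeds and keeping only those matching a topic; the feed contents here are inlined and invented, where a real service would fetch them over HTTP:

```python
# Two illustrative feeds, already parsed into item lists.
feed_a = [{"title": "Testing php applications", "url": "http://a.example/1"},
          {"title": "My holiday photos", "url": "http://a.example/2"}]
feed_b = [{"title": "php 5.3 namespaces", "url": "http://b.example/1"}]

def mashup(feeds, keyword):
    """Merge several feeds and keep items whose title mentions the keyword."""
    merged = [item for feed in feeds for item in feed]
    return [item for item in merged if keyword in item["title"].lower()]

for item in mashup([feed_a, feed_b], "php"):
    print(item["title"])
```

The output of the filter is itself republished as a feed, so the mashup plugs back into the same syndication pipeline it consumes.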
And if you do not have a feed aggregator yet, check out Google Reader.