Matt was looking over my shoulder while I was reading feeds at the airport yesterday, and he noticed that I have a feed for Google-related posts at Slashdot. I told him I was scraping it together because Slashdot doesn't offer topic feeds (and I don't want to see everything at Slashdot), and Matt thought I should share the RSS-generating love with the world. I agreed, and here we are.
Here's the script I'm using to scrape Slashdot. It's in Perl, and you'll need a couple of modules installed first. Once they're installed, grab the code: slashfeed.pl
You'll also need the numeric topic ID for any Slashdot topic you want to track. They're easy to find. Those big icons in any Slashdot post link to a topic page. Click on one of those, and look for a number in the URL. For example, the URL of the Slashdot Google Topic Page contains the number 217. That's your Slashdot topic ID for posts about Google. You can browse the directory of all available Slashdot topics at the top of the Slashdot Search page.
To generate an RSS feed full of Slashdot Google goodness, run the script from a command prompt, passing in a topic ID like this:
% perl slashfeed.pl 217
The script will spit out a file that contains the latest Google-related posts, RSS style. Just make sure the script saves this file to a publicly addressable web folder (you might need to tweak the output file path on line 55). The final URL is simply the public address of that file on your server.
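If you're curious what "RSS style" output involves, here's a minimal sketch of turning a list of scraped posts into an RSS 2.0 document. This is not the code from slashfeed.pl. The names `xml_escape` and `build_rss`, the sample story, and the choice to print to standard output instead of a file are all my own assumptions for illustration.

```perl
use strict;
use warnings;

# Escape the five characters that are special in XML text.
# The ampersand must be handled first so later entities aren't double-escaped.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text =~ s/'/&apos;/g;
    return $text;
}

# Build an RSS 2.0 document from a channel title, a channel link,
# and a list of items (hash refs with 'title' and 'link' keys).
sub build_rss {
    my ($title, $link, @items) = @_;
    my $rss = qq{<?xml version="1.0" encoding="UTF-8"?>\n};
    $rss .= qq{<rss version="2.0">\n<channel>\n};
    $rss .= "<title>" . xml_escape($title) . "</title>\n";
    $rss .= "<link>" . xml_escape($link) . "</link>\n";
    $rss .= "<description>" . xml_escape($title) . "</description>\n";
    for my $item (@items) {
        $rss .= "<item>\n";
        $rss .= "<title>" . xml_escape($item->{title}) . "</title>\n";
        $rss .= "<link>" . xml_escape($item->{link}) . "</link>\n";
        $rss .= "</item>\n";
    }
    $rss .= "</channel>\n</rss>\n";
    return $rss;
}

# Hypothetical sample data standing in for scraped posts.
my @posts = (
    { title => 'Example story', link => 'http://example.com/story/1' },
);

print build_rss('Slashdot: Google', 'http://slashdot.org/', @posts);
```

To save the result somewhere your web server can see it, redirect the output to a file in your public web folder, or open a filehandle on that path and print to it.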
Throw your new URL in your feed reader, and run the script on a regular basis with a scheduler such as cron or Windows Task Scheduler. That's all there is to building a topic-specific Slashdot feed.
Scraping is notoriously brittle, so if Slashdot changes their HTML, this script will break. If that happens, view source on the Slashdot topic page and rewrite the regular expressions on line 39 or so of the script. That's the only labor-intensive bit in this script.
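To give a feel for the kind of regular expression you'd be rewriting, here's a rough sketch of regex-based scraping. The HTML below is made up, and so is the `extract_stories` helper. Slashdot's real markup is different (and changes), and the actual patterns in slashfeed.pl won't match this one.

```perl
use strict;
use warnings;

# Hypothetical markup for illustration only, not Slashdot's real HTML.
my $html = <<'HTML';
<div class="story"><a href="http://example.com/story/1">First headline</a></div>
<div class="story"><a href="http://example.com/story/2">Second &amp; last</a></div>
HTML

# Pull (link, title) pairs out of a page with one global regex.
# Each pass through the loop picks up the next match in the page.
sub extract_stories {
    my ($page) = @_;
    my @stories;
    while ($page =~ m{<div class="story"><a href="([^"]+)">([^<]+)</a>}g) {
        push @stories, { link => $1, title => $2 };
    }
    return @stories;
}

my @stories = extract_stories($html);
print "$_->{title} -> $_->{link}\n" for @stories;
```

The pattern is tied to the exact tags and attributes around each headline, which is why even a small redesign (a renamed class, an extra attribute) makes it stop matching. When that happens, the fix is to look at the new markup and adjust the pattern to fit.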