Freakonomics and Partial Feeds

The popular Freakonomics Blog has recently moved to the New York Times, and along the way it's dropped its full-text RSS feed. Stephen Dubner wrote a thoughtful explanation of the reasons for the switch (basically, advertisers aren't comfortable with RSS), but judging from the comment section, the blog's readers are still upset.

Well, let me offer a gentle reminder that our very own full-text RSS tool continues to work, and was designed for exactly this purpose. I've tested it with the new, truncated Freakonomics feed and it works great. Why not give it a try and help push advertisers just a little bit closer to grappling with the internet?

Freakonomics full-text feed

UPDATE: Alas! It looks like the Freakonomics authors have adopted the intermittent and irritating habit of writing descriptions in the RSS description field rather than including excerpts of the actual text. That confuses our general-purpose algorithm.

However, it won't stop a dedicated Freakonomics fan from creating a blog-specific script to provide full feeds. Here, I'll even get them started:

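# Capture everything after the post-content div, up to the next run of dashes followed by ">".
# The /s flag lets "." span newlines, which a multi-line post needs.
m/<div\s+class="post-content">(.*?)-+>/is;
# Strip any <div> tags left in the text.
s/<\/?div[^>]*>//ig;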
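
For anyone who would rather work in Ruby, here is a rough sketch of the same two steps applied to a made-up sample page. The sample markup, the extract_post_content name, and the extra step that trims a dangling comment opener are all assumptions for illustration; the real Freakonomics markup may well differ, so treat this as a starting point rather than a working scraper.

```ruby
# Hypothetical sample page; the real Freakonomics markup may differ.
SAMPLE_HTML = <<~HTML
  <html><body>
  <div class="post-content">
  <div class="inner"><p>First paragraph.</p>
  <p>Second paragraph.</p></div>
  <!-- end of post -->
  </div>
  </body></html>
HTML

# Returns the post body as an HTML string, or nil when the marker div is absent.
# (extract_post_content is a name made up for this sketch.)
def extract_post_content(html)
  # Ruby's /m makes "." match newlines, like Perl's /s.
  # The lazy capture stops at the first run of dashes followed by ">".
  match = html.match(/<div\s+class="post-content">(.*?)-+>/im)
  return nil unless match

  match[1]
    .gsub(/<\/?div[^>]*>/i, "")  # strip leftover <div> tags
    .sub(/<!--.*\z/m, "")        # assumption: drop the dangling comment opener the capture leaves behind
    .strip
end

puts extract_post_content(SAMPLE_HTML)
```

From there, the extracted HTML would replace the item's description in the rebuilt feed. Regexes like these are brittle, so expect to adjust the pattern whenever the blog's template changes.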

Could you post a link for the full feed that works, please? Thanks in advance.

You might want to consult the comments on the Freakonomics post -- looks like someone's gone ahead and supplied a link.

If you're ever so inclined, you should take a look at Hpricot, the phenomenal Ruby-based HTML scraping library that supports XPath and CSS-style searching. It beats the pants off regexps.

http://code.whytheluckystiff.net/hpricot/

I wrote my own Full Text Freakonomics feed in a few minutes. It looks something like this:

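require 'open-uri'
require 'hpricot'
# `feed` is assumed to be an already-parsed RSS feed object
feed.items.each { |item|
  doc = Hpricot(open(item.link))  # fetch and parse the full post page
  item.description = (doc/".post-content").inner_html
}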

Hi, I understand you were successful in creating a "Full Text Freakonomics feed". I was just wondering if I could please have a copy of the script, or at least instructions for making the script. I would be forever thankful.

Thanks for your time.

Contact me at willports@gmail.com

That's fantastic, Eli. I'm a big fan of Why's stuff, although I have to admit that I haven't gotten as far into Ruby as I'd like. Hpricot looks very interesting. I was under the impression that Rubyful Soup, the Ruby port of Python's Beautiful Soup, was the Ruby screen-scraper of choice, but Hpricot's pedigree (and jQuery namedropping) has definitely got my attention.

It's great to have another screen-scraping tool. I've also been wanting to try Adrian Holovaty's templatemaker. I'm not a huge Python fan (although RoR's scaling problems are making me think about giving Django a closer look), but this looks like very impressive software to me.
