Partial-text RSS feeds are a pet peeve of mine. I'm not alone: I've read about Dave Winer and Steve Rubel's dislike of the practice. I'm sure there are a lot of other RSS users who are similarly irked by it.
So, after having a post-workout algorithmic epiphany (it's the best time for them), I started work on a little project to fix this annoyance — and ended up quite pleased with the result. You might find it useful, too: it's a little script that creates full-text RSS feeds from partial feeds. Just enter the URL of a partial feed in the box below and hit submit. You'll be directed to a URL that will (hopefully) provide a full-text version of the feed you specified.
I've been through a few different versions of the algorithm, but this one seems to be fairly universal and stable. It won't work for every partial-text feed, but it seems to work for a lot of them. I'm sure it could be better, which tempts me to open source the algorithm and invite people to improve upon it. But I won't — not yet, anyway.
I'm sensitive to the pressures that make bloggers use partial text feeds — some of my friends depend on selling advertising to support their sites. Unfortunately, RSS simply isn't respected by marketers and their clients. Offering a full text feed means fewer page views, which means less revenue — I've been told this bluntly by a friend who wanted to offer full text, did so, then noticed his revenues were shrinking. It's hard to fault him for returning to partial-text feeds.
But this situation isn't a problem with RSS; it's a problem with the ad industry. It's long past time for people to realize that if they give content away on the web they'll be unable to control how others choose to consume it. Inconveniencing users is not an acceptable solution to advertisers' inability to adopt new metrics.
Still, I wouldn't want to offer a feature that middlemen can resell at the expense of bloggers. So while I do want to open this up, I don't want to make things easy for the unscrupulous. This feature does need to pass out of my hands — its proper place is in the RSS reader, both for performance reasons and in order to eliminate one class of countermeasures that bloggers could take. Maybe I'll try my hand at adapting the code for Vienna.
A few technical notes: depending on the site, some entries may come back with comments or other cruft attached. Fellow geeks can trim those off by specifying URL-encoded regexes, passed in the querystring as parameters regex0 – regex9 (note that an outstanding issue with PHP magic quotes means that the + character doesn't work; use {1,} instead). I'd encourage users who create regexes for feeds to share them by tagging the URL with "fulltextrss" on del.icio.us. There are already a few examples available here.
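For anyone who would rather script this than encode regexes by hand, here is a minimal Python sketch of how such a URL might be assembled. The `url` and `regex0` to `regex9` parameter names come from this post and from the example link in the comments; the helper name `build_fulltext_url`, the decision to URL-encode the feed address, and the automatic rewrite of `+` to `{1,}` are assumptions made for illustration, not part of the service.

```python
from urllib.parse import quote

# Base address of the service, as it appears in the example link in the comments.
SERVICE_URL = "http://labs.echoditto.com/projects/fulltextrss/"


def build_fulltext_url(feed_url, regexes=()):
    """Hypothetical helper: build a request URL with up to ten regex parameters."""
    if len(regexes) > 10:
        raise ValueError("only regex0 through regex9 are available")
    # Assumption: the service accepts a percent-encoded feed address.
    parts = ["url=" + quote(feed_url, safe="")]
    for i, pattern in enumerate(regexes):
        # The post reports that "+" does not work, so spell it as {1,}.
        # This is naive: it would also rewrite an escaped literal "\+".
        safe_pattern = pattern.replace("+", "{1,}")
        parts.append("regex%d=%s" % (i, quote(safe_pattern, safe="")))
    return SERVICE_URL + "?" + "&".join(parts)


if __name__ == "__main__":
    # Delimiters and flags are written PHP-style, e.g. /pattern/i
    print(build_fulltext_url("http://example.com/feed", ["/<img.*?>/i"]))
```

Encoding `/<img.*?>/i` this way yields `%2F%3Cimg.%2A%3F%3E%2Fi`, the same string suggested in the comments for stripping images.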
Finally, please note that the service employs PEAR's function caching on a 15 minute timeout. If the results you're getting aren't up-to-date, just be patient (or alter one of the regex parameters).
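To show why altering a regex parameter gets around a stale result, here is a small Python sketch of time-limited function caching. It is not the PEAR code the service runs. It assumes, as function caches generally do, that the cached result is keyed on the function's arguments, and the names `cached` and `fetch_full_text` are invented for this example.

```python
import time


def cached(ttl_seconds, clock=time.monotonic):
    """Decorator sketch: remember results per argument tuple for ttl_seconds."""
    def decorator(func):
        store = {}  # maps args -> (expiry time, result)

        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh, skip the expensive call
            result = func(*args)
            store[args] = (now + ttl_seconds, result)
            return result

        return wrapper
    return decorator


@cached(ttl_seconds=15 * 60)
def fetch_full_text(feed_url, regexes):
    # Placeholder for the slow part (fetching pages, extracting text).
    # regexes must be a tuple so the arguments can serve as a dict key.
    return "full text of %s filtered by %r" % (feed_url, regexes)
```

Calling `fetch_full_text` twice with identical arguments inside the timeout returns the stored result. Changing any argument, such as one of the regexes, produces a different key and therefore a fresh fetch.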
I love you. Great work.
Hi
I love this tool.
What regex expression would I use to remove all images?
Thanks
You should be able to use one like the following:
regex0=%2F%3Cimg.%2A%3F%3E%2Fi
for example, here's this blog's feed without images (not that there are many):
http://labs.echoditto.com/projects/fulltextrss/?url=http://labs.echoditt...
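To see what that parameter does, here is a short Python sketch that decodes it and applies it to a snippet of HTML. It assumes the service deletes whatever the regex matches, which is how the post describes trimming cruft. The function name `apply_php_style_regex` is made up for this example, and only the `i` flag is handled.

```python
import re
from urllib.parse import unquote


def apply_php_style_regex(encoded, html):
    """Decode a /pattern/flags regex and delete its matches from html."""
    pattern = unquote(encoded)  # e.g. "/<img.*?>/i"
    # Drop the leading delimiter, then split the body from the trailing flags.
    body, _, flags = pattern[1:].rpartition("/")
    re_flags = re.IGNORECASE if "i" in flags else 0
    return re.sub(body, "", html, flags=re_flags)


if __name__ == "__main__":
    sample = '<p>Hello <IMG src="a.png"> world</p>'
    print(apply_php_style_regex("%2F%3Cimg.%2A%3F%3E%2Fi", sample))
```

The decoded pattern is `/<img.*?>/i`: a case-insensitive, non-greedy match on each image tag.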
Thanks, Tom! This worked like a charm on the feed I was subscribing to.
Awesome!
freakonomics http://freakonomics.blogs.nytimes.com/ has managed to beat your tool. Is there any fix?
Well, no, there's no fix to the issue -- they've stopped putting excerpts in the description field, which prevents the general-purpose tool from being used.
But I think someone's taken my advice and produced a dedicated full-text feed:
http://feeds.feedburner.com/freakonomics-full
Tom -
Blogs that use the [read more...] links in their feed seem to defeat your web service. Ars Technica's is a good example - their feedlink here: http://feeds.arstechnica.com/arstechnica/BAaf
Great work.
- todd
After facing some difficulties, it finally works for my 'test blog'.
However, it is disturbing my AdSense block, i.e. no ads are shown in the designated place.
Please check my blog and provide some feedback.
Hmm. You might have to provide more detail -- it's plausible that it'd strip out adsense, but I'm not certain enough about what you're referring to to comment intelligently about it.
Hi,
I need the same kind of system, one that will produce clean full-text RSS. I need the ability to produce text only, or to skip things like images. Can anyone code this for me? I am ready to pay up to $30.
Please contact me at info@rapidshareonline.com
How can I get a plain-text full RSS feed, meaning no HTML tags?
i.e
Very interesting. Any chance you'll release the code for this?
I'm happy to share my code on a case-by-case basis, but I'm wary of releasing it completely into the wild for the reasons mentioned in the post -- it could be used to divert revenue from content authors to rent-seeking third parties.
Shoot me an email (tom (at) echoditto (dot) com) and I'll be happy to talk to you about how I got this thing working.
I'm sorry Ramesh, I'm afraid I don't really understand what you're asking. Is the issue the high-ascii characters? Those are admittedly a consistent problem with PHP -- which this is. Maybe you could try passing the feed through Yahoo Pipes? I'm afraid I'm not prepared to tackle unicode support.
Cool tool! I'm using it with www.Feedity.com for custom RSS web feeds.... awesome combo :)
Thanks a bunch for this. Very useful tool
hi Tom,
this is an awesome tool!
It searches for an update every 15 minutes? Did I get this right?
The server is very slow at the moment. You really do have to let this pass out of your hands.
I don't need to know the algorithm for stripping all the unwanted tags, but maybe you could explain to us how to set up such a service. I have been looking for a "homemade Yahoo Pipes" for a long time. Did you use SimplePie?
Could you send the source code via e-mail? I'll look at it and make some upgrades, then send the results to you.
Hi, could I also get the source code via e-mail? I really want to update some features etc... Thanks! And thanks a lot for all your work!
Interesting thing. Yet, it doesn't seem to work with Yahoo Groups. At least not with mine. Try: http://rss.groups.yahoo.com/group/thing-frankfurt/rss
Stefan
I know this is an old post but I love you. That is all.
I'm really intrigued by your simple script, and I'm just asking for protected access to the scripts you used in your little tool. I'd actually be willing to pay you, so please send an email to willports@gmail.com so we can talk further about the subject. Thanks, and what a great tool.
Wow, super! It's working. Thanks, my friend, very good.
Thank you very nice good work...
I am running my video blog :D Good man.
thank you my friend..
himmm that is super !
Hehe, nice, although partial RSS feeds are useful for people with limited traffic.
Vayyy thank you very much. Good job...
Wow, this tool is perfect. Exactly what I was looking for. A++ from a FeedJournal user.