<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://labs.echoditto.com" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>EchoDitto Labs - Twitterbot: Lessons Learned - Comments</title>
 <link>http://labs.echoditto.com/twitterbot-lessons-learned</link>
 <description>Comments for &quot;Twitterbot: Lessons Learned&quot;</description>
 <language>en</language>
<item>
 <title>Twitterbot: Lessons Learned</title>
 <link>http://labs.echoditto.com/twitterbot-lessons-learned</link>
 <description>&lt;p&gt;It wasn&#039;t too terribly long ago that I posted a &lt;a href=&quot;http://labs.echoditto.com/twitterbot&quot;&gt;simple Twitterbot&lt;/a&gt; here.  I&#039;m sure you&#039;ll be absolutely shocked to learn that it had a few, uh, shortcomings.  If you&#039;re making a TwitterBot (and particularly one in Ruby), here&#039;s some advice:&lt;/p&gt;

&lt;dl&gt;
&lt;dt&gt;Forget about AIM&lt;/dt&gt;
&lt;dd&gt;&lt;a href=&quot;http://labs.echoditto.com/rssbot&quot;&gt;I&#039;ve written AIM bots before&lt;/a&gt;, and AIM is the primary IM network we use at EchoDitto.  Consequently, the net-toc gem is the first one I turned to when I decided to write a queryable Twitter bot.  But, as Ethan helped me figure out, Twitter&#039;s AIM interface is a bit flaky, and not at all worth your time.  Stick with Jabber, which is considerably more solid.  This also has the benefit of making the &lt;a href=&quot;http://rubyforge.org/projects/jabber-bot/&quot;&gt;jabber-bot&lt;/a&gt; gem available to you, which should simplify your life considerably.&lt;/dd&gt;

&lt;dt&gt;Making friends is hard&lt;/dt&gt;
&lt;dd&gt;...in Twitter, anyway.  As you&#039;ll no doubt quickly realize, it&#039;s only possible to direct-message Twitter users who are following your account.  That means that in order for you to receive messages, you&#039;ll need to periodically check for new followers of your bot&#039;s Twitter account and start following them.&lt;br/&gt;&lt;br/&gt;

This is nominally accomplished by making an authenticated call to &lt;a href=&quot;http://www.techquilashots.com/2007/04/10/how-to-create-a-twitter-bot/&quot;&gt;http://twitter.com/followers/befriend_all&lt;/a&gt;. Simple, right?  The only downside is that it doesn&#039;t work.  At all.  And the problem can&#039;t be solved via the API &amp;mdash; at least, not that I&#039;ve been able to figure out.&lt;br/&gt;&lt;br/&gt;

So you&#039;re left to screen-scrape your way out of this mess.  Here&#039;s some horrifying code that accomplishes the feat via the invaluable &lt;a href=&quot;http://rubyforge.org/projects/mechanize/&quot;&gt;mechanize&lt;/a&gt; gem:&lt;br/&gt;&lt;br/&gt;

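&lt;blockquote&gt;&lt;pre&gt;require &#039;rubygems&#039;
require &#039;mechanize&#039;
require &#039;net/http&#039;
require &#039;uri&#039;

# befriend everyone following your account on Twitter
def befriend_all
   agent = WWW::Mechanize.new

   # log in, retrying the page fetch up to three times
   attempts = 0
   begin
      page = agent.get(&#039;http://twitter.com/login&#039;)
   rescue
      sleep 3
      attempts += 1
      retry if attempts&lt;3
      return   # give up; the next run will try again
   end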
   form = nil
   page.forms.each do |f|
      if f.has_field?(&#039;username_or_email&#039;)
         form = f
      end
   end

   if form!=nil
      form.username_or_email = YourBotsConfig::TWITTER_user
      form.password = YourBotsConfig::TWITTER_password
      attempts = 0
      begin
         agent.submit(form)
      rescue
         sleep 3
         attempts += 1
         retry if attempts&lt;3
      end

      # grab each non-followed follower and follow them
      result = agent.get(&#039;http://twitter.com/followers&#039;)
      (result/&quot;div.person-actions&quot;).each do |button_container|				
         need_to_follow = button_container.inner_html.scan(/&lt;button\s*class=&quot;small&quot;\s*onclick=&quot;followPerson\((\d+)[^\)\d]*\)[^&quot;]*&quot;&gt;follow&lt;\/button&gt;/i)
         need_to_follow.each do |match|
            user_id = match[0].to_i
            attempts = 0
            begin
               response = Net::HTTP.post_form(URI.parse(&quot;http://#{YourBotsConfig::TWITTER_user}:#{YourBotsConfig::TWITTER_password}@twitter.com/friendships/create/#{user_id}.json&quot;), {})	
            rescue
               sleep 3
               attempts += 1
               retry if attempts&lt;3
            end
         end
      end			
   end
end&lt;/pre&gt;&lt;/blockquote&gt;

Pretty ugly, huh?  This brings up my third point...&lt;/dd&gt;

&lt;dt&gt;Expect network failures.  When possible, fork.&lt;/dt&gt;
&lt;dd&gt;I&#039;m still new to Ruby, as the above code no doubt demonstrates.  So it came as a bit of a surprise to see so many HTTP requests timing out.  It seems that Ruby&#039;s default HTTP timeout is a bit low, but my skills (and level of bravery) aren&#039;t up to the task of adjusting it myself.  Instead I just have my code try the request three times, then give up.  It&#039;s deeply kludgy, but good enough.&lt;br/&gt;&lt;br/&gt;
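For what it&#039;s worth, raising the timeout looks to be a matter of two settings on Net::HTTP.  What follows is only a minimal sketch, not the bot&#039;s actual code: with_retries and fetch are hypothetical helper names, and the 30-second timeout is an assumed value.  It also folds the try-three-times-then-give-up pattern into one helper:&lt;br/&gt;&lt;br/&gt;

&lt;blockquote&gt;&lt;pre&gt;require 'net/http'
require 'uri'

# Hypothetical helper: run the block, retrying until max_attempts is reached.
# Returns the block's value, or nil once every attempt has failed.
def with_retries(max_attempts = 3, delay = 3)
   attempts = 0
   begin
      attempts += 1
      yield
   rescue StandardError, Timeout::Error
      if max_attempts > attempts
         sleep delay
         retry
      end
      nil
   end
end

# Hypothetical helper: fetch a URL with explicit timeouts (assumed: 30 seconds)
# instead of relying on the library defaults.
def fetch(url, seconds = 30)
   uri = URI.parse(url)
   http = Net::HTTP.new(uri.host, uri.port)
   http.open_timeout = seconds   # time allowed to open the connection
   http.read_timeout = seconds   # time allowed for each read
   with_retries { http.get(uri.request_uri) }
end&lt;/pre&gt;&lt;/blockquote&gt;

Any network call can be wrapped the same way, e.g. with_retries { agent.submit(form) }; a nil result means all three attempts failed.&lt;br/&gt;&lt;br/&gt;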

The situation is exacerbated by Twitter&#039;s not-entirely-infrequent outages, and the fact that my bot&#039;s utility comes from scraping another fairly flaky third party site.  Could I write endless error-handling routines?  Yes.  Yes I could.  But I&#039;d rather just fork a new process and live with the consequences of it occasionally arriving stillborn.  The befriend_all routine is a good example of why this is fine: if you fail to befriend somebody, no big deal &amp;mdash; the bot will presumably get &#039;em when it spawns again in a minute or two.  Above all, avoid risking your daemon&#039;s death at the hands of a failed connection.&lt;br/&gt;&lt;br/&gt;

The downside to this approach is, of course, system resource use.  But given that even the NYTimes has &lt;a href=&quot;http://open.blogs.nytimes.com/2007/10/01/twitter-subscribers-pass-1000/&quot;&gt;just over 1000 Twitter followers&lt;/a&gt;, the odds of your bot dying under an avalanche of Ruby-interpreting processes seem low.  Cross that bridge when you come to it.
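&lt;br/&gt;&lt;br/&gt;
The fork-and-forget approach described here might be sketched as follows.  This is an illustration under assumptions rather than the bot&#039;s real code: run_in_child is a hypothetical helper name, and it presumes a Unix-like system where Ruby&#039;s fork is available:&lt;br/&gt;&lt;br/&gt;

&lt;blockquote&gt;&lt;pre&gt;# Hypothetical helper: run a risky job in a child process so that a crash
# there cannot take the parent daemon down with it.
# Returns the waiter thread created by Process.detach.
def run_in_child
   pid = fork do
      begin
         yield
         exit!(0)   # exit! skips at_exit handlers inherited from the parent
      rescue Exception
         exit!(1)   # die quietly; the next run will try again
      end
   end
   Process.detach(pid)   # reap the child in the background, no zombies
end&lt;/pre&gt;&lt;/blockquote&gt;

A daemon loop could then call run_in_child { befriend_all } every minute or two; if the child dies, the parent carries on regardless.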
&lt;/dd&gt;

&lt;dt&gt;Use the gems!&lt;/dt&gt;
&lt;dd&gt;Although my above code doesn&#039;t demonstrate it, I plan to port things over to one of the available &lt;a href=&quot;http://rubyforge.org/projects/twitter4r/&quot;&gt;Twitter&lt;/a&gt; &lt;a href=&quot;http://rubyforge.org/projects/twitter/&quot;&gt;gems&lt;/a&gt;.  The simplicity of Twitter&#039;s REST API makes using direct HTTP calls sorely tempting &amp;mdash; why incur another dependency, right?  But if you&#039;re like me, you&#039;ll find you&#039;re ultimately better off leaning on more accomplished Rubyists&#039; work.&lt;/dd&gt;

&lt;dt&gt;Talk to Twitter&lt;/dt&gt;
&lt;dd&gt;The guys at Twitter are extremely friendly and helpful.  You may find your bot exhibiting unexpected behavior during its development.  Likely as not, this will be due to Twitter&#039;s abuse-prevention routines.  Things became a lot more comprehensible once Twitter whitelisted my bot and debug-query accounts.  I can&#039;t promise they&#039;ll do the same for you, of course, but if you explain what you&#039;re trying to accomplish I bet they&#039;ll be happy to help.&lt;/dd&gt;
&lt;/dl&gt;

&lt;p&gt;And that brings us up to the current extent of my TwitterBot knowledge.  More updates on my code&#039;s horrific shortcomings as they become apparent...&lt;/p&gt;

</description>
 <comments>http://labs.echoditto.com/twitterbot-lessons-learned#comments</comments>
 <category domain="http://labs.echoditto.com/taxonomy/term/114">aim</category>
 <category domain="http://labs.echoditto.com/taxonomy/term/88">bot</category>
 <category domain="http://labs.echoditto.com/taxonomy/term/118">gems</category>
 <category domain="http://labs.echoditto.com/taxonomy/term/115">jabber</category>
 <category domain="http://labs.echoditto.com/taxonomy/term/117">mechanize</category>
 <category domain="http://labs.echoditto.com/taxonomy/term/2">ruby</category>
 <category domain="http://labs.echoditto.com/taxonomy/term/29">twitter</category>
 <category domain="http://labs.echoditto.com/taxonomy/term/116">twitterbot</category>
 <pubDate>Wed, 07 Nov 2007 09:28:56 -0800</pubDate>
 <dc:creator>Tom</dc:creator>
 <guid isPermaLink="false">51 at http://labs.echoditto.com</guid>
</item>
</channel>
</rss>
