Saturday 13 June 2009

Saving Private Twiters - a Blue Peter Approach

Twitter clients are pretty simple beasts and do not handle one basic need: saving your timeline of status updates. Most will fetch only the latest 200, since that is the maximum a single API call will return. The API does allow for paging, i.e. getting them in groups of 200 back to a maximum of 3,200, but I don't think that anyone bothers with this.
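To make the paging idea concrete, here is a minimal sketch of how the page URLs could be built. It assumes the timeline call accepts `count` and `page` query parameters; the helper name `timeline_page_urls` is made up for illustration and is not part of any library.

```php
<?php
// Hypothetical helper: builds the URLs needed to walk back through a
// timeline page by page. Assumes the API accepts `count` and `page`
// query parameters and serves nothing older than $maxStatuses.
function timeline_page_urls($baseUrl, $perPage = 200, $maxStatuses = 3200)
{
    $urls = array();
    $pages = (int) ceil($maxStatuses / $perPage); // 3200 / 200 = 16 pages
    for ($page = 1; $page <= $pages; $page++) {
        $urls[] = $baseUrl . '?' . http_build_query(array(
            'count' => $perPage,
            'page'  => $page,
        ));
    }
    return $urls;
}

$base = 'http://twitter.com/statuses/friends_timeline/twitteraccountname.xml';
$urls = timeline_page_urls($base);
echo count($urls), " requests needed\n";
echo $urls[0], "\n";
```

Each URL would then be fetched in turn, stopping early once a page comes back empty.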

To a certain extent this is not a problem, especially given the way Twitter is currently being used. However, it does not take a massive leap of imagination to realise that saving the dang things might just be useful. As part of a much larger project I am working on, I have the ability to save tweets, and I realised the other day that others may well want to be able to do the same thing.

So here is a Blue Peter style approach - take a server with PHP and MySQL and stick one together. The code is a stripped down version of its bigger brother.

It goes something like this:
  1. Create a connection object for Twitter
  2. Connect and get the last 200 updates in your friends timeline
  3. Load the returned element into an XML object
  4. Iterate through the XML elements and insert them to a database
Then you create a cron job on your server to run the script every few minutes (I do it every 2 minutes) and you have saved your tweets.
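Steps 3 and 4 can be tried out without touching the network. The sketch below parses a canned XML snippet (the values are invented, and the real response carries many more fields) and flattens each status into an array keyed by the column names of the table used in this post. The helper name `status_to_row` is hypothetical.

```php
<?php
// A canned response standing in for what the API returns; the values are
// made up for illustration.
$sample = <<<XML
<statuses>
  <status>
    <created_at>Sat Jun 13 10:15:00 +0000 2009</created_at>
    <id>2147483648</id>
    <text>Hello from the timeline</text>
    <user>
      <name>Example User</name>
      <screen_name>example</screen_name>
      <profile_image_url>http://example.com/avatar.png</profile_image_url>
    </user>
  </status>
</statuses>
XML;

// Hypothetical helper: flattens one <status> element into a plain array.
// Every value is cast to a string so no SimpleXMLElement objects leak out.
function status_to_row(SimpleXMLElement $status)
{
    return array(
        'TweetID'         => (string) $status->id,
        'TweetText'       => (string) $status->text,
        'TweetName'       => (string) $status->user->name,
        'TweetScreenName' => (string) $status->user->screen_name,
        'TweetPrfImgUrl'  => (string) $status->user->profile_image_url,
        'TweetCreated'    => gmdate('Y-m-d H:i:s', strtotime((string) $status->created_at)),
    );
}

// Step 3: load the returned document into an XML object.
$channel = new SimpleXMLElement($sample);

// Step 4: iterate through the status elements.
$rows = array();
foreach ($channel->status as $status) {
    $rows[] = status_to_row($status);
}
print_r($rows);
```

In the real script each row would be written to the database instead of printed.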

The first thing you will need to do is to set up a database table. I use one called UserSavedTweets with the following structure:-

CREATE TABLE UserSavedTweets (
  UserID int(11) NOT NULL,
  ServerID int(11) NOT NULL,
  UserHandle varchar(250) NOT NULL,
  TweetID bigint(11) NOT NULL,
  TweetText varchar(250) NOT NULL,
  TweetName varchar(250) NOT NULL,
  TweetScreenName varchar(250) NOT NULL,
  TweetPrfImgUrl varchar(250) NOT NULL,
  TweetCreated datetime NOT NULL,
  TweetURL varchar(250) NOT NULL,
  PRIMARY KEY (UserID,ServerID,UserHandle,TweetID)
);

A couple of things to note here. The first is that this table is designed to handle multiple users on multiple Twitter-type servers, such as Laconi.ca based servers, so the primary key is a compound of the various IDs I use in other tables together with the TweetID.

As for the TweetID itself, you will notice that this is a bigint, which more than happily takes care of the potential Twitpocalypse issues - at least for the next couple of months anyhow.
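On the PHP side, the same concern applies if an ID is cast to an integer on a 32-bit build. One way round it, sketched below, is to keep IDs as strings of digits and only check their shape. Both helper names are hypothetical, and the approach is an assumption of mine rather than something the larger project necessarily does.

```php
<?php
// Largest value a signed 32-bit integer can hold; IDs past this point are
// what the Twitpocalypse refers to.
define('MAX_SIGNED_32', '2147483647');

// Hypothetical helper: accepts an ID only if it is a non-empty string of
// digits, so it never has to be cast to a PHP int.
function is_valid_tweet_id($id)
{
    return is_string($id) && ctype_digit($id);
}

// Hypothetical helper: compares a digit string against the 32-bit limit
// without converting it to an integer.
function exceeds_signed_32($id)
{
    $id = ltrim($id, '0');
    if (strlen($id) !== strlen(MAX_SIGNED_32)) {
        return strlen($id) > strlen(MAX_SIGNED_32);
    }
    // Equal-length digit strings sort the same way as the numbers do.
    return strcmp($id, MAX_SIGNED_32) > 0;
}

var_dump(is_valid_tweet_id('2147483648'));  // bool(true)
var_dump(is_valid_tweet_id('12; DROP'));    // bool(false)
var_dump(exceeds_signed_32('2147483648'));  // bool(true)
var_dump(exceeds_signed_32('2147483647'));  // bool(false)
```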

Now we need a method to go and get the tweets. I found and used, with some alteration, a Twitter connection object from David Grudl called twitter.class.php.

Here is the current code for that, which you can save in twitter.class.php


<?php
/**
* Twitter for PHP - library for sending messages to Twitter and receiving status updates.
*
* @author David Grudl
* @copyright Copyright (c) 2008 David Grudl
* @license New BSD License
* @link http://phpfashion.com/
* @version 1.0
*/
class Twitter
{
/** @var int */
public static $cacheExpire = 1800; // 30 min

/** @var string */
public static $cacheDir;

/** @var user name */
private $user;

/** @var password */
private $pass;

/** tweet retrieval param */
private $retrieve_count;

/** the server base address for the api **/
private $serverapi;



/**
* Creates object using your credentials.
* @param string user name
* @param string password
* @throws Exception
*/
public function __construct($user, $pass)
{
if (!extension_loaded('curl')) {
throw new Exception('PHP extension CURL is not loaded.');
}

$this->user = $user;
$this->pass = $pass;
$this->retrieve_count = 20;
$this->serverapi = "http://twitter.com";
}

public function set_retrieve_count( $count ){
$this->retrieve_count = $count;
}

public function set_serverapi( $serverapi ){
$this->serverapi = $serverapi;
}



/**
* Tests if user credentials are valid.
* @return boolean
* @throws Exception
*/
public function authenticate()
{
$xml = $this->httpRequest("$this->serverapi/account/verify_credentials.xml");
return (bool) $xml;
}



/**
* Sends message to the Twitter.
* @param string message encoded in UTF-8
* @return mixed ID on success or FALSE on failure
*/
public function send($message)
{
$xml = $this->httpRequest(
'https://twitter.com/statuses/update.xml',
array('status' => $message)
);
return $xml && $xml->id ? (string) $xml->id : FALSE;
}



/**
* Returns the most recent statuses posted by you and (optionally) your friends.
* Amended to allow more statuses to be returned - the default is 20, the maximum is 200
* @param bool with friends?
* @return SimpleXMLElement
* @throws Exception
*/
public function load($withFriends)
{
$line = $withFriends ? 'friends_timeline' : 'user_timeline';

$xml = $this->httpRequest("$this->serverapi/statuses/$line/$this->user.xml?count=$this->retrieve_count", FALSE);

if (!$xml || !$xml->status) {
throw new Exception('Cannot load channel.');
}
return $xml;
}



/**
* Process HTTP request.
* @param string URL
* @param array of post data (or FALSE = cached get)
* @return SimpleXMLElement|FALSE
*/
private function httpRequest($url, $post = NULL)
{
if ($post === FALSE && self::$cacheDir) {
$cacheFile = self::$cacheDir . '/twitter.' . md5($url) . '.xml';
if (@filemtime($cacheFile) + self::$cacheExpire > time()) {
return new SimpleXMLElement(@file_get_contents($cacheFile));
}
}

$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERPWD, "$this->user:$this->pass");
curl_setopt($curl, CURLOPT_HEADER, FALSE);
curl_setopt($curl, CURLOPT_TIMEOUT, 20);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($curl, CURLOPT_HTTPHEADER, array('Expect:'));

if ($post) {
curl_setopt($curl, CURLOPT_POST, TRUE);
curl_setopt($curl, CURLOPT_POSTFIELDS, $post);
}
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE); // no echo, just return result
$result = curl_exec($curl);
$ok = curl_errno($curl) === 0 && curl_getinfo($curl, CURLINFO_HTTP_CODE) === 200; // code 200 is required

if (!$ok) {
if (isset($cacheFile)) {
$result = @file_get_contents($cacheFile);
if (is_string($result)) {
return new SimpleXMLElement($result);
}
}
return FALSE;
}

if (isset($cacheFile)) {
file_put_contents($cacheFile, $result);
}

return new SimpleXMLElement($result);
}

}
?>

So now we need the script to use this, get the latest posts and save them to your table.

<?php
require_once 'twitter.class.php';

// create a twitter account object - change the first param to your username and the second to your password
$twitter = new Twitter('twitteraccountname', 'xxxxxxxxxx');
$twitter->set_retrieve_count(200);
$twitter->set_serverapi('http://twitter.com');

$withFriends = TRUE;
$channel = $twitter->load($withFriends);

include("dblib.php");

$link = openDB();

// hard coded elements - normally picked up from current user and server usage
$UserID = 1;
$ServerID = 1;
$UserHandle = 'twitteraccountname';

// IMPORTANT - every value taken from the API is escaped before it goes into
// the query, to prevent injection attacks and broken inserts on quotes

foreach ($channel->status as $status){
    $TweetID = (string) $status->id;
    if (!ctype_digit($TweetID)) {
        continue; // skip anything that is not a plain numeric ID
    }
    $TweetText = mysql_real_escape_string((string) $status->text);
    $TweetName = mysql_real_escape_string((string) $status->user->name);
    $TweetScreenName = mysql_real_escape_string((string) $status->user->screen_name);
    $TweetPrflImgUrl = mysql_real_escape_string((string) $status->user->profile_image_url);
    $TweetCreated = date("Y-m-d H:i:s", strtotime((string) $status->created_at));
    // NOTE: TweetURL holds the profile image URL again, as no separate tweet URL is built here
    $TweetURL = $TweetPrflImgUrl;

    $query = "Insert into UserSavedTweets values($UserID,$ServerID,'$UserHandle',$TweetID,'$TweetText','$TweetName','$TweetScreenName','$TweetPrflImgUrl','$TweetCreated','$TweetURL')";

    $result = mysql_query( $query );

    if ($result == TRUE){
        echo "Inserted ".$TweetText."\n";
    } else {
        echo "No Insertion $query\n";
    }
}

?>

And that's it for the script. Then set up the cron job on the server and get it running.

Obviously you would need a way to access them, but that is pretty straightforward, standard PHP and MySQL stuff.
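As a rough idea of what reading the tweets back might look like, here is a sketch that pulls the newest saved tweets for one user. It uses PDO with bound parameters rather than the mysql_* calls in the script above, which is my substitution and not how the larger project does it. The helper name `fetch_saved_tweets` and the connection details in the comment are placeholders.

```php
<?php
// Hypothetical helper: returns the newest saved tweets for one user.
// Values are bound as integers, so nothing is spliced into the SQL text.
function fetch_saved_tweets(PDO $pdo, $userId, $limit = 20)
{
    $stmt = $pdo->prepare(
        'SELECT TweetID, TweetScreenName, TweetText, TweetCreated
           FROM UserSavedTweets
          WHERE UserID = :uid
          ORDER BY TweetCreated DESC
          LIMIT :lim'
    );
    $stmt->bindValue(':uid', (int) $userId, PDO::PARAM_INT);
    $stmt->bindValue(':lim', (int) $limit, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Usage, with placeholder connection details that you would replace:
// $pdo = new PDO('mysql:host=localhost;dbname=tweets', 'dbuser', 'dbpass');
// foreach (fetch_saved_tweets($pdo, 1) as $row) {
//     echo $row['TweetCreated'], ' ', $row['TweetScreenName'], ': ', $row['TweetText'], "\n";
// }
```

Ordering by TweetCreated gives a newest-first timeline; a date range or a search on TweetText could be added to the WHERE clause in the same way.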
