Tuesday 16 June 2009

Talk About A Revolution

June 15th 2009 was a strange day. Hundreds of thousands, possibly over a million, people took to the streets in Iran to protest against election results that they believed had been rigged. Is this the start of a revolution? I am writing this a day later and I do not know.

What I do know is that the way it was perceived here in the west was very different to how it would have been just a year ago, and that was at least in part down to Twitter. During the day there were accounts with decent claims to being genuine voices from Tehran, keeping the world updated on events and managing to get photo and video files out to support their claims. So persuasive were these voices that the traditional news media were following them almost ahead of their own on-the-spot journalists and traditional sources.

This was the true revolution of June 15th. As information came out of Iran it was viewed by thousands and then resent (retweeted) so that it would be viewed by even more. All of this happened in pretty much real time, and it had nothing to do with the large news agencies. In fairness to the news agencies, they are beginning to catch on; however, there is resistance rooted in what they see as their position as the true arbiters of quality reporting.

This is a tweet from Jeff Jarvis from the afternoon of 15th June: "I emphasized to a reporter today that Twitter is not the news source. It's a source of tips & temperature & sources. Reporting follows." Also from his site (posted June 15th) comes the following on the way that the NY Times operates: "Because The Times’ brand hinges on it as a product that has been curated and edited and checked and polished - note editor Bill Keller’s language on The Daily Show about his package - it finds itself in dangerous territory trying to compete in real time with those whose brand expectations are entirely different."

According to the bio on his site, Jeff is associate professor and director of the interactive journalism program at the City University of New York’s new Graduate School of Journalism. Well, I have news for you, Jeff: if you think that any news organisation can get it perfectly right and polished, you are deluded, and yesterday provided the perfect example.

This is what the editor of the curated and edited and checked and polished New York Times had to say in the early hours of June 15th:

Leader Emerges With Stronger Hand

Published: June 15, 2009
President Mahmoud Ahmadinejad’s victory demonstrated that he is the shrewd front man for an elite more unified than at any time since 1979.

Whilst you would now have to subscribe to read the article, I can tell you that it proclaimed the election a done deal and the middle classes in Iran resigned to their fate. Just how wrong can you get it?

So here is the news for you, Jeff, and for all of those journalists who wish to believe that they are a special breed gifted with superhuman insight and the ability to distil a story for general public consumption. The Twitter community and its successors will beat you to the story every time. Furthermore, they will be the people who are the experts in the field and, shock horror, they may even be able to string a sentence or two together. Then the story will be out there, and if it gains a following it will spread like wildfire.

This does not mean that journalists are an endangered species, just that they are going to be changing the way they operate in the future. As with any source of information there will be the good, the bad and the disingenuous - these will need to be checked and validated.

Following on from that, there will be a need to draw in comment from other domain experts who are not necessarily directly involved in the main proceedings. For example, David Miliband, the UK Foreign Secretary, was interviewed on BBC Radio 4 on the morning of 16th June for the UK government's view on the events in Iran. This could not have been done via social networking systems on the Internet.

A news organisation and the journalists working for it can act as the ringmaster in an ever-changing circus of events: constantly watching the crowd to see the news as it unfolds, vetting the shouts from the audience to allow those who have something valuable to say to step into the ring, and at the same time bringing in those who can be invited directly to the ring from outside. They can then step in and lead that conversation, rather in the manner of an enormous audience participation show. That, I believe, is the future of journalism.

Saturday 13 June 2009

Saving Private Twitters - a Blue Peter Approach

Twitter clients are pretty simple beasts and do not handle a basic need - saving your timeline of status updates. Most will fetch up to the latest 200, since this is the maximum that a single API call will allow. The API does allow for paging - i.e. getting them in groups of 200, back to a maximum of 3,200 - but I don't think that anyone bothers with this.
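If you did want to bother, a minimal sketch of the paging might look like this. It assumes the v1-era count and page query parameters; the helper name pagedTimelineUrls is my own invention, just for illustration:

```php
<?php
// Build the paged request URLs for walking back through a timeline.
// The v1-era API accepted count (max 200) and page, and 16 pages of
// 200 is the 3,200-tweet ceiling mentioned above.
function pagedTimelineUrls($base, $pages = 16, $count = 200)
{
    $urls = array();
    for ($page = 1; $page <= $pages; $page++) {
        $urls[] = "$base/statuses/friends_timeline.xml?count=$count&page=$page";
    }
    return $urls;
}

// Each URL would then be fetched and parsed exactly as the
// single-page script below does for one request.
```

You would stop early as soon as a page comes back with no status elements in it.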

To a certain extent this is not a problem, especially in the way that Twitter is currently being used; however, it does not take a massive leap of imagination to realise that saving the dang things might just be useful. As part of a much larger project I am working on, I have the ability to save tweets, and I realised the other day that others may well want to be able to do the same thing.

So here is a Blue Peter style approach - take a server with PHP and MySQL and stick one together. The code is a stripped down version of its bigger brother.

It goes something like this:
  1. Create a connection object for Twitter
  2. Connect and get the last 200 updates in your friends timeline
  3. Load the returned element into an XML object
  4. Iterate through the XML elements and insert them to a database
Then you create a cron job on your server to run the script every few minutes - I do it every 2 minutes - and your tweets are saved.
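For reference, a two-minute crontab entry might look like the following - the script path and log file are placeholders for wherever you put the PHP file:

```shell
# Run the tweet saver every 2 minutes, appending output to a log
*/2 * * * * /usr/bin/php /path/to/savetweets.php >> /var/log/savetweets.log 2>&1
```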

The first thing you will need to do is set up a database table. I use one called UserSavedTweets with the following structure:

CREATE TABLE UserSavedTweets (
  UserID int(11) NOT NULL,
  ServerID int(11) NOT NULL,
  UserHandle varchar(250) NOT NULL,
  TweetID bigint(11) NOT NULL,
  TweetText varchar(250) NOT NULL,
  TweetName varchar(250) NOT NULL,
  TweetScreenName varchar(250) NOT NULL,
  TweetPrfImgUrl varchar(250) NOT NULL,
  TweetCreated datetime NOT NULL,
  TweetURL varchar(250) NOT NULL,
  PRIMARY KEY (UserID, ServerID, UserHandle, TweetID)
);

A couple of things to note here. The first is that this table is designed to handle multiple users on multiple Twitter-type servers, such as Laconi.ca based servers, so the primary key is a compound of the various IDs I use in other tables together with the TweetID.

As for the TweetID itself, you will notice that this is a bigint, which more than happily takes care of the potential Twitpocalypse issues - at least for the next couple of months anyhow.

Now we need a method to go and get the tweets. I found a Twitter connection object from David Grudl, twitter.class.php, and used it with some alterations.

Here is the current code, which you can save as twitter.class.php:


<?php

/**
 * Twitter for PHP - library for sending messages to Twitter and receiving status updates.
 *
 * @author David Grudl
 * @copyright Copyright (c) 2008 David Grudl
 * @license New BSD License
 * @link http://phpfashion.com/
 * @version 1.0
 */
class Twitter
{
    /** @var int */
    public static $cacheExpire = 1800; // 30 min

    /** @var string */
    public static $cacheDir;

    /** @var string user name */
    private $user;

    /** @var string password */
    private $pass;

    /** @var int tweet retrieval param */
    private $retrieve_count;

    /** @var string the server base address for the API */
    private $serverapi;


    /**
     * Creates object using your credentials.
     * @param string user name
     * @param string password
     * @throws Exception
     */
    public function __construct($user, $pass)
    {
        if (!extension_loaded('curl')) {
            throw new Exception('PHP extension CURL is not loaded.');
        }

        $this->user = $user;
        $this->pass = $pass;
        $this->retrieve_count = 20;
        $this->serverapi = 'http://twitter.com';
    }

    public function set_retrieve_count($count)
    {
        $this->retrieve_count = $count;
    }

    public function set_serverapi($serverapi)
    {
        $this->serverapi = $serverapi;
    }


    /**
     * Tests if user credentials are valid.
     * @return boolean
     * @throws Exception
     */
    public function authenticate()
    {
        $xml = $this->httpRequest('http://twitter.com/account/verify_credentials.xml');
        return (bool) $xml;
    }


    /**
     * Sends message to Twitter.
     * @param string message encoded in UTF-8
     * @return mixed ID on success or FALSE on failure
     */
    public function send($message)
    {
        $xml = $this->httpRequest(
            'https://twitter.com/statuses/update.xml',
            array('status' => $message)
        );
        return $xml && $xml->id ? (string) $xml->id : FALSE;
    }


    /**
     * Returns the most recent statuses posted by you and your friends (optionally).
     * Amended to allow more status returns - max is 200.
     * @param bool with friends?
     * @return SimpleXMLElement
     * @throws Exception
     */
    public function load($withFriends)
    {
        $line = $withFriends ? 'friends_timeline' : 'user_timeline';

        $xml = $this->httpRequest("$this->serverapi/statuses/$line/$this->user.xml?count=$this->retrieve_count", FALSE);

        if (!$xml || !$xml->status) {
            throw new Exception('Cannot load channel.');
        }
        return $xml;
    }


    /**
     * Process HTTP request.
     * @param string URL
     * @param array of post data (or FALSE = cached GET)
     * @return SimpleXMLElement|FALSE
     */
    private function httpRequest($url, $post = NULL)
    {
        if ($post === FALSE && self::$cacheDir) {
            $cacheFile = self::$cacheDir . '/twitter.' . md5($url) . '.xml';
            if (@filemtime($cacheFile) + self::$cacheExpire > time()) {
                return new SimpleXMLElement(@file_get_contents($cacheFile));
            }
        }

        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, $url);
        curl_setopt($curl, CURLOPT_USERPWD, "$this->user:$this->pass");
        curl_setopt($curl, CURLOPT_HEADER, FALSE);
        curl_setopt($curl, CURLOPT_TIMEOUT, 20);
        curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
        curl_setopt($curl, CURLOPT_HTTPHEADER, array('Expect:'));

        if ($post) {
            curl_setopt($curl, CURLOPT_POST, TRUE);
            curl_setopt($curl, CURLOPT_POSTFIELDS, $post);
        }
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE); // no echo, just return result
        $result = curl_exec($curl);
        $ok = curl_errno($curl) === 0 && curl_getinfo($curl, CURLINFO_HTTP_CODE) === 200; // code 200 is required

        if (!$ok) {
            // fall back to a cached copy if we have one
            if (isset($cacheFile)) {
                $result = @file_get_contents($cacheFile);
                if (is_string($result)) {
                    return new SimpleXMLElement($result);
                }
            }
            return FALSE;
        }

        if (isset($cacheFile)) {
            file_put_contents($cacheFile, $result);
        }

        return new SimpleXMLElement($result);
    }
}
?>

So now we need the script that uses this to get the latest posts and save them to your table.

<?php
require_once 'twitter.class.php';

// Create a Twitter account object - change the first param to your
// username and the second to your password
$twitter = new Twitter('twitteraccountname', 'xxxxxxxxxx');
$twitter->set_retrieve_count(200);
$twitter->set_serverapi('http://twitter.com');

$withFriends = TRUE;
$channel = $twitter->load($withFriends);

include 'dblib.php';

$link = openDB();

// Hard coded elements - normally picked up from current user and server usage
$UserID = 1;
$ServerID = 1;
$UserHandle = 'twitteraccountname';

foreach ($channel->status as $status) {
    // Escape every string field before it goes anywhere near the SQL
    // to prevent injection attacks
    $TweetID = (string) $status->id;
    $TweetText = mysql_real_escape_string(stripslashes((string) $status->text));
    $TweetName = mysql_real_escape_string((string) $status->user->name);
    $TweetScreenName = mysql_real_escape_string((string) $status->user->screen_name);
    $TweetPrflImgUrl = mysql_real_escape_string((string) $status->user->profile_image_url);
    $TweetCreated = date('Y-m-d H:i:s', strtotime((string) $status->created_at));
    $TweetURL = mysql_real_escape_string((string) $status->user->url); // the user's homepage URL

    $query = "INSERT INTO UserSavedTweets VALUES ($UserID, $ServerID, '$UserHandle', $TweetID, '$TweetText', '$TweetName', '$TweetScreenName', '$TweetPrflImgUrl', '$TweetCreated', '$TweetURL')";

    $result = mysql_query($query);

    if ($result == TRUE) {
        echo "Inserted $TweetText\n";
    } else {
        echo "No insertion: $query\n";
    }
}
?>

And that's it for the script. Then set up the cron job on the server and it is up and running.

Obviously you will also need a way to read the saved tweets back out, but that is pretty straightforward standard PHP and MySQL stuff.
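For completeness, here is a minimal sketch of that read side, in the same mysql_* style as the save script. The recentTweetsQuery() helper name is my own, not part of dblib.php, and the column names match the CREATE TABLE above:

```php
<?php
// Build a SELECT for the most recent saved tweets for one
// user/server pair, newest first.
function recentTweetsQuery($userID, $serverID, $limit = 50)
{
    return "SELECT TweetScreenName, TweetText, TweetCreated"
         . " FROM UserSavedTweets"
         . " WHERE UserID = $userID AND ServerID = $serverID"
         . " ORDER BY TweetCreated DESC LIMIT $limit";
}

// Then, with the connection from dblib.php's openDB():
// $result = mysql_query(recentTweetsQuery(1, 1));
// while ($row = mysql_fetch_assoc($result)) {
//     echo $row['TweetCreated'] . ' ' . $row['TweetScreenName']
//        . ': ' . $row['TweetText'] . "\n";
// }
```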