  1. Google launches Google Correlate, a new tool to support search trend analysis

    Posted May 25, 2011 in research  |  1 Comment so far

    Yesterday I wrote about this Twitter-based hedge fund, and connected it to the broader area of large-scale online analytics being used to anticipate real-world events. And today Google has announced a new tool, Google Correlate, which has been built to do just that.

    When I was dabbling in this area with search data and unemployment statistics I was using Google Insights, which made the process pretty long-winded – it produced a lot of messy data which only became useful after a few hours of macro-writing in Excel. So it was encouraging to read, in Google’s official post about Correlate, that:

    [T]ools… such as Google Trends or Google Insights for Search weren’t designed with this type of research in mind. Those systems allow you to enter a search term and see the trend; but researchers told us they want to enter the trend of some real world activity and see which search terms best match that trend… This is now possible with Google Correlate, which we’re launching today on Google Labs.

    I’m looking forward to giving Google Correlate a try. From what I’ve read, it seems like it still only represents the tip of a very big iceberg, a glimpse through a keyhole into a big world of data that only Google is allowed to explore. Hopefully I’m wrong and it does go deeper than that, though. I’ll post more about it when I’ve had a chance to look around…
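
    To make the "enter a trend, get back matching search terms" idea concrete, here is a minimal sketch that ranks candidate search-term series by how closely they track a target series. It uses plain Pearson correlation as an assumed similarity measure and is not a description of how Google Correlate works internally. The function names, the search terms and all of the numbers are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series has no trend to match
    return cov / sqrt(var_x * var_y)


def rank_terms(target, candidates):
    """Return (term, r) pairs, best match with the target series first."""
    scored = [(term, pearson(target, series)) for term, series in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: a real-world series and some search-volume series.
unemployment = [5.2, 5.3, 5.5, 6.0, 6.7, 7.2]
search_volumes = {
    "jobseekers allowance": [40, 42, 47, 60, 75, 88],
    "cheap holidays": [80, 78, 79, 70, 66, 60],
    "penguin pool": [10, 12, 9, 11, 10, 12],
}

for term, r in rank_terms(unemployment, search_volumes):
    print(f"{term}: {r:+.2f}")
```

    With these made-up numbers the first term comes out on top and "cheap holidays" is negatively correlated. A high score only says the two series move together over the sampled period, and says nothing about why.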


  2. A hedge fund based on Twitter may not be as stupid as it sounds

    Posted May 24, 2011 in comment  |  No Comments so far

    Using online analytics and social media trends to predict real-world events is nothing new. Twitter’s been used to predict box-office sales (story link, detailed paper) and Google search data has been telling us about future flu epidemics for a while now.

    Even I got in on the act, demonstrating back in 2009 that Google Insights could anticipate changes in UK unemployment figures.

    Financial difficulties searches versus unemployment, until April 2009

    UK unemployment rate charted against search volumes for 24 related keywords, from January 2004 to April 2009 Sources: Office for National Statistics, Google Insights

    Maybe I should have followed through with that idea, because there’s now a hedge fund that bases its investment decisions on data from Twitter. It’s called Derwent Capital Markets, it opened for business last week, and if its managers end up making a mint there might well be a new bandwagon in town.

    So how do you run a hedge fund based on tweets? From what I understand of Derwent’s methodology, their algorithms measure the “calmness” of the Twittersphere – presumably based on sentiment analysis, which I’m a bit skeptical about. This is used to estimate the volatility of the Dow Jones Industrial Average index, with a three-day time lag.

    This leaves a lot of unanswered questions. Does a non-calm day of Twitter conversations always correspond to a drop in the DJIA, or just volatility? Are they trying to predict metrics like trade volume and so on as well as broader day-to-day movements in the overall index? And are they ranking Twitter users based on credibility, or are spam bots equal to financial journalists, economists, and prominent investors?

    Obviously algorithmic hedge funds aren’t about to disclose their inner workings, so questions like these will have to remain unanswered for now. But what of the other, larger, question – isn’t the whole idea just, well, a bit… silly?

    I can see why people might react in this way, and even I feel a bit skeptical about something that describes itself as a “social media-based hedge fund” yet apparently pulls data only from Twitter, when there are lots of other sources that could be tapped. But it would be wrong to dismiss the basic concept.

    Our everyday activities – web searches, page views, purchases, things we say on open social networks – leave a trail of data behind, which we tend to see as ephemeral or throwaway. We severely underestimate the value of this data but Google doesn’t, Facebook doesn’t, and we shouldn’t either. This data becomes even more valuable when aggregated across entire countries, continents, or the planet as a whole. In fact, it could be argued that the predictive potential of aggregated global real-time data has yet to be fully imagined, let alone realised.

    The biggest problem with this resource is that we don’t really know how to exploit it yet. Things like Google Flu Trends or this Twitter-based hedge fund may be crude and experimental, and will definitely look even more so in five years’ time. Along the way there will be hype, bandwagonism, maybe even a stock market bubble, resulting from the application of real-time data to real-world problems.

    But we need to make a start somewhere, and as silly as a Twitter-based hedge fund might sound, it’s as good a place to begin as any.


  3. How recruiters are posing a threat to LinkedIn even though they don’t mean to

    Posted May 17, 2011 in comment, social media  |  4 Comments so far

    One of LinkedIn’s strengths is its “how you’re connected” feature, which shows how you’re linked to second degree contacts. Seeing who you have in common with someone helps you understand who they are, what they’re like, and whether it’s worth getting to know them. It’s often more informative than the blurbs people write about themselves.

    LinkedIn's "how you're connected" feature

    "Any friend of Joe's is a friend of mine"

    But this LinkedIn feature is becoming less useful due to an insidious form of network pollution. Like coastal erosion, this network pollution is a slow process that’s barely noticeable from one day to the next, but could be hugely damaging in the longer term. And I think I know who’s responsible for this network pollution – recruiters.

    Before I continue, I should say that this isn’t an anti-recruiter rant. Recruiters may be responsible for this network pollution, but the blame lies with LinkedIn, and I’ll talk more about this later. Building a big contact list is essential to a recruiter’s job and they can’t be expected not to do this. But this is what’s weakening the value of LinkedIn’s “how you’re connected” feature, and quite possibly its network as a whole.

    If you’re a LinkedIn user, you’re not just a person – you’re a “node”, which is a fancy way of saying that you can connect people to one another. If one of your contacts finds another one of your contacts on LinkedIn, you will be the node that connects them. And as a connecting node, your usefulness comes from the quality of your relationships with those two individuals. If the person searching knows that you’re picky about who you connect with (which you clearly are – only highly discerning people read this blog, after all), your connection to that person is itself a notable endorsement.

    Network diagram based on The Wire

    If you were Marlo, you'd probably be more interested in people you knew through Prop Joe than through McNulty

    Not every “node” on LinkedIn is as discerning and useful as you are, though. Some nodes are far more promiscuous, connecting to lots of people they’ve never met, let alone worked with, and the more promiscuous someone is the less useful they become as a LinkedIn node. This is where recruiters come in. They hoover up connections, which means that you often find your second degree contacts are connected to you through recruiters. But as connecting nodes, the recruiters aren’t all that useful because they’re not very choosy about who they connect with.

    Bubbles causes network pollution

    Bubbles pollutes Marlo's network because he knows so many people. Now everyone's a second degree contact

    OK, maybe I’m stretching the analogy by comparing Bubbles to a recruiter, so I’ll drop it now. The general principle is that, if you’re connected to more than a couple of recruiters, searching LinkedIn will turn up more and more people who are second degree contacts but whom you only know through recruiters. The value of someone being a second degree contact slowly declines, because when a recruiter is the common contact you learn nothing more meaningful than that you both once looked for a job, or once tried to hire people.

    It’s like sharing a mild dislike of rain – common ground, yes, but not very meaningful. This is what I mean by “network pollution”. The value or interestingness of the network is dropping because of recruiters and other “super-nodes” who are turning nearly everybody into your second degree contacts.

    LinkedIn isn’t the only service susceptible to this kind of network pollution. Twitter will sometimes recommend another user to you because you have a “follow” in common. And if that “follow” is, say, your best friend, that’s good grounds for a recommendation. But if the common follow is Stephen Fry, Barack Obama, or any other celebrity account with millions of followers, that’s pretty useless. If Last.fm recommended someone to you because you both listened to the Beatles, that would be pretty useless too (which is why music recommendation algorithms are hard to get right). All social networks have to deal with problems like this where “super-nodes” undermine the value of recommendations based on shared connections.
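
    One standard way to stop super-nodes from dominating is to weight each shared connection by how selective it is. The sketch below uses the Adamic-Adar index from the link-prediction literature, where a mutual contact counts for 1 / log(number of contacts they have). This is a swapped-in textbook technique, not something LinkedIn or Twitter is known to use, and the little network is invented ("contact A", "contact B" and Bubbles’s 50 extra contacts are placeholders).

```python
from math import log


def build_graph(edges):
    """Build an undirected graph as a dict mapping each person to a set of contacts."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def mutual_count(graph, a, b):
    """Naive score: how many contacts a and b have in common."""
    return len(graph[a] & graph[b])


def shared_contact_score(graph, a, b):
    """Adamic-Adar index: each mutual contact contributes 1 / log(its degree).

    A mutual contact is linked to both a and b, so its degree is at least 2
    and the logarithm is always positive.
    """
    return sum(1 / log(len(graph[z])) for z in graph[a] & graph[b])


# Hypothetical network: Prop Joe is picky, Bubbles knows everybody.
edges = [
    ("Marlo", "Prop Joe"), ("Prop Joe", "contact A"),
    ("Marlo", "Bubbles"), ("Bubbles", "contact B"),
] + [("Bubbles", f"other {i}") for i in range(50)]

graph = build_graph(edges)

for person in ("contact A", "contact B"):
    print(person,
          mutual_count(graph, "Marlo", person),
          round(shared_contact_score(graph, "Marlo", person), 2))
```

    Both people share exactly one contact with Marlo, so a plain count can’t tell them apart, but the weighted score ranks the link through Prop Joe well above the link through Bubbles.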

    So as I said earlier, this is a LinkedIn issue and not the fault of recruiters who are simply trying to do their jobs. Recruiters will continue to add connections, other people will continue to accept them, and the usefulness of “how you’re connected” will continue to drop. It’s not a very serious problem right now, but LinkedIn needs to think of how it can design for this aspect of its social graph, which is something it seems to take pretty seriously – and rightly so.


  4. The Penguin Pool at London Zoo – I liked it more than the penguins did

    Posted May 8, 2011 in Diary, Photos  |  No Comments so far

    Last weekend I went to London Zoo for the first time. The thing I liked most – apart from the animals obviously – was the Penguin Pool.

    Penguin Pool outside photo

    You can tell from the typeface that it's going to be good

    The Penguin Pool was created in the 1930s by Berthold Lubetkin and Ove Arup. It’s a masterpiece of modernist architecture, but the penguins don’t live there any more. They were evicted in 2004 amid concerns that waddling around on reinforced concrete was hurting their joints.

    Photo of inside the Penguin Pool

    To be fair it doesn't look like an ideal penguin habitat

    I was transfixed by the Penguin Pool. The intensity of light, the curved white space, the bold double helix in the centre: I didn’t know what to do with the space, but I had a strong urge to go in there and use it somehow. Obviously the penguins didn’t feel the same way. I guess me and penguins don’t see eye to eye on everything after all.

    Another shot inside the Penguin Pool

    It's not easy to burrow in concrete

    JG Ballard’s visions of broken suburban landscapes being reappropriated by nature came to mind when I gazed into the Penguin Pool. Crystal-shelled armadillos crawling along the floors of long-empty swimming pools, that sort of thing.

    Sometimes architecture serves a purpose, sometimes it doesn’t. Like the brutalist Elephant House, another listed structure at London Zoo that no longer houses its original tenants, the Penguin Pool failed to accommodate the needs of penguins just as Le Corbusier’s grand aesthetic failed to address the problems of human cities.

    But this doesn’t detract from the beauty and impact these works can retain. For me, the Penguin Pool’s only failing is that the creatures it was really designed for just haven’t been invented yet.


  5. How to make page titles work properly in Now Reading Reloaded

    Posted May 7, 2011 in How-to  |  9 Comments so far

    I use a plugin called Now Reading Reloaded for the Library section of this site, and very good it is too.

    Unfortunately the developer behind the plugin doesn’t have time to maintain it any more. This means that some things have started to break in newer versions of WordPress, particularly the way the plugin works with page titles. If you’re using Now Reading Reloaded and the title of a book isn’t showing up in your <title> tag, this post should help you fix it.

    The problem is with the file now-reading.php, which you’ll find in the plugin folder. Open this file in a text editor and find the function called nr_page_title(). It starts with the following line:

    function nr_page_title( $title ) {

    This is the function that tells WordPress what to put in the page title, and if you’re having the problem I was having this is where the blame probably lies. To solve the problem you’ll need to replace this function with the one that appears below.

    function nr_page_title( $title ) {
        global $wp, $wp_query, $wpdb;
        // Parse the current request before reading the plugin's query vars.
        $wp->parse_request();

        $title = '';

        // Library index page.
        if ( get_query_var('now_reading_library') )
            $title = 'Library';

        // Tag page.
        if ( get_query_var('now_reading_tag') )
            $title = 'Books tagged with “' . htmlentities(get_query_var('now_reading_tag'), ENT_QUOTES, 'UTF-8') . '”';

        // Author page: look up the display name from the URL-friendly one.
        if ( get_query_var('now_reading_author') ) {
            $author = $wpdb->escape(urldecode(get_query_var('now_reading_author')));
            $author = $wpdb->get_var("SELECT b_author FROM {$wpdb->prefix}now_reading WHERE b_nice_author = '$author'");
            $title = 'Books by ' . $author;
        }

        // Search results page.
        if ( get_query_var('now_reading_search') )
            $title = 'Library Search';

        // Single book page: look up the book from its URL-friendly title.
        if ( get_query_var('now_reading_title') ) {
            $esc_nice_title = $wpdb->escape(urldecode(get_query_var('now_reading_title')));
            $book = get_book($wpdb->get_var("SELECT b_id FROM {$wpdb->prefix}now_reading WHERE b_nice_title = '$esc_nice_title'"));
            $title = $book->title . ' by ' . $book->author;
        }

        // Return a title prefix only on Now Reading pages.
        if ( !empty($title) ) {
            $title = apply_filters('now_reading_page_title', $title);
            $separator = apply_filters('now_reading_page_title_separator', ' » ');
            return $title.$separator;
        }
        return '';
    }

    Copy this function and paste it into now-reading.php. To be on the safe side, take a backup of now-reading.php before you do this. Make sure you paste over the entire function – no more, no less.

    If all goes well your page titles should now work if you’re using WordPress 3.0 or above (and if you aren’t, you need to update WordPress right now). If you have any questions, let me know in the comments.

    Note: I was only able to find this fix by reviewing the code of Now Watching, an actively maintained fork of Now Reading Reloaded which handles movies instead of books. The author of Now Watching is Zack Ajmal and he deserves more credit than I do for this fix!


  6. Job Vacancy: Head of Dilemmas

    Posted May 2, 2011 in ephemera  |  2 Comments so far

    Leading online portal brelson.com is currently recruiting a Head of Dilemmas.

    The Head of Dilemmas is responsible for resolving all problems where two or more possible solutions exist, but none are practically acceptable. A proven track record in horn resolution, rock/hard-place avoidance and strategic rumination is essential, while experience of handling trilemmas and paradoxes is advantageous but not necessary.

    As the Head of Dilemmas you will report to the Director of Indecision and will work closely alongside the Head of Quandaries as well as smaller teams focusing on Riddles, Pickles, Stumpers and Crises.

    Cartoon of Head of Dilemmas

    Yet another winning presentation from the Head of Dilemmas to the brelson.com board. Could this be you in the hot seat?

    The successful applicant will be able to demonstrate a wide range of dilemma-handling techniques from formal logical analysis to hand-wringing and procrastination. Along with your application please submit a short (100 words) commentary on how you would apply these techniques to one of the following classical dilemmas:

    This is an important hire for us due to the high volume of mission-critical dilemmas faced by our organisation. We look forward to hearing from you.