1. Battling the weaponised drones of popular opinion

    Posted November 3, 2014 in comment, politics, twitter  |  No Comments so far

    Kevin at Strange Attractor on what the tactics of Gamergaters tell us about the future of online political discourse.

    When you look at the techniques being used by some of these groups, you quickly get a sense of how the next partisan political scorched earth campaign will be fought. Sockpuppets will become the weaponised drones of popular opinion, amplifying marginal views so that they swamp mainstream opinion.

    The article’s broadly pessimistic tone resonates with me, but I feel more positive when thinking about the countermeasures that could protect against these tactics. The Eliza chatbot set up to engage these sock puppets is just one example, acting more as chaff—attracting attackers and wasting their resources—than as an offensive weapon.

    If ideas like that proliferate and evolve we might just escape a future where we have to abandon the web to swarms of hate-spewing bots.


  2. Using Google Spreadsheets to extract Twitter data

    Posted November 20, 2009 in How-to, twitter  |  28 Comments so far

    Last weekend I was looking for ways to extract Twitter search data in a structured, easily manageable format. The two APIs I was using (Twitter Search and Backtweets) were giving good results – but as a non-developer I couldn’t do much with the raw data they returned. Instead, I needed to get the data into a format like CSV or XLS.

    Some extensive googling led me to this extremely useful post on Labnol, where I learnt about how to use the ImportXML function in Google Spreadsheets. Before too long I’d cracked my problem. In this post I’m going to explain how you can do it too.

    Data you can extract from Twitter

    This walkthrough will teach you how to extract two types of Twitter data using Google Spreadsheets – tweets and links.

    Tweets are extracted using the Twitter Search API in conjunction with ImportFeed. This allows Twitter search results to be extracted into a spreadsheet format.

    Links are extracted using the Backtweets API in conjunction with ImportXML. The Backtweets API allows you to find any links posted on Twitter even if they’ve been shortened using services like bit.ly or tinyurl.

    I’m in a hurry, can I just do this right now?

    If you just want to do it – instead of learn how to do it – just open this Google spreadsheet I’ve created.  You’ll need to make your own local copy so you can edit it. Instructions can be found in the spreadsheet itself.

    How to extract tweets containing links

    The instructions below will help you create a Google Spreadsheet that pulls in and displays the time, username and text of all tweets containing links to a specified page. Because it uses Backtweets, these tweets will be retrieved even if they used shortened URLs from services like bit.ly or tinyurl.

    1. Create a new spreadsheet in Google Documents.
    2. Enter column labels in this order: “Search criteria”, “Timestamp”, “Username” and “Tweet text” in cells A1 to D1.
    3. In cell B2, underneath Timestamp, insert the following formula:
    4. =ImportXML("http://backtweets.com/search.xml?itemsperpage=100&since_id=1255588696&key=key&q="&A2,"//tweet_created_at")
    5. In cell C2, underneath Username, insert the following formula:
      =ImportXML("http://backtweets.com/search.xml?itemsperpage=100&since_id=1255588696&key=key&q="&A2,"//tweet_from_user")
    6. In cell D2, underneath Tweet Text, insert the following formula:
      =ImportXML("http://backtweets.com/search.xml?itemsperpage=100&since_id=1255588696&key=key&q="&A2,"//tweet_text")
    7. Now paste a search query into cell A2 – say, http://www.google.com. After a few seconds, you should see columns B, C and D fill up with tweets, looking something like the image below:
    8. Google Spreadsheet showing Backtweets results

    9. The formulas pasted into cells B2, C2 and D2 all reference the URL in cell A2. This means that whenever you paste anything new into A2, the search results should refresh.
    10. Also, you can paste parts of URLs into A2 – not just entire ones. This is useful for seeing all links to a specific directory on your site, for example.

    Finally, this tool can only extract 100 results at a time – but it is possible to set it up to retrieve more than that. Look at my sample Google Spreadsheet if you want to do this.

    Extracting tweets from Twitter search results

    The method for doing this is identical to the above, but uses the ImportFeed function instead of ImportXML.

    1. Create a new spreadsheet in Google Documents.
    2. Enter column labels in this order: “Search criteria”, “Timestamp”, “Username” and “Tweet text”. For the rest of this walkthrough, I’m going to assume that these labels are in cells A1 to D1, but in reality you can put them wherever you like
    3. In cell B2, underneath Timestamp, insert the following formula:
      =ImportFeed("http://search.twitter.com/search.atom?rpp=20&page=1&q="&A2, "items created")
    4. In cell C2, underneath Username, insert the following formula:
      =ImportFeed("http://search.twitter.com/search.atom?rpp=20&page=1&q="&A2, "items author")
    5. In cell D2, underneath Tweet Text, insert the following formula:
      =ImportFeed("http://search.twitter.com/search.atom?rpp=20&page=1&q="&A2, "items title")

    6. Type a search query into cell A2 - say, "Hoth." Hit enter and the results will load. It should look something like this:
    7. Google Spreadsheets with data from Twitter searchThings will go wrong if you insert characters like # or @ into the search query. To get around this, type %23 instead of # and %40 instead of @. This will allow you to search for hash tags and usernames.

    I haven't been successful in generating more than 20 search results per request, but you can get around this using the page number parameter in the ImportFeed query string. See my own Google spreadsheet to find out how to do this.

    I hope these instructions are useful - if you have any comments, questions or feedback, please let me know in the comments.