Bot or Not?
Turing had the Imitation Game, Philip K. Dick had the Voight-Kampff Test, and spammers gave us the CAPTCHA. The Truthy project at Indiana University has BotOrNot, which is supposed to distinguish real people on Twitter from automated accounts, ‘bots’, using analysis of their language, their social networks, and their retweeting behaviour. BotOrNot seems to sort of work, but not as well as you might expect.
@NZquake, a very obvious bot that tweets earthquake information from GeoNet, is rated at an 18% chance of being a bot. Siouxsie Wiles, for whom there is pretty strong evidence of existence as a real person, has a 29% chance of being a bot. I’ve got a 37% chance, the same as @fly_papers, which is a bot that tweets the titles of research papers about fruit flies, and slightly higher than @statschat, the bot that tweets StatsChat post links, or @redscarebot, which replies to tweets that include ‘communist’ or ‘socialist’. Other people at a similar probability include Winston Peters, Metiria Turei, and Nicola Gaston (President of the NZ Association of Scientists).
PicPedant, the Twitter account of the tireless Paulo Ordoveza, who debunks fake photos and provides origins for uncredited ones, rates at 44% bot probability, but obviously isn’t. Ben Atkinson, a Canadian economist and StatsChat reader, has a 51% probability, and our only Prime Minister (or his twitterwallah), @johnkeypm, has a 60% probability.
Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.
Funny, my papers bot @phydyn_papers has a 33% bot probability while my own account, @phydyn, is 64% bot.
10 years ago
Verily it is written “Go away or I will replace you with a very small shell script.”
10 years ago
I looked into this a little while ago, and at the time thought having the word “bot” in the account name seemed to be a much better indicator of a true positive than any content analysis I could manage. Posting frequency patterns also seemed useful with the data I was looking at.
10 years ago