Twitter: there really is *a lot* of spam

We’ve been working on our noise filters at Viewsflow, hunting down ‘bad’ Twitter accounts that look good. This afternoon’s exercise left me rather surprised at how bot-rich Twitter is. Looking at a couple of hundred accounts which were regularly sharing high-quality links, we found these (approximate) proportions:

Now, I figured there were a lot of bots on Twitter, but what is now clear is just how many there are, reposting content from the feeds of various sites. Broadly speaking, there appeared to be three different types:

  1. Republishing feeds, which slice and dice other feeds in, presumably, interesting ways. These make up a tiny minority; one might assume there is a segment of the audience interested in them.
  2. Out-and-out megaphones: Twitter accounts which pull from reliable sources, like the WSJ or FT RSS feed, and post meticulously on the hour.
  3. ‘Social media experts’: you can tell them by their Twitter backgrounds (with the phone numbers on the left), and they appear to be part of some mutual back-linking exercise. Their output is 90% retweets, some smartly retweeting what looks like a Google Alerts feed, and 10% ‘thanks for following’.
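
To make the last two types a little more concrete, here is a rough sketch in Python of how one might flag them from an account's recent tweets. This is not our actual Viewsflow filter: the `Tweet` record, the function names and the 0.8 thresholds are all made up for illustration, and a real filter would need far more signals than these two.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Tweet:
    # Hypothetical record; field names are assumptions, not a Twitter API shape.
    text: str
    posted_at: datetime
    is_retweet: bool = False


def retweet_share(tweets):
    """Fraction of tweets that are retweets (0.0 for an empty list)."""
    if not tweets:
        return 0.0
    return sum(1 for t in tweets if t.is_retweet) / len(tweets)


def on_the_hour_share(tweets, tolerance_minutes=2):
    """Fraction of tweets posted within a few minutes of the top of the hour."""
    if not tweets:
        return 0.0
    hits = 0
    for t in tweets:
        minute = t.posted_at.minute
        # Distance to the nearest hour boundary, so 10:59 counts as 1 minute away.
        if min(minute, 60 - minute) <= tolerance_minutes:
            hits += 1
    return hits / len(tweets)


def guess_account_type(tweets, retweet_threshold=0.8, hourly_threshold=0.8):
    """Crude guess at the account type; thresholds are illustrative guesses."""
    if not tweets:
        return "unknown"
    if on_the_hour_share(tweets) >= hourly_threshold:
        return "megaphone"
    if retweet_share(tweets) >= retweet_threshold:
        return "retweeter"
    return "unclassified"
```

An account whose tweets nearly all land on the hour would come back as `"megaphone"`, and one that is overwhelmingly retweets as `"retweeter"`. Anything else stays `"unclassified"`, which is the honest answer for most accounts.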

Twitter is spotting rogue accounts. A couple of accounts (not shown in the pie chart) had been suspended by Twitter.
The most popular Twitter clients for the bot accounts were twitterfeed and Hootsuite.

At one level I might ask, ‘Who am I to judge whether a Twitter stream is worthy of publishing or not?’ But even without making that judgement, this raises a few interesting observations.

  1. The cost of creating links in the social graph is so low that these links may be decreasingly valuable, in the same way that link-farms found ways to attack the core value of PageRank. Acquiring followers is really easy. Creating Twitter bots that echo what you say is easy. And little cost is imposed on those who do. Many of the services we might use to evaluate the impact of a piece of content (like Tweetmeme or Topsy) return gross popularity measures, rather than a qualitative or contextual assessment.
  2. The cost of creating messages is also incredibly low–and the noise is public, which is why the search for Twitter search is hotting up.
  3. The low cost of creating lists is making lists very noisy, as well.
  4. Twitter link-farms must be spreading rapidly. The cost of creating a Twitter account is low (heading towards 48 cents) and can be amortised over a lifetime of pump-and-dump scams.
  5. How much electricity goes into the frequent repeating of noisy content across tens of millions of accounts?

When we’ve done a more robust analysis, we’ll share what we find in due course.
