Last weekend I posted a chart of Porter Novelli Twitter folk and their followers. If you read it, you’ll recall that I was dissatisfied with what it implied about the collective reach of Porter Novelli twitterers.
Well, thanks to a long-ish train journey to Bolton and back, I was able to fudge together a little Perl script that looks through the data and removes every instance of a follower except the first. Let’s make that a little clearer. Let’s say that we’re looking at three Twitter people: Alice, Bob, and Carol. The first thing to do is to see who follows them:
| alice | bob | carol |
|---|---|---|
| bob carol dave xerxes yasmine zeus | alice carol edward william xerxes yasmine zeus | alice bob frank william xerxes |
Now we need to rank them in order of “who has the most followers” (also known as “popularity” as it happens). Here I’ve done that from left to right. Bob has the most followers and Carol the fewest.
| bob | alice | carol |
|---|---|---|
| alice carol edward william xerxes yasmine zeus | bob carol dave xerxes yasmine zeus | alice bob frank william xerxes |
And finally we go through from left to right removing all followers who have already shown up on someone else’s list.
| bob | alice | carol |
|---|---|---|
| alice carol edward william xerxes yasmine zeus | bob dave | frank |
Bob, being at the top of the list, gets to keep all his followers, which may seem unfair. But it’s not unfair if the question we’re trying to answer is “how do I reach as many people as possible by speaking to as few people as possible?” That is, I’m looking for reach. (Marketing people often express themselves in terms of “reach”, the number of people who are exposed to a message, and “frequency”, the number of times the average person is exposed to that message.)
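To make those two terms concrete, here is a minimal Python sketch (not the Perl script mentioned in this post) that works out reach and frequency for the Alice, Bob, and Carol example. It assumes, purely for illustration, that each of the three posts the message once and that every follower sees it. The function name `reach_and_frequency` is made up for this example.

```python
# Toy data from the Alice/Bob/Carol example: account -> set of followers.
followers = {
    "alice": {"bob", "carol", "dave", "xerxes", "yasmine", "zeus"},
    "bob": {"alice", "carol", "edward", "william", "xerxes", "yasmine", "zeus"},
    "carol": {"alice", "bob", "frank", "william", "xerxes"},
}


def reach_and_frequency(followers):
    """Return (reach, frequency) if every account posts the message once.

    Assumption for illustration: every follower sees every post.
    """
    # Total exposures: one per (account, follower) pair.
    exposures = sum(len(names) for names in followers.values())
    # Reach: distinct people exposed at least once.
    audience = set().union(*followers.values())
    reach = len(audience)
    # Frequency: average exposures per person reached.
    frequency = exposures / reach if reach else 0.0
    return reach, frequency


reach, frequency = reach_and_frequency(followers)
print(f"reach={reach}, frequency={frequency:.1f}")  # reach=10, frequency=1.8
```

Adding up the three follower lists gives 18 exposures, but they land on only 10 distinct people, which is why a raw follower count overstates reach.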
Looking at the example above, we can see that Alice really delivers an incremental benefit of two new people, and Carol only reaches one new person. That gives us a much better idea of how valuable the most popular person (Bob) really is.
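The rank-and-dedupe steps can be sketched in a few lines of Python. This is an illustration of the process described above under stated assumptions, not the original Perl script: the function name `incremental_reach` is invented here, and breaking ties alphabetically is my own choice (the post doesn't say how ties are handled).

```python
# Toy data from the Alice/Bob/Carol example: account -> set of followers.
followers = {
    "alice": {"bob", "carol", "dave", "xerxes", "yasmine", "zeus"},
    "bob": {"alice", "carol", "edward", "william", "xerxes", "yasmine", "zeus"},
    "carol": {"alice", "bob", "frank", "william", "xerxes"},
}


def incremental_reach(followers):
    """Rank accounts by follower count, then keep only first-seen followers."""
    # Most followers first; ties broken alphabetically (an assumption)
    # so that the output is repeatable.
    ranked = sorted(followers, key=lambda name: (-len(followers[name]), name))
    seen = set()
    result = []
    for name in ranked:
        # Followers not already counted for a more popular account.
        new = followers[name] - seen
        result.append((name, sorted(new)))
        seen |= new
    return result


for name, new in incremental_reach(followers):
    print(name, len(new), " ".join(new))
# bob 7 alice carol edward william xerxes yasmine zeus
# alice 2 bob dave
# carol 1 frank
```

Using sets keeps the "have we seen this follower before?" check cheap, which matters more for a couple of hundred accounts than for three.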
Applying this to the Porter Novelli data set
Clearly it would be extraordinarily boring to perform the process described above for the 205 people in the Porter Novelli data set that I want to analyse. But the analysis script that I wrote (with plenty of help from the perl monks) goes through exactly these steps. It’s a pretty straightforward job, ranking and deduping. Here’s what we get.
This makes much more sense than the last run. According to the Pareto principle, roughly 80% of the effects should come from 20% of the causes. Here we see that 20% of the Porter Novelli Twitter users (marked in black) account for slightly more than 80% of the reach (marked in red). It’s pretty much a textbook example. Things are as they should be, I suppose.
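If you want to run the same 80/20 check on your own numbers, something like the following Python sketch would do it. It takes the deduplicated (incremental) follower counts in rank order and reports what share of the total comes from the top slice of accounts. The function name `share_of_reach` and the rounding rule for "top 20%" are assumptions of mine, and the numbers in the example are the toy Alice/Bob/Carol counts, not the Porter Novelli data.

```python
def share_of_reach(incremental_counts, top_fraction=0.2):
    """Share of total deduplicated reach delivered by the top accounts.

    `incremental_counts` must already be in rank order (most popular first).
    The number of "top" accounts is rounded to the nearest whole account,
    with a minimum of one (an assumption made for this sketch).
    """
    total = sum(incremental_counts)
    if total == 0:
        return 0.0
    top_n = max(1, round(len(incremental_counts) * top_fraction))
    return sum(incremental_counts[:top_n]) / total


# Toy counts after deduplication: Bob 7, Alice 2, Carol 1.
print(share_of_reach([7, 2, 1]))  # 0.7
```

With only three accounts the "top 20%" is just Bob, who delivers 7 of the 10 people reached. A real data set needs far more accounts before the 80/20 comparison means much.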
More to the point, we can now assign appropriate value to coverage at the head of the graph. This is of great value when thinking about our media planning and engagement.
By the way — if you’d like a copy of either the Twitter follower API query engine (it’s a well-behaved command-line thing that was developed by the excellent Joachim Larsen) or the slightly shonky perl script that I wrote on the train, you have only to ask: I’ll be pleased to share. Send me a tweet at @mediaczar and I’ll send you the scripts.


Thanks for the interesting analyses, breaking down one (large) company like this. Makes me think back to when I interviewed with them about a year ago…they should’ve grabbed me while they had the chance!! Ha ha.
Thanks for the support, Mark. Good to meet you, and look forward to talking next week.