If you’re going to follow one Twitter person…
February 14th, 2009
Can I please suggest that, if you’re looking for fresh new tweeple to follow, you kindly consider @BriggySmalls?
Thank you for your attention. That is all.
Posted in Uncategorized | 1 Comment »
Can we calculate party affiliation? (the US Congress Edition)
February 13th, 2009
Using nothing more than their public twitter relationships, is it possible to predict whether a US Congressperson is a Republican or a Democrat? The answer seems to be a guarded “yes” — our tools predict correctly 40/46 times (or around 87% of the cases.)
This post follows on from a post earlier today in which I asked, “can we calculate party affiliation?” The data set in the earlier post was gathered from the 16 members of the UK parliament who are on Twitter and the relationships between them.
Tweetcongress maintains a list of US congresspeople on Twitter. Today (February 13, 2009) there are 76 congresspeople on the service, but when I collected my data set of “who follows who” on February 3, 2009 there were only 65. Of these 65, fully 19 (29%) lived a life of noble isolation with regard to the network: none of their peers linked to them, and they in turn linked to none of their peers. Removing these Miss Havishams from the data set leaves me with 46 twittering congresspeople who form a network.
Now as both social network analysis and Aesop would have it, “a man is known by the company he keeps.” What I mean by this is that given the partisan nature of politics, we should expect that Democrats will link to other Democrat twitterers more often than they link to Republican twitterers, and vice versa. So that’s what NetDraw[1], the software I’m using for most of this stuff, looks for, or more accurately:
To identify factions, NetDraw software iteratively searches for a distribution of nodes among a selected number of factions to minimise the number of connections between factions and to maximize the number of connections within factions.
Whatever. So I let NetDraw loose on the data, and here’s what it did.
I coloured the nodes red for Republican and blue for Democrats[2], labeled the nodes by party (for the sake of clarity, and for the hard-of-thinking, that’s “R” for Republican and “D” for Democrat), then counted all the nodes where the label said one thing but the colour another. There were six of these nodes, so NetDraw got the answer right 40 times out of 46 (just about 87%). This is less than the astonishing 93.75% accuracy we got with the Westminster twittering members of parliament in the previous post. Nevertheless I think we can safely say that it’s not a particularly integrated (or bipartisan) network if we can predict party affiliation with quite such success.
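For anyone who wants to see the faction idea in miniature, here is a rough Python sketch. It is not NetDraw's method: NetDraw searches iteratively, whereas this simply tries every possible split of a tiny, made-up network into two factions and keeps the one with the fewest "misplaced" ties (ties that cross factions, plus ties that are missing inside a faction). The six nodes and their ties are hypothetical, the ties are treated as undirected for simplicity, and exhaustive search like this is only practical for toy networks.

```python
from itertools import combinations, product


def faction_cost(assignment, ties):
    """Count ties between factions plus missing ties within factions."""
    cost = 0
    for u, v in combinations(sorted(assignment), 2):
        tied = frozenset((u, v)) in ties
        same_faction = assignment[u] == assignment[v]
        # A tie across factions, or no tie inside a faction, counts against us.
        if tied != same_faction:
            cost += 1
    return cost


def best_two_factions(nodes, ties):
    """Try every two-way split (toy networks only) and keep the cheapest."""
    nodes = sorted(nodes)
    first, rest = nodes[0], nodes[1:]
    best, best_cost = None, None
    # Pin the first node to faction 0 so mirror-image splits are not repeated.
    for labels in product((0, 1), repeat=len(rest)):
        assignment = {first: 0, **dict(zip(rest, labels))}
        cost = faction_cost(assignment, ties)
        if best_cost is None or cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost


# Hypothetical network: two tight triangles joined by a single bridge (c-d).
pairs = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f"),
         ("c", "d")]
ties = {frozenset(p) for p in pairs}
factions, cost = best_two_factions("abcdef", ties)
print(factions, cost)
```

The faction numbers themselves mean nothing; they only say which nodes belong together. Matching them to party labels, and counting the mismatches, is still a job for the human with the colouring pencils.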
Here’s exactly the same map with the errant sheep re-labeled with their proper names so it’ll be easier to refer to them (if it helps, you can click on the image to view or download a larger version.)
You’ll see, I hope, that NetDraw has made a pretty good fist of the job. Where it has gone wrong, on the whole, is where the data clearly suggests something else. So Rep. Jared Polis (D), for instance, follows (and is followed by) no Democrat peers. Rep. Nancy Pelosi (D) and Sen. Richard Durbin (D) follow each other, but since Pelosi is followed by several Republicans and none of her other Democrat peers, you can see why the algorithm has made the incorrect guess that the two of them are Republicans. Long-serving member Neil Abercrombie, as discussed in a previous post on US Congress Twitter folk, forms a bit of a bridge between the two parties, so despite his membership of the Congressional Progressive Caucus and liberal voting record, from the Twitter network point of view, his affiliation is somewhat ambiguous.
Sen. McCain follows none of his peers, and appears to inherit his incorrect attribution from Sen. Susan Collins. For the life of me, I can’t work out what makes it think that Sen. Susan Collins is a Democrat. She really isn’t, you know.
Note 1: NetDraw is a free program written by Steve Borgatti from the University of Kentucky. If you’re interested in playing around with this stuff, you’ll need to get yourself a copy.
Note 2: Actually, that’s not true. Despite a friend sharing the simple mnemonic that “‘Republicans’ and ‘red’ begin with the same letter,” I just can’t get it out of my English head that the Republicans should be blue and the Democrats red. As a result I waste precious minutes re-colouring these maps in Illustrator. It is worth pointing out that I also have problems with “left” and “right” on occasion — preferring instead the binary opposition “left” and “No! no! The other left, for God’s sake!”
Tags: congress, democrat, gop, jared polis, john mccain, mapping, nancy pelosi, Neil Abercrombie, network analysis, networks, republican, research, richard durbin, susan collins, twitter
Posted in networks, research, twitter | 1 Comment »
Creating blog seed lists for research
February 10th, 2009
Colleagues and regular readers will know that we’ve been working on an “online influencer mapping” tool called Rufus. Those of you who’ve had a chance to use Rufus will know that it requires a seed list of URLs to get started. Creating this seed list can be automated in one or two ways, but one of the fastest, most effective, and most sensible ways to build a seed list is still to do it by hand.
We’ve got one or two other processes that also require us to build a seed list. No doubt other people do too; lots of web research is quite data hungry. So, because I’ve found myself telling a few of my Porter Novelli colleagues how we go about the process, I thought I’d share it here, in the interests of:
- having somewhere to point people in future,
- general good-heartedness: I’ve learned a lot from people in the past, and I like to give stuff back, and
- getting feedback and tips from people about how they might go about the same process.
Oh – and while these methods should work in any language, please bear in mind that I tend to think and work in English. I’d appreciate feedback on how best to localize these methods.
Building a seed list: 5 easy methods
With all these methods, there’s no substitute for checking out the blog. I don’t ask people to read the blog (that comes at a different stage of the process altogether) but you should at least click through and see what you’re dealing with. In fact, method 3 rather relies on you visiting the blogs you’re researching.
1. Look for someone who has already done your research for you
Start by being optimistic. Generally you’ll find that someone else has created a list of the “top ten” (or however many) blogs in the niche that interests you. Take a look at Brendan’s regularly updated PR Friendly Index for example. If you’re searching for English language blogs then you could do worse than start by looking at Guy Kawasaki’s Alltop. But simply Googling for lists of blogs or blog charts should get you a long way.
This is generally a source of fairly high-quality data. Things to watch out for, though, are search engine spamming link farms and shady “Make Money Online” (MMO) directories. You’ll learn to recognize these soon enough, but as long as you’re visiting all the blogs you’re putting on your seed list you should be alright.
2. Do a tag search on delicious
I picked up this technique from Anthony Mayfield, who showed me that by searching on the delicious social bookmarking site for the tags “xxx” and “cool” and/or “inspiration” you could find sites about “xxx” that people thought were cool. Knowing what your digital trendsetters think is cool is one hell of an insight.
For our purposes though, we’re looking for cool blogs. So (1) click the “Explore Tags” tab on the home page, and then (2) type your keyword and the word “blog” into the search box. Couldn’t be simpler?
Well, actually it could be simpler. You can query the delicious database directly by typing a URL into the address bar of your browser like this:
http://delicious.com/tag/blog+keyword
Where “keyword” is the word you’re looking for.
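If you have a whole list of keywords to check, the URL pattern above is easy to generate. Here is a small Python sketch; the function name `delicious_tag_url` is my own invention, and it assumes nothing about delicious beyond the `tag/blog+keyword` pattern shown above.

```python
from urllib.parse import quote


def delicious_tag_url(keyword, base="http://delicious.com/tag/"):
    """Build a tag-search URL for "blog" plus one keyword."""
    # Percent-encode each tag so spaces and punctuation survive in a URL.
    tags = [quote(tag, safe="") for tag in ("blog", keyword)]
    return base + "+".join(tags)


for word in ["widgets", "knitting"]:
    print(delicious_tag_url(word))
```

Paste the printed URLs into your browser and work through the results as described.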
When you get the results, check the ones that (a) have the right kind of title (if you’re looking for French blogs, look for French titles for example), (b) have the right kind of tags and description and (c) have been bookmarked most often.
If there’s a better local language social bookmarking site, I’d use that whenever possible. For example, Mister Wong is a good one for German language sites.
A quick note: social news sites like Digg and Reddit, and “serendipity browsers” like StumbleUpon tend not to work so well in my experience.
This method also owes a lot to Marshall Kirkpatrick. You might like to try out the Yahoo! Pipe that I built based on the process that Marshall documents.
3. Look for blog rolls
On every blog you visit during the research process, look for the blog roll, and check the likely-looking links. See if they’re useful or useless. Quite often you’ll find that someone who has an interest in widgets will also read and link to blogs that cover widgets. That, after all, is the principle on which Rufus works, writ small. So we reckon it’s a pretty good approach.
4. Ask your Twitter followers
Seriously, this works. Well, it worked for me and my team from around 100 followers onwards. I’d be interested in others’ experience.
5. Call someone
Get hold of someone who knows about the subject and phone them up or get them on IM. Category experts are an excellent source of low-volume but high-quality information. It’s time consuming, but can work well if you have the right contacts. Journalist friends might be a great source of blog lists.
I’ve purposefully left this one till last; I think it’s a good rule of thumb to do your desk research before picking up the phone. That way you can ask intelligent questions instead of damn fool ones.
Using a text editor
I try to keep two lists running all the time that I’m working: a scratchpad list of blogs I have yet to visit and the seed list itself. Because I’m on a Mac, I use the excellent BBEdit (there’s a free version called TextWrangler which will be just as good for most people). If, as is more probable, you’re on a Windows machine, you might like to try the very powerful but slightly less pretty Notepad++. If you just want to use Excel, though, that’s fine too.
Tags: bbedit, bloggers, blogs, delicious, digg, notepad++, reddit, research, seed list, textwrangler
Posted in research | 4 Comments »
Republicans vs. Democrats: Pareto charts of unduplicated Twitter reach
February 8th, 2009
A couple of days ago I did a little more analysis on Republican and Democratic Congresspeople on Twitter.
Towards the end of the post, I realized that the unduplicated reach pareto chart that I’d built would only make sense if the US were a one-party state (or to be fair, if both parties had a single issue that they were united in wanting to promote.)
So — wanting to make this a little more representative — I went back and produced two charts; one showing Republican unduplicated reach (which follows a typical 80:20 distribution)…
Tags: congress
Posted in twitter | 3 Comments »
Republicans still outperforming Democrats on TweetCongress
February 4th, 2009
Three weeks ago (and at the prompting of my colleague Eddie Garrett who heads up Porter Novelli DC’s digital team) I mapped out the interconnections between US Congress Tweeters. We’d been working on a Twitter crawler and it seemed like a good opportunity to test things out on a new data set.
This is a follow-up post. Once again it was prompted by a third party: Christie Findlay at Politics Magazine asked whether it would be OK to print a copy of one of the maps in their March edition. I’ve heard that three weeks are a long time in politics, so I thought I’d better run the crawl again just in case. Also I’ve got a new crawler that uses the proper Twitter API (I can see some of your eyes glazing over you know. Just skip ahead when that happens.) I’d tried it out on the Porter Novelli data set, but welcomed a chance to try it on something more meaty.
So yesterday morning before work I ran the crawl. I use the excellent Tweet Congress as my source of information about which congress people are on Twitter.
Read the rest of this entry »
Tags: congress, mapping, network analysis, twitter, visualization
Posted in networks, twitter | 9 Comments »
Pareto Novelli — Some Q&As
February 1st, 2009
A recent post about some Pareto analysis of the Porter Novelli Twitter sample, “Porter Novelli Twitter folk – the 80/20 rule”, stirred up a little bit of interest on Twitter, and made me think again about what I’m doing and why. Partly because those conversations were off-blog (and I’d like to capture the answers I gave somewhere more permanent) and partly because I’ve now had time to think of better answers, I thought I’d set them down here.
First, a little background. This Q&A is the sixth post in an impromptu series about the Twitter people where I work (Porter Novelli, the international public relations agency.) By now you might think that I’d be tired of this stuff, but you’d have another think coming. Here’s a quick list to bring you up to date.
- Map of Porter Novelli people on Twitter on 17th Jan 2008
- Map of Porter Novelli people on Twitter on 20th Jan 2008
- Introducing the Porter Novelli magic Twitter friend maker (beta)
- Porter Novelli Twitter folk ranked by number of followers
- Porter Novelli Twitter folk – the 80/20 rule
Looking at this, you might also think I clearly had nothing better to do than analyze Porter Novelli people and their Twittering ways. In fact, as an experimental data set, I couldn’t really ask for anything much better. It’s sufficiently large (more than 200 people), international (I’ve counted more than 10 countries — and I’m sure there are more), and I have some real-world access to all of the people in the sample, which means I can compare my findings with some hard data.
That said, the experiment is more about learning about how we can analyze Twitter networks — about discovering how representative they are as a word-of-mouth (WOM) channel for example, and what they can tell us about other kinds of social network, or about finding new ways to analyze such data sets — than it is about answering any specific questions. So I’ve not got any carefully mapped-out research plan. Instead I follow paths that strike me as interesting, or possible, or that are suggested to me by friends and readers.
Question 1
Tags: pareto, porter novelli, twitter
Posted in porter novelli, twitter | 7 Comments »
Poll: which new favicon?
January 31st, 2009
For those of you who are wondering what a favicon is, here’s a quick explanation.
Some websites have a little picture that is displayed in your browser’s address bar next to the URL. If you look at your bookmarks list, you’ll probably see a whole collection of these. These are favicons (pronounced fav-eyecons.) The intention behind them is partly ambient branding, but mostly improved usability — your eye will spot the icon in a list of browser bookmarks (called “favorites” by Internet Explorer – hence “favicon”) much faster than it will a string of text.
Victoria was so irritated by my homemade favicon (she tells me that she “cannot keep looking at your head cropped that way…”) that she has just sent me two new ones. I’ve installed one of them, but can’t be sure I’ve chosen the better of the two. So please take a look at the following and tell me what you think.
Sunday February 1, 2009: It looks as though the vote is going the way of the initial. I’ll keep the poll open for the rest of the week and make any appropriate changes, but for now I’ve switched to the “m”.
Posted in poll | 5 Comments »
Porter Novelli Twitter folk – the 80/20 rule
January 29th, 2009
Last weekend I posted a chart of Porter Novelli Twitter folk and their followers. If you read it, you’ll recall that I was dissatisfied by what it implied about the collective reach of Porter Novelli twitterers.
Well, thanks to a long-ish train journey to Bolton and back, I was able to fudge a little Perl script together that looks through the data and removes everything other than the first instance of each follower. Let’s make that a little clearer. Let’s say that we’re looking at three Twitter people, Alice, Bob, and Carol. The first thing to do is to see who follows them:
| alice | bob | carol |
| bob, carol, dave, xerxes, yasmine, zeus | alice, carol, edward, william, xerxes, yasmine, zeus | alice, bob, frank, william, xerxes |
Now we need to rank them in order of “who has the most followers” (also known as “popularity” as it happens). Here I’ve done that from left to right. Bob has the most followers and Carol the fewest.
| bob | alice | carol |
| alice, carol, edward, william, xerxes, yasmine, zeus | bob, carol, dave, xerxes, yasmine, zeus | alice, bob, frank, william, xerxes |
And finally we go through from left to right removing all followers who have already shown up on someone else’s list.
| bob | alice | carol |
| alice, carol, edward, william, xerxes, yasmine, zeus | bob, dave | frank |
Bob, being at the top of the list, gets to keep all his followers, which may seem unfair. But it’s not unfair if the question we’re trying to answer is “how do I reach as many people as possible by speaking to as few people as possible?” That is, I’m looking for reach (marketing people often express themselves in terms of “reach”, the number of people who are exposed to a message, and “frequency”, the number of times the average person is exposed to that message).
Looking at the example above, we can see that Alice really delivers an incremental benefit of two new people, and Carol only reaches one new person. That gives us a much better idea of how valuable the most popular person (Bob) really is.
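The rank-and-dedupe steps above are short enough to sketch in a few lines of Python. This is not the Perl script from the train, just a minimal illustration of the same idea using the Alice, Bob and Carol example; the function name `unduplicated_reach` is my own.

```python
def unduplicated_reach(followers):
    """Rank people by follower count, then keep only first-seen followers."""
    # Most followers first; ties keep their original order.
    ranked = sorted(followers, key=lambda name: len(followers[name]), reverse=True)
    seen = set()
    result = []
    for name in ranked:
        new_followers = []
        for follower in followers[name]:
            # Keep a follower only the first time they show up on any list.
            if follower not in seen:
                seen.add(follower)
                new_followers.append(follower)
        result.append((name, new_followers))
    return result


followers = {
    "alice": ["bob", "carol", "dave", "xerxes", "yasmine", "zeus"],
    "bob": ["alice", "carol", "edward", "william", "xerxes", "yasmine", "zeus"],
    "carol": ["alice", "bob", "frank", "william", "xerxes"],
}

for name, new_followers in unduplicated_reach(followers):
    print(name, len(new_followers), new_followers)
```

Run against the example, this keeps all seven of Bob's followers, two new people for Alice (bob and dave) and one for Carol (frank), which matches the last table.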
Applying this to the Porter Novelli data set
Clearly it would be extraordinarily boring to perform the process described above for the 205 people in the Porter Novelli data set that I want to analyse. But the analysis script that I wrote (with plenty of help from the perl monks) goes through exactly these steps. It’s a pretty straightforward job, ranking and deduping. Here’s what we get.
This makes much more sense than the last run. According to the Pareto principle, roughly 80% of the effects should come from 20% of the causes. Here we see that 20% of the Porter Novelli Twitter users (marked in black) account for slightly more than 80% of the reach (marked in red.) It’s pretty much a text-book example. Things are as they should be, I suppose.
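To check a claim like "20% of the users account for 80% of the reach" against your own numbers, something like the following Python sketch would do. It assumes you already have the incremental (deduplicated) reach for each person, in ranked order and with at least one non-zero count; the function names are mine, and the numbers are the Bob, Alice and Carol counts from the worked example rather than the real Porter Novelli data.

```python
import math
from itertools import accumulate


def cumulative_share(counts):
    """Running share of total reach, in ranked order (the Pareto curve)."""
    total = sum(counts)
    return [running / total for running in accumulate(counts)]


def head_share(counts, fraction=0.2):
    """Share of total reach delivered by the top `fraction` of people."""
    head = math.ceil(len(counts) * fraction)
    return sum(counts[:head]) / sum(counts)


# Incremental reach for Bob, Alice and Carol from the worked example.
counts = [7, 2, 1]
print(cumulative_share(counts))
print(head_share(counts))
```

With only three people the "top 20%" rounds up to just Bob, so the toy example is a sanity check rather than a real Pareto test.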
More to the point, we can now assign appropriate value to coverage at the head of the graph. This is of great value when thinking about our media planning and engagement.
By the way — if you’d like a copy of either the Twitter follower API query engine (it’s a well-behaved command-line thing that was developed by the excellent Joachim Larsen) or the slightly shonky perl script that I wrote on the train, you have only to ask: I’ll be pleased to share. Send me a tweet at @mediaczar and I’ll send you the scripts.
Posted in porter novelli, twitter | 5 Comments »
5 straightforward ways to integrate your communications activities
January 29th, 2009
Using digital channels in tight association with others helps get the highest value from campaigns. All too often, though, integration is at best an afterthought and at worst ignored.
This is the triangle I draw when I’m trying to explain how to integrate digital comms into a client’s other activities. It provides one way of thinking about the challenges and opportunities that face us, and can stimulate better ideas.
In the interests of keeping it short, this post is going to be pretty theoretical. In future posts I’ll cover some practical case studies and refer back to this post. Think of this as laying the groundwork.
Here — in brief review — is some of what we know about the three corners.
Read the rest of this entry »
Tags: comms planning, digital marketing, integration, web marketing
Posted in opinion | 4 Comments »










