<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>mediaczar &#187; blogger typology</title>
	<atom:link href="http://mediaczar.com/blog/category/research/blogger-typology/feed/" rel="self" type="application/rss+xml" />
	<link>http://mediaczar.com/blog</link>
	<description>a blog by mat morrison</description>
	<lastBuildDate>Mon, 22 Mar 2010 12:12:27 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>The Technorati Authority Yahoo! Pipe</title>
		<link>http://mediaczar.com/blog/2009/01/technorati-authority-yahoo-pipe/</link>
		<comments>http://mediaczar.com/blog/2009/01/technorati-authority-yahoo-pipe/#comments</comments>
		<pubDate>Wed, 07 Jan 2009 19:52:33 +0000</pubDate>
		<dc:creator>Mat Morrison</dc:creator>
				<category><![CDATA[blogger typology]]></category>
		<category><![CDATA[hack]]></category>
		<category><![CDATA[how to]]></category>
		<category><![CDATA[pipes]]></category>
		<category><![CDATA[google docs]]></category>
		<category><![CDATA[google spreadsheet]]></category>
		<category><![CDATA[technorati]]></category>
		<category><![CDATA[yahoo]]></category>

		<guid isPermaLink="false">http://mediaczar.com/blog/?p=319</guid>
		<description><![CDATA[Over the holidays, I started playing with a new Yahoo! pipe to pull information from Technorati into a spreadsheet. The reasons why I wanted to do this are covered in this post about the quantitative analysis of blogs, and my eventual perl-based solution to the problem is covered in this post. The problem with the [...]]]></description>
			<content:encoded><![CDATA[
<div id="attachment_365" class="wp-caption alignnone" style="width: 394px"><a href="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipe.png"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipe.png" alt="Yahoo! Pipe to pull Technorati API data for multiple blogs" title="Yahoo! Pipe to pull Technorati API data for multiple blogs" width="384" height="526" class="size-full wp-image-365" /></a><p class="wp-caption-text">Yahoo! Pipe to pull Technorati API data for multiple blogs</p></div><br />
<p>Over the holidays, I started playing with a new Yahoo! pipe to pull information from Technorati into a spreadsheet. The reasons why I wanted to do this are covered in this post about the <a href="http://mediaczar.com/blog/2009/01/blogger-typology-quantitative-analysis-step-1/">quantitative analysis of blogs</a>, and my eventual perl-based solution to the problem is covered in <a href="http://mediaczar.com/blog/2009/01/a-simple-perl-script-to-interrogate-the-technorati-api/comment-page-1/#comment-528">this post</a>.</p>
<p>The problem with the perl-based approach is that it&#8217;s a little inaccessible to people who aren&#8217;t comfortable using a command line environment. So I really wanted to make something that more people would feel comfortable using, and perhaps play around with.</p>
<p>So, with some help and kind words from <a href="http://www.semdevel.com">Bob Briski</a>, one of whose pipes I&#8217;d stumbled across and <a href="http://delicious.com/mediaczar/pipes">bookmarked</a> during my research for this project, I decided to finish off the pipe and publish it so that others could use it, or (better still) improve upon it.<br />
<span id="more-319"></span></p>
<h3>What does it do?</h3>
<p>The pipe first pulls a list of blog URLs from a Google spreadsheet, then checks Technorati to get the &#8220;inbound blogs&#8221; count for each blog, then outputs the blog title, link and count to an RSS feed. </p>
<p>Technorati Authority (a useful metric when ranking a list of blogs) is calculated as &#8220;the unique blogs linking to a blog over the past six months&#8221; &#8212; in a way it&#8217;s a measure of how many readers may be influenced by a blogger <em>at one remove</em>.</p>
<h3>Before you start</h3>
<p>You&#8217;ll need:</p>
<ol>
<li>An API key from Technorati</li>
<li>A Yahoo! account</li>
<li>A Google account</li>
</ol>
<p>I&#8217;ve done all the rest, so you could simply clone my pipe, which <a href="http://pipes.yahoo.com/mediaczar/technorati_api_google_spreadsheet ">you&#8217;ll find here</a>. </p>
<h3>Publishing a Google spreadsheet as a CSV source</h3>
<p>In my first pass at this challenge, I used a simple text list of URLs stored on this server. That&#8217;s pretty easy if you have (a) a server, (b) a text editor, and (c) some way &#8212; like ftp or ssh &#8212; of getting the two to talk to each other. But lots of people don&#8217;t have servers these days, and still more probably have no idea what I mean by (c), so I decided to follow a new approach before writing this article. <em>Everyone</em>, I thought, must have access to Google Docs.</p>
<p>So here&#8217;s my test list as published on Google:<br />
<iframe width='500' height='150' frameborder='0' src='http://spreadsheets.google.com/pub?key=p4QDp5UmTKxTf0FacFsewJw&#038;output=html&#038;gid=0&#038;single=true&#038;range=A1:A7'></iframe></p>
<p>Now all I have to do is <strong>Share > Publish as a web page</strong>, then click <strong>More publishing options</strong> and select &#8216;CSV&#8217; or &#8216;TXT&#8217; (1) as the publishing option to create a simple text list.</p>
<p><div id="attachment_328" class="wp-caption alignnone" style="width: 494px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/publishing-google-spreadsheet-as-a-csv.jpg" alt="publishing google spreadsheet as a csv" title="publishing google spreadsheet as a csv" width="484" height="643" class="size-full wp-image-328" /><p class="wp-caption-text">publishing google spreadsheet as a csv</p></div>
<p>You can <a href="http://spreadsheets.google.com/pub?key=p4QDp5UmTKxTf0FacFsewJw&#038;output=csv">see the CSV file here</a>. You need to copy (or make a note of) the link (2) because we use it as the seed for our pipe.</p>
<p>One of the nice things about using Google Docs is that it&#8217;s easy to edit, view and share the list. You may not want to share your list of blogs, though; if you&#8217;d prefer the original, more &#8216;private&#8217; version that reads a server-based file, <a href="http://pipes.yahoo.com/mediaczar/technoratiapiauthorityquery">it&#8217;s still available</a>.</p>
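<p>If you&#8217;d like to see what the pipe receives from that link, here is a minimal Python sketch of turning the CSV text into a list of URLs. It assumes one URL per row in the first column; the function name <code>parse_blog_urls</code> and the sample text are made up for illustration, and nothing here fetches the real spreadsheet.</p>

```python
import csv
import io


def parse_blog_urls(csv_text):
    """Return the first column of each non-empty row as a list of blog URLs."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row[0].strip() for row in reader if row and row[0].strip()]


# Illustrative sample standing in for the published sheet (not the real list).
sample = "http://mediaczar.com/blog\nhttp://example.com/blog\n\n"
blog_urls = parse_blog_urls(sample)
print(blog_urls)
```

<p>Blank rows are skipped, so a trailing empty line in the sheet should not produce an empty request later on.</p>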
<h3>Setting up the Pipe</h3>
<p>The first thing to do is to build a URL for each request I&#8217;m going to send to the API. Technorati takes requests in the following format:<br />
<code>http://api.technorati.com/bloginfo?key=[apikey]&#038;url=[blog url]</code></p>
<div id="attachment_322" class="wp-caption alignnone" style="width: 645px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes-set-up1.jpg" alt="setting up the pipe" title="setting up the pipe" width="635" height="511" class="size-full wp-image-322" /><p class="wp-caption-text">setting up the pipe</p></div>
<p>The pipe takes two variables that will need to be customized for your own version. One, rather obviously, is the link URL for the Google CSV that you copied or made a note of above (you <em>did</em> make a note of it, didn&#8217;t you?). The other is your <a href="http://technorati.com/developers/apikey.html">Technorati API key</a> for which you&#8217;ll probably need to <a href="http://technorati.com/developers/apikey.html">sign up</a>. It&#8217;s free, but is limited to 500 calls to Technorati&#8217;s database per day.</p>
<p>I&#8217;ve plugged the link into the <strong>Fetch CSV</strong> module at top left, and the API key into the <strong>Private String</strong> module. I use this module so that I can protect my API key &#8212; no-one else should be able to see it now.</p>
<p>The request URLs themselves are built by the <strong>URL Builder</strong> module, placed within a <strong>Loop</strong> module so that it runs once for each row of the CSV. I&#8217;ve hooked the API key from the <strong>Private String</strong> into the parameter named &#8216;key&#8217;, and the output of the <strong>Fetch CSV</strong> module into the &#8216;url&#8217; parameter.</p>
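<p>The same URL-building step can be sketched in Python. This is only an illustration of the request format quoted above: <code>build_request_url</code> is a hypothetical helper, <code>YOUR_API_KEY</code> is a placeholder, and the code only builds the string without calling Technorati.</p>

```python
from urllib.parse import urlencode

# Endpoint taken from the request format quoted in the post.
BLOGINFO_ENDPOINT = "http://api.technorati.com/bloginfo"


def build_request_url(api_key, blog_url):
    """Build one bloginfo request, percent-encoding both parameters."""
    query = urlencode({"key": api_key, "url": blog_url})
    return BLOGINFO_ENDPOINT + "?" + query


# "YOUR_API_KEY" is a placeholder, not a working key.
request_url = build_request_url("YOUR_API_KEY", "http://mediaczar.com/blog")
print(request_url)
```

<p>Encoding the blog URL matters because it contains characters (the colon and slashes) that would otherwise be ambiguous inside a query string.</p>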
<h3>Getting the data from Technorati</h3>
<div id="attachment_323" class="wp-caption alignnone" style="width: 463px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes_-get-xml1.jpg" alt="getting the XML by calling technorati&#039;s API" title="getting the XML by calling technorati&#039;s API" width="453" height="291" class="size-full wp-image-323" /><p class="wp-caption-text">getting the XML by calling technorati's API</p></div>
<p>Now it&#8217;s simply a matter of looping through the requests that I built during the set-up stage above, and sending each one to Technorati so that I can fetch the data. Technorati responds with an XML file that looks like this:</p>
<pre>[code lang="xml"]
<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Technorati API version 1.0" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN"
"http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
    <result>
        <url>http://www.mediaczar.com/blog</url>
             <weblog>
                <name>Mediaczar</name>
                <url>http://mediaczar.com/blog</url>
                <rssurl></rssurl>
                <atomurl>http://feeds.feedburner.com/mediaczar/posts</atomurl>
                <inboundblogs>14</inboundblogs>
                <inboundlinks>21</inboundlinks>
                <lastupdate>2009-01-06 12:45:23 GMT</lastupdate>
                <rank>422833</rank>
                <authors>
                    <author>
                        <username>mediaczar</username>
                        <name>Mat Morrison</name>
                        <description>Mat Morrison is a digital marketing and
                        communications strategist with over a decade's
                        experience in online advertising, eCRM, and social
                        media.</description>
                        <url>http://technorati.com/people/technorati/mediaczar</url>
                        <photourl>http://static.technorati.com/progimages/photo.jpg?uid=139758</photourl>
                    </author>
                </authors>
             </weblog>
                <inboundblogs>14</inboundblogs>
                <inboundlinks>21</inboundlinks>
    </result>
</document>
</tapi>
[/code]
</pre>
<p>I can choose which bit of the XML file I want to receive by setting the &#8216;path to item list&#8217; parameter in the <strong>Fetch Data</strong> module. You&#8217;ll see that I&#8217;ve set it to &#8216;document.result&#8217; &#8212; compare that to the XML example above and I think you&#8217;ll see how this works: I&#8217;ve removed the wrapper information from the file and gone straight to the meat. </p>
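<p>For comparison, here is a rough Python equivalent of that &#8216;document.result&#8217; path, using a trimmed copy of the response shown above. The helper name <code>extract_result</code> is an assumption for illustration; the element names come straight from the sample.</p>

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample response shown in the post.
SAMPLE = """<tapi version="1.0">
<document>
    <result>
        <url>http://www.mediaczar.com/blog</url>
        <weblog>
            <name>Mediaczar</name>
            <url>http://mediaczar.com/blog</url>
            <inboundblogs>14</inboundblogs>
            <inboundlinks>21</inboundlinks>
        </weblog>
        <inboundblogs>14</inboundblogs>
        <inboundlinks>21</inboundlinks>
    </result>
</document>
</tapi>"""


def extract_result(xml_text):
    """Skip the wrapper elements and pull the useful fields out of document/result."""
    root = ET.fromstring(xml_text)          # root is the <tapi> element
    result = root.find("document/result")   # same idea as the 'document.result' path
    return {
        "name": result.findtext("weblog/name"),
        "url": result.findtext("weblog/url"),
        "inboundblogs": int(result.findtext("inboundblogs")),
    }


blog = extract_result(SAMPLE)
print(blog)
```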
<h3>Renaming the fields to suit RSS output</h3>
<p><div id="attachment_326" class="wp-caption alignnone" style="width: 657px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes_-rename-and-output.jpg" alt="renaming the fields to suit RSS output" title="renaming the fields to suit RSS output" width="647" height="398" class="size-full wp-image-326" /><p class="wp-caption-text">renaming the fields to suit RSS output</p></div><br />
Yahoo! Pipes output is limited to RSS, so we need to reformat the Technorati XML as RSS. Now Bob&#8217;s shown me how to do it, it&#8217;s pretty straightforward: RSS needs a <strong>title</strong>, <strong>link</strong>, and <strong>description</strong>. So I&#8217;m using the <strong>Rename</strong> module to do just that, and choosing the &#8216;item.inboundblogs&#8217; parameter as the description. I plug the results into the <strong>Pipe Output</strong>, and I&#8217;m ready to go. Save the pipe.</p>
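<p>The renaming step amounts to a simple field mapping. Here is a minimal Python sketch of it, assuming a dictionary with <code>name</code>, <code>url</code> and <code>inboundblogs</code> keys (hypothetical names chosen to mirror the Technorati fields); it is not how Pipes works internally.</p>

```python
from xml.sax.saxutils import escape


def to_rss_item(blog):
    """Map Technorati-style fields onto title, link and description."""
    return (
        "<item>"
        f"<title>{escape(blog['name'])}</title>"
        f"<link>{escape(blog['url'])}</link>"
        f"<description>{blog['inboundblogs']}</description>"
        "</item>"
    )


item = to_rss_item(
    {"name": "Mediaczar", "url": "http://mediaczar.com/blog", "inboundblogs": 14}
)
print(item)
```

<p>The name and link are escaped so that an ampersand in a blog title or URL does not break the feed.</p>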
<h3>Checking the output of the Pipe</h3>
<p>Now I&#8217;ve saved the pipe, I can run it to see if it&#8217;s working.<br />
<div id="attachment_325" class="wp-caption alignnone" style="width: 697px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes_-output.jpg" alt="checking the output of the pipe" title="checking the output of the pipe" width="687" height="420" class="size-full wp-image-325" /><p class="wp-caption-text">checking the output of the pipe</p></div><br />
Sure enough, there&#8217;s the Authority figure coming through nice and clearly (1). Success! But there&#8217;s something more we can do. Clicking on the <strong>More options</strong> link (2) gives us an opportunity to take the RSS feed back into other tools like a feedreader, or into NetVibes (I&#8217;m a big fan.) Or, given that we started in Google, we could take it back there&#8230;</p>
<h3>Importing the RSS feed back into the Google Spreadsheet</h3>
<p><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes_-importfeed1.jpg" alt="Google&#039;s &quot;importFeed&quot; function" title="Google&#039;s &quot;importFeed&quot; function" width="780" height="131" class="size-full wp-image-324" /><br />
Google Spreadsheets has some useful functions that you don&#8217;t find in Excel, and which are more geared towards the web. The only one that I&#8217;ve really played with is the <strong>importFeed</strong> function which, as you might expect, imports RSS feeds into your spreadsheet. So here, we paste the RSS link from the pipe into the sheet&#8230;<br />
<div id="attachment_337" class="wp-caption alignnone" style="width: 437px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes_google-spreadsheet-w-data-1.jpg" alt="google spreadsheet showing output from importFeed function" title="google spreadsheet showing output from importFeed function" width="427" height="269" class="size-full wp-image-337" /><p class="wp-caption-text">google spreadsheet showing output from importFeed function</p></div><br />
&#8230; and it fills in our sheet for us. Now I can change the list of blogs on one sheet, and (as if by magic) the results will appear on the other, without my needing to go into Pipes. I&#8217;ve used Yahoo! Pipes to link one Google Spreadsheet to another by way of Technorati. Fun? I should say so. Useful? Most certainly.</p>
]]></content:encoded>
			<wfw:commentRss>http://mediaczar.com/blog/2009/01/technorati-authority-yahoo-pipe/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Blogger typology: using IBM&#8217;s Many Eyes to build matrix charts</title>
		<link>http://mediaczar.com/blog/2009/01/matrix-charts-showing-consensus/</link>
		<comments>http://mediaczar.com/blog/2009/01/matrix-charts-showing-consensus/#comments</comments>
		<pubDate>Tue, 06 Jan 2009 12:20:07 +0000</pubDate>
		<dc:creator>Mat Morrison</dc:creator>
				<category><![CDATA[blogger typology]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[bloggers]]></category>
		<category><![CDATA[charts]]></category>
		<category><![CDATA[ibm]]></category>
		<category><![CDATA[many eyes]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://mediaczar.com/blog/?p=280</guid>
		<description><![CDATA[Thanks to IBM&#8217;s Many Eyes service it&#8217;s relatively simple to create complicated visualizations that my current version of Excel can&#8217;t handle. For example, this &#8220;matrix chart&#8221; that I built using Excel&#8217;s bubble chart function is clearly unacceptable. I can&#8217;t easily link statements or values to the X and Y axes, and there&#8217;s lots of overlapping [...]]]></description>
			<content:encoded><![CDATA[
<p>Thanks to IBM&#8217;s <a href="http://manyeyes.alphaworks.ibm.com/manyeyes/">Many Eyes</a> service it&#8217;s relatively simple to create complicated visualizations that my current version of Excel can&#8217;t handle. For example, this &#8220;matrix chart&#8221; that I built using Excel&#8217;s bubble chart function is clearly unacceptable. I can&#8217;t easily link statements or values to the X and Y axes, and there&#8217;s lots of overlapping that seems (after many attempts) to be impossible to fix. </p>
<p><div id="attachment_286" class="wp-caption alignnone" style="width: 310px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/excel-matrix-chart-300x166.png" alt="Matrix chart built using Excel - not very satisfactory!" title="Matrix chart built using Excel" width="300" height="166" class="size-medium wp-image-286" /><p class="wp-caption-text">Matrix chart built using Excel</p></div><br />
<span id="more-280"></span><br />
I&#8217;d considered using something like <a href="http://processing.org/">Processing</a> to draw these charts, then remembered Many Eyes. Sure enough, not only does the service help me build matrix charts, but it also lets me embed the visualizations into this blog post and lets you interact with the data (feel free to click on the image below and have a quick play.)</p>
<p><script type="text/javascript" src="http://manyeyes.alphaworks.ibm.com/manyeyes/visualizations/f199407cdbde11dd80c0000255111976/comments/f19c23fadbde11dd80c0000255111976.js?width=400&#038;height=350"></script></p>
<p>This chart depicts the statements that generated most &#8220;consensus&#8221; (either agreement or disagreement.) While this might be interesting to some people, I&#8217;m actually looking for areas where there&#8217;s least consensus &#8212; the whole point of the blogger typology is to identify those variables that will, so to speak, separate sheep from goats and apples from oranges. To continue with this spurious analogy, I&#8217;m looking less for statements like &#8220;I am a ruminant&#8221; or &#8220;I am a fruit&#8221; and more for statements like &#8220;I enjoy eating laundry&#8221; and &#8220;I am a citrus.&#8221;</p>
<p>So here&#8217;s a quick look at <em>those</em> statements:</p>
<p><script type="text/javascript" src="http://manyeyes.alphaworks.ibm.com/manyeyes/visualizations/b656e486dbea11dd80c0000255111976/comments/b65c2720dbea11dd80c0000255111976.js?width=400&#038;height=350"></script></p>
<p>You should immediately be able to see that these responses are much less polarised than in the previous chart. That&#8217;s the sort of thing that we&#8217;re looking for.</p>
<h4>Caveat</h4>
<p>Please bear in mind that this current sample is self-selecting, and sourced online via Twitter, this blog, LinkedIn and WOM referrals. What I&#8217;m working on here are the techniques I&#8217;m going to be using, rather than the actual analysis. I&#8217;m really quite into the prototype-and-fix approach to planning (more like &#8220;get it up and fix it later&#8221; &#8212; it means I spend less time beard-stroking and more time in front of a martini.)  So the current results aren&#8217;t very robust, but they will help us focus our future research.</p>
<h4>Where can I download the latest data?</h4>
<p><a href="http://spreadsheets.google.com/ccc?key=p4QDp5UmTKxTaiQVRkL4yWQ">Here, as a Google Docs spreadsheet</a>. If you do anything interesting with the data, do please let me know, and I&#8217;ll link to it from here. Feel free to share this spreadsheet as <a href="http://icanhaz.com/blogger_data">http://icanhaz.com/blogger_data</a></p>
]]></content:encoded>
			<wfw:commentRss>http://mediaczar.com/blog/2009/01/matrix-charts-showing-consensus/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Blogger typology: quantitative analysis step 1</title>
		<link>http://mediaczar.com/blog/2009/01/blogger-typology-quantitative-analysis-step-1/</link>
		<comments>http://mediaczar.com/blog/2009/01/blogger-typology-quantitative-analysis-step-1/#comments</comments>
		<pubDate>Sun, 04 Jan 2009 17:19:39 +0000</pubDate>
		<dc:creator>Mat Morrison</dc:creator>
				<category><![CDATA[blogger typology]]></category>
		<category><![CDATA[measurement]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[bloggers]]></category>
		<category><![CDATA[box plot]]></category>
		<category><![CDATA[boxplot]]></category>
		<category><![CDATA[conversation index]]></category>
		<category><![CDATA[excel]]></category>
		<category><![CDATA[getafreelancer]]></category>
		<category><![CDATA[graph]]></category>
		<category><![CDATA[mechanical turk]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://mediaczar.com/blog/?p=254</guid>
		<description><![CDATA[I&#8217;ve published the first dump of survey and &#8220;blog metrics&#8221; data from the blogger questionnaire as a spreadsheet on Google Docs. Many, many thanks to all of you who volunteered your information. Please feel free to use this as you see fit for your own projects. I&#8217;ve anonymised this data (just because it&#8217;s best practice, [...]]]></description>
			<content:encoded><![CDATA[
<div style="float: right; margin-left: 10px; margin-bottom: 10px;">
<a href="http://www.flickr.com/photos/lost-moments/437570588/" title="Propeller-Heads by Danz in Tokyo on Flickr"><img src="http://farm1.static.flickr.com/174/437570588_eb6fefb5e0_m.jpg" alt="Propeller-Heads by Danz in Tokyo on Flickr" style="border: solid 2px #000000;" /></a>
</div>
<p>I&#8217;ve published the first dump of survey and &#8220;blog metrics&#8221; data from the <a href="http://mediaczar.com/blog/2008/12/your-help-needed-to-develop-blogger-typology/">blogger questionnaire</a> as <a href="http://spreadsheets.google.com/ccc?key=p4QDp5UmTKxTaiQVRkL4yWQ">a spreadsheet on Google Docs</a>. Many, many thanks to all of you who volunteered your information.</p>
<p>Please feel free to use this as you see fit for your own projects. I&#8217;ve anonymised this data (just because it&#8217;s best practice, not because I think any blogger would be mortally offended by having the world know what inspires them to blog!)<br />
<span id="more-254"></span><br />
I&#8217;m slowly ploughing through the data and doing the quant analysis, but I thought I&#8217;d share a few bits and pieces first.</p>
<p>I&#8217;m not really in a position to offer anything other than very top-line results right now. Remember that the purpose of this exercise is to create a broad and commonsensical &#8220;blogger typology&#8221; that will help us with our planning and engagement programmes.</p>
<p>All I have here are &#8220;descriptive&#8221; statistics; that is, data about the data themselves. You can <a href="#results">jump straight to the results</a> should you so wish. The next bit is just a rumination of what we <em>can</em> measure (I&#8217;ve left out the &#8220;why&#8221; for now, but would be happy to discuss.)</p>
<h3>Blog metrics background</h3>
<p>First of all, a quick aside about how we chose what we&#8217;re going to measure. This has been a source of great interest to us at Porter Novelli, and I think will continue to be so over the coming year.</p>
<p>There are any number of blog metrics out there to choose from. I&#8217;m rather interested in Stowe Boyd&#8217;s <a href="http://www.stoweboyd.com/message/2006/02/the_social_scal.html">Conversation Index</a>, which he defines as:</p>
<blockquote><p>The ratio between posts and comments+trackbacks (posts/comments+trackbacks)</p></blockquote>
<p>A healthy blog, he suggests, has a Conversation Index less than 1.0 &#8212; that is, there&#8217;s something more than simple broadcasting going on. To me there seem to be only two downsides to this metric: </p>
<ol>
<li>a poorly-moderated blog with <em><a href="http://www.dofollowblogs.com/">do follow</a></em> links in the comments section will often have a lower conversation index than the norm.</li>
<li>it&#8217;s hard to generate Conversation Index data automatically for a large number of blogs</li>
</ol>
<p>This says nothing about using the Conversation Index as a performance indicator for one&#8217;s own blogs and posts of course (indeed, it&#8217;s to be recommended.) It would probably work well for Twitter accounts, too &#8211; we&#8217;ll try to take a look at this when we&#8217;ve finally completed the twitter eavesdropper.</p>
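<p>As a worked example, here is the Conversation Index as a small Python function. I&#8217;m reading the definition as posts divided by the <em>sum</em> of comments and trackbacks, which is an interpretation of the quoted formula; the counts below are made up.</p>

```python
def conversation_index(posts, comments, trackbacks):
    """Posts divided by (comments + trackbacks); lower means more conversation."""
    responses = comments + trackbacks
    if responses == 0:
        # No responses at all: pure broadcasting, so the ratio is unbounded.
        return float("inf")
    return posts / responses


# Made-up counts: 10 posts drawing 25 comments and 5 trackbacks.
ci = conversation_index(10, 25, 5)
print(round(ci, 2))
```

<p>With these numbers the index comes out well below 1.0, which is the range described above as healthy.</p>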
<p>But taking Stowe Boyd&#8217;s post as inspiration, here&#8217;s what I&#8217;d like to be able to collect and store from the blogs I&#8217;m looking at. These are, I think, the relevant data points that are common to nearly all blogs, and from which a more complex and meaningful set of metrics (like Boyd&#8217;s Conversation Index, Technorati&#8217;s Authority, or indeed Google&#8217;s Page Rank, AideRSS&#8217;s <a href="http://www.postrank.com/postrank">PostRank</a>, and even Porter Novelli&#8217;s own <a href="http://www.flickr.com/photos/porternovelli/sets/72157610638659070/">network analysis</a> metrics) can be constructed. </p>
<p><iframe width='500' height='300' frameborder='0' src='http://spreadsheets.google.com/pub?key=p4QDp5UmTKxQSPBzj6_gZXQ&#038;output=html&#038;widget=true'></iframe></p>
<p>At this point it&#8217;s worth inserting a note of reality. As mentioned above, collecting this data is difficult. Since every blog is different, every blog&#8217;s schema (the way that its information is structured) <a href="http://microformats.org/wiki/blog-post-examples">differs</a>. XML feeds like RSS and Atom go a long way to fixing this (hence the focus on &#8220;last 10 posts&#8221; for some of the metrics listed above) but even then it&#8217;s hard to automate.</p>
<p><a href="http://www.flickr.com/photos/porternovelli/3166298905/" title="Pipes: Autodiscover Comments Feed by matmorrison, on Flickr"><img src="http://farm4.static.flickr.com/3104/3166298905_d3b2dd3cab.jpg" width="500" height="376" alt="Pipes: Autodiscover Comments Feed" /></a></p>
<p>So &#8212; right now I&#8217;ve been using the <a href="https://www.mturk.com/mturk/help?helpPage=overview#what_is">mechanical turk</a> approach to getting the data. I mostly use the excellent and reliable <a href="http://getafreelancer.com">Get A Freelancer</a>, but given that I&#8217;m part of a global network I&#8217;m also looking at Amazon&#8217;s service (I don&#8217;t understand why this isn&#8217;t available outside the United States.) I like the idea of paying per-transaction; it offers more flexibility.</p>
<p>The big problem (as with any data entry project) is accuracy, and I&#8217;m still working on the processes for that (although I <em>do</em> have a couple of ideas.) I&#8217;d be interested to hear from anyone who has some experience in this area so that we can share ideas.</p>
<h3><a name="results"></a>Early results</h3>
<p>Our standard set of blog metrics is based on the well-known RFM model from CRM: we measure recency, frequency, and tenure. Obviously we can&#8217;t get &#8220;spend&#8221; data (the &#8220;M&#8221;, for monetary value, in RFM), but we can substitute it with Technorati&#8217;s &#8220;Authority&#8221; data. This seems (and is) pretty arbitrary, but it&#8217;s relatively easy to automate, as demonstrated in yesterday&#8217;s post on <a href="http://mediaczar.com/blog/2009/01/a-simple-perl-script-to-interrogate-the-technorati-api/">using perl to access Technorati&#8217;s API</a>.</p>
<p>Let&#8217;s look at the data we&#8217;ve gathered. </p>
<h4>Recency</h4>
<p>We calculate &#8220;recency&#8221; as <strong>the number of days since the latest post</strong> (in Excel we subtract &#8220;date of last post&#8221; from &#8220;date of retrieval&#8221;.) When it comes to recency, the lower the score the better: it implies an active blogger.</p>
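<p>The same subtraction can be sketched in Python. The retrieval date of 2 January is the one used for this data set; the date of the last post is made up, and <code>recency_days</code> is just an illustrative name.</p>

```python
from datetime import date


def recency_days(last_post, retrieved):
    """Number of days between the latest post and the retrieval date."""
    return (retrieved - last_post).days


# Made-up last post date, measured against the 2 January retrieval date.
score = recency_days(date(2008, 12, 29), date(2009, 1, 2))
print(score)
```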
<p>The first box plot is heavily skewed by one outlier who hasn’t posted for over three years.</p>
<div id="attachment_256" class="wp-caption alignnone" style="width: 462px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/recency.png" alt="Box plot describing posting recency of 62 bloggers who completed questionnaire" title="Box plot describing posting recency of 62 bloggers who completed questionnaire" width="452" height="468" class="size-full wp-image-256" /><p class="wp-caption-text">Box plot describing posting recency of 62 bloggers who completed questionnaire</p></div>
<p>So in the following box-plot we’ve removed that outlier; none of the other numbers really changes of course, but we can display a better picture.</p>
<div id="attachment_257" class="wp-caption alignnone" style="width: 462px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/recency_adjusted.png" alt="Box plot describing posting recency of 61 bloggers (one outlier removed)" title="Box plot describing posting recency of 61 bloggers (one outlier removed)" width="452" height="468" class="size-full wp-image-257" /><p class="wp-caption-text">Box plot describing posting recency of 61 bloggers (one outlier removed)</p></div><br />
You can see that half the bloggers have posted within the past four days. The retrieval date for this data was January 2nd, so this is really rather impressive. If you look at this <a href="http://www.scribd.com/doc/8223231/Blog-Competitor-Analysis">review of corporate blogging programmes from public relations agency networks</a> that we carried out for internal purposes early last year, you’ll see that there’s far less enthusiasm (the median recency score there was 12 days) and that wasn&#8217;t over a period where everyone was having a holiday and a party.</p>
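<p>For readers who want to reproduce the numbers behind a box plot, here is a minimal Python sketch. The scores are invented, not the survey data, and the 1.5 &times; IQR rule used to flag the outlier is a common convention that I&#8217;m assuming here; it is not necessarily how the outlier above was picked out.</p>

```python
import statistics

# Invented recency scores in days (not the survey data).
recency_scores = [0, 1, 1, 2, 3, 4, 4, 6, 9, 15, 40]

median = statistics.median(recency_scores)
q1, q2, q3 = statistics.quantiles(recency_scores, n=4, method="inclusive")

# Assumed convention: anything beyond 1.5 x IQR above the upper quartile
# is treated as an outlier.
iqr = q3 - q1
outliers = [x for x in recency_scores if x > q3 + 1.5 * iqr]

# Dropping the outlier barely moves the median, which is why the box
# itself hardly changes between the two plots.
trimmed = [x for x in recency_scores if x not in outliers]
trimmed_median = statistics.median(trimmed)

print(median, (q1, q3), outliers, trimmed_median)
```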
<h4>Frequency</h4>
<p>To calculate frequency we look at <strong>the number of posts in the last complete month</strong>. In some studies, frequency of posting has shown a high correlation with Technorati Authority (of which more later.) I expect to see a wide range of frequency “behaviours” when we come to do the segmentation; people who are mostly link blogging will, I think, show higher frequency than people who are opinion blogging or announcement blogging (these terms are borrowed from current typologies, and have nothing to do with the final product!) It’s hard to say therefore that “higher frequency is better” but again, it’s an indicator of active and engaged blogging. Or possibly, of course, of a spam blog. </p>
<p>From the public relations engagement point of view, a blog with higher frequency will have a higher probability of carrying our story and (in all likelihood) a larger readership.<br />
<div id="attachment_259" class="wp-caption alignnone" style="width: 463px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/frequency.png" alt="Box plot describing frequency of posting of 62 bloggers who completed questionnaire" title="Box plot describing frequency of posting of 62 bloggers who completed questionnaire" width="453" height="469" class="size-full wp-image-259" /><p class="wp-caption-text">Box plot describing frequency of posting of 62 bloggers who completed questionnaire</p></div><br />
There are no clear outliers here: half the respondents post more than seven times a month, so just under two posts a week. In case you hadn’t guessed, it’s probably worth pointing out that the blog with a frequency of 225 is a multi-authored blog.</p>
<p>The “complete month” we looked at for this exercise was December; numbers are likely to have been affected by the holidays (and affected differently for different segments.)</p>
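<p>For anyone who wants to reproduce the frequency figure, here is a minimal sketch in Python (not the tool used for this study). It assumes you already have a list of post dates for a blog; the function name and the sample dates are hypothetical, and the retrieval date is the January 2nd mentioned above.</p>

```python
from datetime import date

def posts_in_last_complete_month(post_dates, retrieved):
    """Count posts dated in the calendar month before the retrieval date."""
    # Step back from the retrieval month to the previous calendar month.
    if retrieved.month == 1:
        year, month = retrieved.year - 1, 12
    else:
        year, month = retrieved.year, retrieved.month - 1
    return sum(1 for d in post_dates if (d.year, d.month) == (year, month))

# Hypothetical post dates for one blog (illustration only, not survey data).
sample_posts = [date(2008, 11, 28), date(2008, 12, 3), date(2008, 12, 17),
                date(2008, 12, 30), date(2009, 1, 1)]

# Retrieved on 2 January 2009, so the last complete month is December 2008.
frequency = posts_in_last_complete_month(sample_posts, date(2009, 1, 2))
print(frequency)
```

<p>Only the three December posts are counted; the November post and the New Year's Day post fall outside the complete month.</p>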
<h4>Active tenure</h4>
<p>We define ‘active tenure’ as <strong>the number of days between the first and the last post</strong>. When taken together with recency and frequency, this is a good measure of commitment: someone may have high recency and frequency scores, but if their tenure is low, then we have no guarantee that this behaviour will continue into the future. Lots of us start blogs in a fit of enthusiasm only to find that work begins to get in the way, and that we have less time to post than we did! Occasionally we&#8217;ll find a blog that has relatively high active tenure and low recency; we&#8217;d read that as a bad sign.</p>
<p><div id="attachment_265" class="wp-caption alignnone" style="width: 464px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/tenure1.png" alt="Box plot describing active tenure of 62 bloggers who completed questionnaire" title="Box plot describing active tenure of 62 bloggers who completed questionnaire" width="454" height="470" class="size-full wp-image-265" /><p class="wp-caption-text">Box plot describing active tenure of 62 bloggers who completed questionnaire</p></div><br />
Looking at the respondents, we see that half of them have been blogging for more than a year (401 days); this is a good sign for when we come to do the segmentation. Indeed the distribution looks fairly healthy overall, with a mid range from 149 to 952 days. I still can&#8217;t say that this is a representative sample at all, which is a shame. Does anyone know of any data that would help me frame this better?</p>
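<p>A minimal Python sketch of the active tenure calculation, again assuming a list of post dates per blog. The function names and dates are hypothetical, and the recency helper assumes that recency means the number of days between the latest post and the retrieval date.</p>

```python
from datetime import date

def active_tenure_days(post_dates):
    """Days between the first and the last post."""
    return (max(post_dates) - min(post_dates)).days

def recency_days(post_dates, retrieved):
    """Days between the last post and the retrieval date (assumed definition)."""
    return (retrieved - max(post_dates)).days

# Hypothetical post dates for one blog (illustration only, not survey data).
sample_posts = [date(2007, 11, 25), date(2008, 6, 14), date(2008, 12, 30)]

tenure = active_tenure_days(sample_posts)
recency = recency_days(sample_posts, date(2009, 1, 2))
print(tenure, recency)
```

<p>Keeping the two figures side by side makes it easy to spot the worrying combination described above: a long active tenure paired with a last post that is a long way in the past.</p>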
<h4>Authority</h4>
<p>Technorati&#8217;s &#8220;Authority&#8221; is the only metric that we don&#8217;t calculate ourselves. Indeed, as per yesterday&#8217;s post, I&#8217;ve just succeeded in <a href="http://mediaczar.com/blog/2009/01/a-simple-perl-script-to-interrogate-the-technorati-api/">automating this bit of the data gathering process</a>.</p>
<p>Technorati has this to say:</p>
<blockquote><p>Technorati Authority is <strong>the number of blogs linking to a website in the last six months</strong>. The higher the number, the more Technorati Authority the blog has &#8230; The best way to increase your Technorati Authority is to write things that are interesting to other bloggers so they&#8217;ll link to you.</p></blockquote>
<p>Right now, Technorati&#8217;s Authority scores range from 0 to 28,378 (HuffPo, as retrieved on Sunday January 4, 2009.) Our respondents (as seen below) range from 0 to 356, with a mid range from 5 to 41. I don&#8217;t think that this really gets us where we want to be going (although it does guide our hand a little better when it comes to looking at the next rounds of data gathering.)</p>
<p><div id="attachment_258" class="wp-caption alignnone" style="width: 461px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/authority.png" alt="Box plot describing Technorati Authority of 62 bloggers who completed questionnaire" title="Box plot describing Technorati Authority of 62 bloggers who completed questionnaire" width="451" height="467" class="size-full wp-image-258" /><p class="wp-caption-text">Box plot describing Technorati Authority of 62 bloggers who completed questionnaire</p></div>
<h4>A word about the box plots</h4>
<p>We use box plots as a simple way to represent and compare data. Simply plotting the average (arithmetic mean) of the data disguises more than it reveals; a few outliers (as we have seen) can drastically shift the mean and give an &#8220;unrepresentative&#8221; picture.</p>
<p>Each box plot shows the following information:<br />
<a href="http://www.flickr.com/photos/porternovelli/3167891148/" title="Box plot by matmorrison, on Flickr"><img src="http://farm2.static.flickr.com/1033/3167891148_9882a29c9f.jpg" width="336" height="444" alt="Box plot" /></a></p>
<ol>
<li>Maximum (the highest value observed)</li>
<li>Minimum (the lowest value observed)</li>
<li>The Interquartile Range or IQR (where the middle 50% of the values fall: this is a more robust statistic than the full range, and is depicted as the &#8220;box&#8221; from which the box plot gets its name)</li>
<li>The Median (the number separating the higher 50% of the sample from the lower 50%; where it sits within the box gives you an idea of which way the data skew)</li>
</ol>
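<p>The figures listed above can be computed with Python's standard library. This is a sketch rather than the method used to draw the charts in this post: the sample values are made up, and the quartile convention (<code>method="inclusive"</code>) is an assumption, since different tools place the quartiles slightly differently.</p>

```python
from statistics import mean, median, quantiles

def box_plot_summary(values):
    """Return the figures a box plot draws: min, Q1, median, Q3 and max."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median(values),
            "q3": q3, "max": max(values)}

# Hypothetical monthly post counts; 225 stands in for a multi-authored blog.
sample = [1, 3, 4, 7, 9, 12, 225]

summary = box_plot_summary(sample)
print(summary)
print(mean(sample))
```

<p>With these made-up numbers the mean comes out above 37 while the median is 7, which is the kind of distortion from a single outlier that the box plot is meant to avoid.</p>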
<h3>Where can I get the latest data?</h3>
<p><a href="http://spreadsheets.google.com/ccc?key=p4QDp5UmTKxTaiQVRkL4yWQ">Here, as a Google Docs spreadsheet</a>. If you do anything interesting with the data, do please let me know, and I&#8217;ll link to it from here. Feel free to share this spreadsheet as <a href="http://icanhaz.com/blogger_data">http://icanhaz.com/blogger_data</a></p>
<h3>How were the respondents recruited?</h3>
<p>To date, everyone has been self-selecting. I&#8217;ve used four main channels to promote the questionnaire: this blog, Twitter, a LinkedIn Q&#038;A, and word-of-mouth. If you&#8217;d like to take the questionnaire yourself, <a href="http://icanhaz.com/blogger_questions">please do</a> &#8212; it&#8217;s never too late. Furthermore, I&#8217;d be most grateful if you&#8217;d pass the link along (please post it on your blog, or on Twitter, or send it via email as <a href="http://icanhaz.com/blogger_questions">http://icanhaz.com/blogger_questions</a>).</p>
<p>Here (courtesy of Google spreadsheets) is how the current set of respondents got to the questionnaire (WOM isn&#8217;t tracked here, just the seed link.)<br />
<img width="450" height="320" src="http://spreadsheets.google.com/pub?key=p4QDp5UmTKxTaiQVRkL4yWQ&#038;oid=1&#038;output=image" /></p>
]]></content:encoded>
			<wfw:commentRss>http://mediaczar.com/blog/2009/01/blogger-typology-quantitative-analysis-step-1/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>A simple perl script to interrogate the Technorati API</title>
		<link>http://mediaczar.com/blog/2009/01/a-simple-perl-script-to-interrogate-the-technorati-api/</link>
		<comments>http://mediaczar.com/blog/2009/01/a-simple-perl-script-to-interrogate-the-technorati-api/#comments</comments>
		<pubDate>Sat, 03 Jan 2009 19:34:05 +0000</pubDate>
		<dc:creator>Mat Morrison</dc:creator>
				<category><![CDATA[blogger typology]]></category>
		<category><![CDATA[hack]]></category>
		<category><![CDATA[how to]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[bloggers]]></category>
		<category><![CDATA[blogs]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[technorati]]></category>

		<guid isPermaLink="false">http://mediaczar.com/blog/?p=220</guid>
		<description><![CDATA[Sometimes (for instance when I&#8217;m doing the research for the blogger typology) you need to get a whole load of Technorati data for a whole load of blogs. This research can (of course) be done by hand. And (of course) for a long list of blogs this would take a great deal of time. Handily, [...]]]></description>
			<content:encoded><![CDATA[
<p><a href="http://www.flickr.com/photos/porternovelli/3163377107/sizes/o/" title="Technorati API perl query in action by matmorrison, on Flickr"><img src="http://farm4.static.flickr.com/3092/3163377107_135a68325a.jpg" width="500" height="167" alt="Technorati API perl query in action" /></a></p>
<p>Sometimes (for instance when I&#8217;m doing the research for the <a href="http://mediaczar.com/blog/2008/12/your-help-needed-to-develop-blogger-typology/">blogger typology</a>) you need to get a whole load of Technorati data for a whole load of blogs.</p>
<p>This research can (of course) be done by hand. And (of course) for a long list of blogs this would take a great deal of time. Handily, Technorati provides developers with <a href="http://technorati.com/developers/api/">an API that lets you automate those queries</a>. An API (for those of you who don&#8217;t know) is an <em>Application Programming Interface</em> &#8211; a toolkit provided by a service or application (in this case by Technorati) that lets other computer applications ask it questions and use the answers for their own purposes. It may be helpful to think of APIs as being like the knobs on top of a Lego brick that let you stick other Lego on to it without in any way changing the nature of the brick itself. On the other hand it may not be so helpful after all.<br />
<span id="more-220"></span><br />
After much struggling with a Yahoo! Pipe to <a href="http://pipes.yahoo.com/mediaczar/technoratiapiauthorityquery">query the Technorati API for a list of blogs</a>, I was forced to abandon my attempt. I would have liked to have shared that Pipe with the world (if you&#8217;re good with Yahoo! Pipes, do please take a look at it and see if you can help me!) <ins datetime="2009-01-06T23:11:07+00:00">[Tuesday January 6, 2009: Thanks to help and encouragement from <a href="http://www.semdevel.com">Bob Briski</a>, this now looks like it's on its way to working!]</ins></p>
<p>Instead, I&#8217;ve written a perl script to do this. Perl isn&#8217;t as easy for people to use for themselves as Pipes, but if you are comfortable with a command prompt, then you&#8217;re half way there.</p>
<p>What this script does is take a list of blog URLs and, for each item in the list, query Technorati for the following information:</p>
<ol>
<li>Blog title</li>
<li>Inbound blogs (the number of unique external blogs linking to the blog over the past six months; this is also known as &#8220;Technorati Authority&#8221;)</li>
<li>Inbound links (the total number of links into the site)</li>
<li><a href="http://technorati.com/help/faq.html#ranking">Technorati Rank</a> (a sort of overall score)</li>
</ol>
<h3>The script</h3>
<p>[code lang="perl"]#!/usr/bin/perl<br />
use strict;<br />
use warnings;<br />
# use modules<br />
use LWP::Simple;<br />
use XML::Simple;<br />
# set up variables: the first argument is a text file with one blog URL per line<br />
open(my $infile, '<', $ARGV[0]) or die "Can't open list of blogs to read: $!";<br />
my $apikey = 'enter your Technorati API key here';<br />
# create the XML parser object<br />
my $xml = XML::Simple->new;<br />
# read each line, and make the Technorati API call<br />
while (my $blog = <$infile>) {<br />
	chomp $blog;<br />
	callTechnoratiAPI($blog);<br />
}<br />
close($infile);<br />
sub callTechnoratiAPI {<br />
	my ($blog) = @_;<br />
	my $url = 'http://api.technorati.com/bloginfo?format=xml&#038;key='.$apikey.'&#038;url='.$blog;<br />
	# get XML file from Technorati<br />
	my $content = get($url);<br />
	  die "Can't get $url" unless defined $content;<br />
	# read XML file<br />
	my $data = $xml->XMLin($content);<br />
	# access XML data and print TSV to screen<br />
	# (you can fiddle with this as much<br />
	# or as little as you like)<br />
	# the title is wrapped in double quotes; fields are separated by tabs<br />
	print "\"$data->{document}->{result}->{weblog}->{name}\"\t";<br />
	print "$data->{document}->{result}->{url}\t";<br />
	print "$data->{document}->{result}->{weblog}->{inboundblogs}\t";<br />
	print "$data->{document}->{result}->{weblog}->{inboundlinks}\t";<br />
	print "$data->{document}->{result}->{weblog}->{rank}\n";<br />
}[/code]</p>
<h3>How to use it</h3>
<p>I can&#8217;t give you any real advice on how to run perl on your system. If you want to play around with it, Macs come with perl already installed; Windows users should download and install the free <a href="http://www.activestate.com/activeperl/">ActivePerl</a>. But you&#8217;ll need to install the perl module <a href="http://search.cpan.org/~grantm/XML-Simple-2.18/lib/XML/Simple.pm">XML::Simple</a>, and I don&#8217;t know where to begin telling you how to do that if you don&#8217;t already know how perl and CPAN work. You see why I wanted to use Yahoo! Pipes?</p>
<p>If all of that doesn&#8217;t bother you, you&#8217;ll also need to sign up for a Technorati account (if you&#8217;re into this sort of thing, you should <em>already</em> have an account), and get your <a href="http://technorati.com/developers/apikey.html">free API key</a>. This key will let you make 500 queries in a 24-hour period, so you&#8217;ll need to plan how you use it.</p>
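<p>One simple way to plan around the 500-query limit is to split a long list of blogs into daily batches and run one batch per day. The sketch below is in Python rather than perl, and the function name, the default quota argument and the example URLs are all hypothetical.</p>

```python
def daily_batches(urls, quota=500):
    """Split a list of blog URLs into chunks no larger than the daily quota."""
    return [urls[i:i + quota] for i in range(0, len(urls), quota)]

# Hypothetical list of 1,200 blog URLs (placeholders, not real blogs).
blogs = [f"http://example.com/blog{n}" for n in range(1200)]

batches = daily_batches(blogs)
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)
```

<p>Each batch could then be written to its own text file and fed to the script on successive days.</p>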
<p>The script as it&#8217;s listed above outputs tab-separated values to screen like this:<br />
<code>matm% ./parse_technorati.pl bloglist.txt<br />
"Chris Gilmour's Diary Vol. 14"	http://www.illandancient.blogspot.com	6	10	861604<br />
"The Red Rocket: Technology, PR and social media marketing"	www.theredrocket.co.uk	15	29	397843<br />
"Going Underground's Blog"	http://london-underground.blogspot.com	254	467	13332<br />
</code><br />
The blog&#8217;s title and url are followed in order by the inbound blogs (authority) count, the inbound links count, and the Technorati rank. </p>
<p>I use tab-separated values because that makes it simple to cut-and-paste directly into Excel or Google Spreadsheets for further analysis.</p>
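<p>If you would rather process the output in code than in a spreadsheet, tab-separated values are also easy to read back in. A minimal Python sketch follows; the column names are labels I have made up for illustration, and the two rows are copied from the sample output above.</p>

```python
import csv
import io

def read_rows(tsv_text):
    """Parse the script's tab-separated output into a list of dictionaries."""
    # Hypothetical column labels, in the order the script prints the fields.
    fields = ["title", "url", "inbound_blogs", "inbound_links", "rank"]
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [dict(zip(fields, row)) for row in reader if row]

sample = (
    '"Chris Gilmour\'s Diary Vol. 14"\thttp://www.illandancient.blogspot.com\t6\t10\t861604\n'
    '"Going Underground\'s Blog"\thttp://london-underground.blogspot.com\t254\t467\t13332\n'
)

rows = read_rows(sample)
# Sort by the inbound blogs (authority) count, highest first.
by_authority = sorted(rows, key=lambda r: int(r["inbound_blogs"]), reverse=True)
print(by_authority[0]["title"])
```

<p>The csv module strips the double quotes around the title, so the values come back ready to use.</p>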
<h3>Known bugs</h3>
<p>Right now, the script occasionally throws out something like this:</p>
<p><code>matm% ./parse_technorati.pl bloglist.txt<br />
"Lytham Villa"	http://lythamvilla.blogspot.com/	<strong>HASH(0x8ff7a0)	HASH(0x8ff7f4)</strong>	4978471<br />
"KickTime || A Driftless Regional Webspace"	http://kicktime.org	<strong>HASH(0x908e0c)	HASH(0x908db8)</strong>	1951828</code></p>
<p>I&#8217;ll work on this, but if anyone can point me in the right direction, I&#8217;ll be most grateful.</p>
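<p>One possible explanation, offered as a guess rather than a diagnosis: by default XML::Simple represents an empty element (for example an &lt;inboundblogs&gt; tag with nothing inside it) as an empty hash reference, and printing a hash reference in perl produces text like HASH(0x8ff7a0). If that is what is happening here, XML::Simple&#8217;s <code>SuppressEmpty</code> option may be worth a look. The Python sketch below illustrates the general idea of handling empty elements explicitly. The sample XML is hypothetical, shaped only after the fields the script reads, and the real Technorati response may differ.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical response modelled on the fields the perl script reads;
# the real Technorati XML may be structured differently.
SAMPLE = """<tapi><document><result>
  <url>http://example.com/blog</url>
  <weblog>
    <name>Example Blog</name>
    <inboundblogs></inboundblogs>
    <inboundlinks></inboundlinks>
    <rank>4978471</rank>
  </weblog>
</result></document></tapi>"""

def field(parent, tag, default=""):
    """Return the element's text, or a default when it is missing or empty."""
    node = parent.find(tag)
    text = node.text if node is not None else None
    return text.strip() if text and text.strip() else default

result = ET.fromstring(SAMPLE).find("document/result")
weblog = result.find("weblog")
row = [field(weblog, "name"), field(result, "url"),
       field(weblog, "inboundblogs"), field(weblog, "inboundlinks"),
       field(weblog, "rank")]
print("\t".join(row))
```

<p>Here the empty elements come out as blank cells instead of an object reference, which at least keeps the spreadsheet columns clean.</p>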
]]></content:encoded>
			<wfw:commentRss>http://mediaczar.com/blog/2009/01/a-simple-perl-script-to-interrogate-the-technorati-api/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Your help needed to develop &#8220;blogger typology&#8221;</title>
		<link>http://mediaczar.com/blog/2008/12/your-help-needed-to-develop-blogger-typology/</link>
		<comments>http://mediaczar.com/blog/2008/12/your-help-needed-to-develop-blogger-typology/#comments</comments>
		<pubDate>Sun, 21 Dec 2008 23:42:07 +0000</pubDate>
		<dc:creator>Mat Morrison</dc:creator>
				<category><![CDATA[blogger typology]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[bloggers]]></category>
		<category><![CDATA[blogging]]></category>
		<category><![CDATA[survey]]></category>

		<guid isPermaLink="false">http://mediaczar.com/blog/?p=146</guid>
		<description><![CDATA[(NB: If you have both a blog and a short attention span, please skip the article, and go straight to this short survey. Many thanks!) What is a blogger? Everyone seems to think they know, and yet the longer I work in this area, the more I realize I know nothing. And the less I [...]]]></description>
			<content:encoded><![CDATA[
<p><strong>(NB: If you have both a blog and a short attention span, please skip the article, and go straight to <a href="http://www.surveymonkey.com/s.aspx?sm=mFca_2fPTMTqr2c690_2fSCPDQ_3d_3d">this short survey</a>. Many thanks!)</strong></p>
<p>What is a blogger? Everyone seems to think they know, and yet the longer I work in this area, the more I realize I know nothing. And the less I know, the more suspicious I become of marketers who use vague terms like &#8220;conversation&#8221; (which has &#8211; after all &#8211; become little more than a Latinization of the ghastly &#8220;dialogue&#8221;.) I can just about understand what Technorati means when they talk about</p>
<blockquote><p>The ecosystem of interconnected communities of bloggers and readers at the convergence of journalism and conversation.</p>
<p>(<cite><a href="http://technorati.com/blogging/state-of-the-blogosphere/">State of the Blogosphere 2008</a></cite>)</p>
</blockquote>
<p>&#8230;but there are an awful lot of long words that could turn out to hide an awful lot. And that&#8217;s the carefully thought-out distillation of a bunch of experts. Most of us, most of the time, fall back on lazy or confusing language. We talk about &#8220;social media&#8221; and never stop to think that &#8212; depending on who&#8217;s doing the talking (and what they have to sell) &#8212; what is meant by that apparently innocuous phrase shifts wildly from speaker to speaker.<br />
<span id="more-146"></span><br />
I think that we do the world (and ourselves) as much of a disservice by lumping together a bunch of web sites because they share a similar technology as we would by lumping all fiction, non-fiction, reference, text-books, guidebooks and manuals together as &#8220;books.&#8221; We <em>need</em> a better classification.</p>
<p>I&#8217;ve spent a lot of time thinking about this sort of thing, and now I think we&#8217;re ready to do something about it. But while my ambitions are actually rather modest, I need a lot of data to get started. This is where <em>you</em> come in. I really need your help.</p>
<p><strong>(Seriously, if you&#8217;ve read this far, and aren&#8217;t sure if you want to read the rest of the article, please don&#8217;t. But <em>please</em>, if you&#8217;re going to leave, take <a href="http://www.surveymonkey.com/s.aspx?sm=mFca_2fPTMTqr2c690_2fSCPDQ_3d_3d">this short survey</a>. Goodbye, thank you, and see you soon I hope!)</strong></p>
<p>OK. Here&#8217;s a quick distillation of what I&#8217;ve been thinking about that will help give us some context.</p>
<h3>All blogs are <em>not</em> the same</h3>
<p>This seems abundantly obvious. But I&#8217;m not talking about <em>topic</em> here. What else differs? Let&#8217;s look at what we know:</p>
<h4>Audience reach differs</h4>
<p>I am fond of drawing charts like this:<br />
<a href="http://www.flickr.com/photos/porternovelli/3124748195/" title="The dimensions of the blogosphere"><img src="http://farm4.static.flickr.com/3265/3124748195_aff260c6b7.jpg" width="500" height="375" alt="The dimensions of the blogosphere" /></a></p>
<p>We know that a few large blogs together reach the great majority of the audience. The great majority of blogs, on the other hand, will reach no more than a few dozen unique users, let alone unduplicated audience, <em>in their lifetime</em>. </p>
<p>And yet we don&#8217;t have the language to distinguish between the various areas on the spectrum. Sure, we&#8217;ve borrowed the term &#8220;A-list&#8221; from the entertainment industry to describe the really big guys. But is that going to be enough?</p>
<h4>Target audiences differ</h4>
<p>Different bloggers write for different audiences. They may write for their friends and family, their peers, their colleagues, or for &#8220;people who are into the same sort of thing.&#8221; Some write for themselves, for people they know, or people with whom they have developed a writer/reader relationship, while others write for one-off strangers who are typing search terms into Google.</p>
<p>Or they may write to satisfy or attract customers and clients. Or to communicate with journalists and analysts. It seems perverse to lump &#8220;corporate bloggers&#8221; in with the rest of the blogosphere.</p>
<h4>Blogger motivation differs</h4>
<p><a target="new" href="http://www.bivingsreport.com/resources/2006v2007v2008.gif"><img width="100%" src="http://www.bivingsreport.com/resources/2006v2007v2008.gif" alt="Chart showing top features for online newspapers 2006,2007,2008" /></a><br />
For the past three years, the Bivings Group has published research on <a href="http://www.bivingsreport.com/2008/the-use-of-the-internet-by-americas-largest-newspapers-2008-edition/">The Use of the Internet by America&#8217;s Largest Newspapers</a>. You can see (from the chart above &#8211; well, if you click and blow it up, you can see) that nearly all of them have featured &#8220;reporter blogs&#8221; for a couple of years. This highlights a point that&#8217;s worth making: at one end of the blogger audience spectrum are a bunch of people <em>who are paid to blog</em>. Whether they draw a salary or a <a href="http://www.alleyinsider.com/2008/7/gawker-writers-pay-rates-get-cut-again">traffic-related bonus</a>, their motivation seems pretty clear. As does that of the blogger who joins an ad programme, or an affiliate programme, or who seeks sponsorship.</p>
<p>Corporate bloggers are also driven by material needs; whether they&#8217;re trying to promote their product, service or organization, there&#8217;s a financial impetus behind it all. It may be harder to measure, but it&#8217;s pretty clear why they&#8217;re in the game. </p>
<p>Other bloggers, on the other hand, may be motivated by deeper <a href="http://en.wikipedia.org/wiki/Maslow%27s_hierarchy">Maslow needs</a>: to express themselves, to explore and share ideas (me), to make a name for themselves on a wider stage (probably me, and most of you who are reading this.) Or they may want to record their experiences and share news with friends and relatives (travel bloggers, for example, and some mommy bloggers.)</p>
<p>Or perhaps their blog simply gives them a better platform than email to share the amusing/interesting/shocking/useful stuff that they find on the web, a sort of digital scrapbook.</p>
<h3>Why I think we need a typology</h3>
<p>It seems to me that &#8212; as marketers and communicators and public relations people and social media experts &#8212; we need to distinguish between different types of blogger quickly and easily, and that it would help to have some kind of shared language with which we can do this.</p>
<p>It will help when formulating plans and approaches. For example, and off the top of my head, right now one can probably approach journalist bloggers as though they were journalists, while it&#8217;s probably not worth approaching someone who&#8217;s mainly a CEO blogger. I look forward to being shot down on this, but would first refer readers to my esteemed colleague&#8217;s notes on <a href="http://porternovelli.typepad.com/pneo/2008/06/how-to-approach.html">how to approach a blogger</a>.</p>
<p>And it will help bring new people into the game. Currently clients and colleagues can be split easily into two groups: those who get it, and those who don&#8217;t<sup><a href="#1">1</a></sup>. We need a way to talk to the second group so that they can get it. A simpler, logical categorization is one of the things we need to help us do that.</p>
<h4>How many types are we looking for?</h4>
<p>Obviously it&#8217;s possible to have as many types of blogger as there are bloggers; the interesting thing is to try and keep this typology as <em>terse</em> as possible. <a href="http://www.immediatefuture.co.uk/option,com_glossary/Itemid,125/">Katy &#038; Co.&#8217;s valiant attempt</a> to categorize different types of blog makes great linkbait, but we should avoid the unnecessary proliferation of terms. Seven (or ideally four) types of blog would make for something memorable and easy to use.</p>
<h3>What we&#8217;re going to be doing</h3>
<p>At the moment, this is an unfunded project, but we&#8217;re not going to let <em>that</em> get in our way. We&#8217;re not looking to create an industry-standard after all, just something that is useful and usable by us and whoever wants to share it.</p>
<p>We&#8217;re going to collect a little qual data from as many bloggers as would like to take part. We&#8217;ve created a short test survey (five questions, with a total of 33 statements at this stage) which we would like you to answer. We&#8217;ll also use our own tools to collect a little quant data from your blog (typically we look at &#8220;recency, frequency, tenure and authority&#8221; figures.) We&#8217;ll make the (suitably anonymised) data available to whoever wants it on this blog.</p>
<p>Once we&#8217;ve begun to code and analyse the data to generate what look like statistically robust clusters of behaviour, we&#8217;d like to start interviewing a few of you to find out more about what makes <em>you</em> tick, and what works for your audience.</p>
<p>From that, I hope we can develop some kind of a simple-yet-strong typology that we can share with you. Nothing too complicated, nothing too clever. We&#8217;ll also refine our survey, remove lots of questions that don&#8217;t turn out to be particularly significant or good indicators of anything, add a carefully-tuned question here and there. Then &#8211; with any luck &#8211; we&#8217;ll go out and do the same exercise again, but this time with more respondents and more languages. Again, we&#8217;ll try to share data sets, and will certainly share findings.</p>
<h3>How you can help</h3>
<p>Please help us.</p>
<dl>
<dt><strong>Answer my survey</strong></dt>
<dd>If you have a blog of any kind, please <a href="http://www.surveymonkey.com/s.aspx?sm=mFca_2fPTMTqr2c690_2fSCPDQ_3d_3d">answer this short survey</a>. It should take less than five minutes to complete.</dd>
<dt><strong>Share the survey link</strong></dt>
<dd>Please share this link via Twitter, email, or your blog<br />
<a href="http://icanhaz.com/blogger_survey">http://icanhaz.com/blogger_survey</a><br />
For us to get started on this, we&#8217;ll need <em>lots</em> of data; a few hundred respondents at least. We&#8217;ll be doing all sorts of things to get people to answer the survey, but the more you can help push it along the better.</dd>
<dt><strong>Share your thoughts</strong></dt>
<dd>Leave a comment below, write a follow-up post on your blog, or <a href="http://mediaczar.com/contact">drop me a line</a>. I&#8217;ll be dropping in and out of this subject for the foreseeable future, and I&#8217;ll try to write digests of what we&#8217;ve been discussing.</dd>
</dl>
<p>Thank you for reading all the way to the end. I hope that you&#8217;ve now got a better idea of what I&#8217;m trying to do (and why). Furthermore, I hope that you approve, and that you&#8217;ll try to help us out.</p>
<p><strong><a name="1">Note 1.</a></strong> Of course, this is a gross over-simplification. There&#8217;s at least one more group, &#8220;those who <em>think</em> they get it.&#8221; How can any of us be sure that we&#8217;re not in this group?</p>
]]></content:encoded>
			<wfw:commentRss>http://mediaczar.com/blog/2008/12/your-help-needed-to-develop-blogger-typology/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
	</channel>
</rss>

