<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>mediaczar &#187; hack</title>
	<atom:link href="http://mediaczar.com/blog/category/how-to/hack/feed/" rel="self" type="application/rss+xml" />
	<link>http://mediaczar.com/blog</link>
	<description>a blog by mat morrison</description>
	<lastBuildDate>Mon, 22 Mar 2010 12:12:27 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>A first stab at a perl script to create Twitter friend/follow matrices</title>
		<link>http://mediaczar.com/blog/2009/07/a-first-stab-at-a-perl-script-to-create-twitter-friendfollow-matrices/</link>
		<comments>http://mediaczar.com/blog/2009/07/a-first-stab-at-a-perl-script-to-create-twitter-friendfollow-matrices/#comments</comments>
		<pubDate>Tue, 14 Jul 2009 15:23:08 +0000</pubDate>
		<dc:creator>Mat Morrison</dc:creator>
				<category><![CDATA[hack]]></category>
		<category><![CDATA[twitter]]></category>
		<category><![CDATA[kludge]]></category>
		<category><![CDATA[network analysis]]></category>
		<category><![CDATA[networks]]></category>
		<category><![CDATA[perl]]></category>

		<guid isPermaLink="false">http://mediaczar.com/blog/?p=988</guid>
		<description><![CDATA[Geek alert: if the title of this post isn&#8217;t a dead giveaway I should tell you &#8212; unless you&#8217;re interested in APIs and badly-put-together bits of code &#8212; this probably isn&#8217;t for you. I&#8217;ve recently found myself using a service provided by Damon Clinkscale called DoesFollow. All it does is answer the simple question &#8220;does [...]]]></description>
			<content:encoded><![CDATA[
<p><em>Geek alert: if the title of this post isn&#8217;t a dead giveaway I should tell you &#8212; unless you&#8217;re interested in APIs and badly-put-together bits of code &#8212; this probably isn&#8217;t for you.</em></p>
<p>I&#8217;ve recently found myself using a service provided by <a href="http://twitter.com/damon">Damon Clinkscale</a> called <a href="http://doesfollow.com/">DoesFollow</a>. All it does is answer the simple question &#8220;does twitter user A follow twitter user B?&#8221; Apart from a frill which lets you reverse the order of your question (&#8220;does twitter user B follow twitter user A?&#8221;) that&#8217;s all it does. You can even interrogate it from the address bar like this: <code><a href="http://doesfollow.com/barackobama/mediaczar">http://doesfollow.com/barackobama/mediaczar</a></code></p>
<p><a href="http://doesfollow.com/barackobama/mediaczar"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/07/doesfollow-300x100.jpg" alt="doesfollow" title="doesfollow" width="300" height="100" class="aligncenter size-medium wp-image-995" /></a></p>
<p>While I was thinking about how useful a service this is, I was suddenly struck by a moment of clarity. A lot of the research I&#8217;ve been doing could be simplified by something like this.<br />
<span id="more-988"></span><br />
Quite often I want to find out whether <a href="http://mediaczar.com/blog/tag/mp/">MPs</a> or <a href="http://mediaczar.com/blog/tag/congress/">congressmen</a> or <a href="http://mediaczar.com/blog/2008/12/some-twitter-social-network-analysis/">PR people</a> follow each other on Twitter.</p>
<p>The way that I&#8217;ve been doing this until now is:</p>
<ol>
<li>make a list of the people who I&#8217;m interested in researching</li>
<li>for each person on that list, grab the list of <em>all</em> the Twitter people whom they follow</li>
<li>process the list so that only relationships between the people on the list show up</li>
</ol>
<p>If <em>all</em> I&#8217;m doing is checking to see who follows whom, then this is a horribly wasteful way of doing things. The Twitter API limits the number of calls one can make on it &#8212; so this wastage leads to things taking much longer.</p>
<p>If only I could cycle all the names I want to check through something like DoesFollow!</p>
<p>Well &#8211; it turns out that I can. And in theory it&#8217;s not much harder than using DoesFollow. The <a href="http://apiwiki.twitter.com/Twitter-API-Documentation">Twitter API</a> (which is what DoesFollow uses, after all) has a method called <code>friendships/exists</code>. All we have to do is send Twitter the following request: </p>
<p><code><a href="http://twitter.com/friendships/exists.xml?user_a=barackobama&#038;user_b=mediaczar">http://twitter.com/friendships/exists.xml?user_a=<strong>barackobama</strong>&#038;user_b=<strong>mediaczar</strong></a></code></p>
<p>and it will come back with the answer:</p>
<p><code>&lt;friends&gt;true&lt;/friends&gt; </code><br />
or<br />
<code>&lt;friends&gt;false&lt;/friends&gt;</code></p>
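<p>As a rough illustration (in Python rather than the perl used later in this post), the request and its one-element answer could be handled as below. The endpoint is the one quoted above; the function names <code>exists_url</code> and <code>parse_friends</code> are made up for this sketch, and no network call is made.</p>

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint as quoted in the post (the 2009 API); assumed, not re-verified.
EXISTS_ENDPOINT = "http://twitter.com/friendships/exists.xml"


def exists_url(user_a, user_b):
    """Build the URL that asks whether user_a follows user_b."""
    query = urlencode({"user_a": user_a, "user_b": user_b})
    return EXISTS_ENDPOINT + "?" + query


def parse_friends(xml_text):
    """Turn '<friends>true</friends>' or '<friends>false</friends>' into a bool."""
    root = ET.fromstring(xml_text)
    if root.tag != "friends":
        raise ValueError("unexpected response element: " + root.tag)
    return (root.text or "").strip().lower() == "true"


print(exists_url("barackobama", "mediaczar"))
print(parse_friends("<friends>true</friends>"))
```

<p>Keeping the URL building and the parsing separate from the download step means both can be checked without touching the rate-limited API.</p>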
<h3>Kludge-y perl code</h3>
<p><a href="http://mediaczar.com/blog/wp-content/uploads/2009/07/poor-man-hot-water-heater.jpg"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/07/poor-man-hot-water-heater.jpg" alt="poor-man-hot-water-heater" title="poor-man-hot-water-heater" width="480" height="359" class="aligncenter size-full wp-image-999" /></a></p>
<p><em>(This fabulous picture courtesy of <a href="http://thereifixedit.com/">There, I Fixed It</a>)</em></p>
<p>So I tried to do this using <a href="http://pipes.yahoo.com/">Yahoo! Pipes</a>, but there are too many nested loops. You need to do something like this:</p>
<pre><code>get list of names

for each user_a (in list) {
    for each user_b (in list) {
        does friendship exist
    }
}</code></pre>
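<p>The nested loop can be sketched in a few lines of Python. This is only an illustration of the loop structure: <code>friendship_matrix</code>, <code>fake_follows</code> and the usernames are invented for the example, and the real API call is replaced by a lookup in a made-up set of edges.</p>

```python
def friendship_matrix(usernames, follows):
    """Ask follows(a, b) for every ordered pair, as the nested loops above do."""
    matrix = {}
    for user_a in usernames:
        matrix[user_a] = {}
        for user_b in usernames:
            matrix[user_a][user_b] = follows(user_a, user_b)
    return matrix


# Stand-in for the real API call: an invented set of (follower, followed) pairs.
FAKE_EDGES = {("alice", "bob"), ("bob", "carol")}


def fake_follows(user_a, user_b):
    return (user_a, user_b) in FAKE_EDGES


matrix = friendship_matrix(["alice", "bob", "carol"], fake_follows)
for user_a, row in matrix.items():
    print(user_a, row)
```

<p>Passing the check in as a function keeps the loop testable; a real run would swap <code>fake_follows</code> for something that calls the API.</p>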
<p>There&#8217;s no easy way to get Pipes to do this, as far as I can see (I&#8217;ll keep trying, but if someone else can help, I&#8217;d be v. grateful.)</p>
<p>So I&#8217;ve pulled together a badly-written perl script to do the work for me. </p>
<h4>The script</h4>
<p>[code lang="perl"]<br />
#!/usr/bin/perl<br />
# checks the Twitter API to find the friendships between a list of usernames<br />
# this should really use the NEW API call that would let us halve the number<br />
# of API calls<br />
# author: Mat Morrison<br />
# date: Friday July 10, 2009<br />
use strict;<br />
use warnings;<br />
use LWP::Simple;<br />
# set up variables<br />
# we're just using a whitespace delimited list for the moment<br />
my @usernames = qw(kerrymg mediaczar timhoang titusbicknell);<br />
# let's build the matrix with a hash of hashes...<br />
my %matrix;<br />
# to begin with, we'll include diagonal values -<br />
# that is -- we'll check to see whether @mediaczar follows @mediaczar<br />
foreach my $user_a (@usernames) {<br />
	foreach my $user_b (@usernames) {<br />
	# we should put in a conditional clause that will check for the diagonal values<br />
	# and not bother checking whether someone is a friend of themselves...<br />
	my $url = 'http://twitter.com/friendships/exists.xml?user_a='<br />
	.$user_a<br />
	.'&#038;user_b='<br />
	.$user_b;<br />
	# get XML file from Twitter -- it's an astonishingly simple XML file that reads<br />
	# <friends>true</friends><br />
	# or<br />
	# <friends>false</friends><br />
	# so we don't need to do much with it...<br />
	my $follows = get($url);<br />
	# double quotes here, so that $url is interpolated into the message<br />
	  die "Can't get $url" unless defined $follows;<br />
	# strip the tags - I'm using a generic "HTML stripping" regex<br />
	$follows =~ s/<(.|\n)+?>//g;<br />
	# trim any whitespace left around the value<br />
	$follows =~ s/^\s+|\s+$//g;<br />
	# we should probably convert "true" values to 1 and "false" values to zero or blank<br />
	# now let's push data into the matrix<br />
		 $matrix{$user_a}{$user_b} = $follows;<br />
	}<br />
}<br />
# spit out the data as a tab-delimited table<br />
# print the top line first<br />
# (we loop over @usernames rather than keys %matrix, because hash keys<br />
# come back in no fixed order and the columns wouldn't match the header)<br />
for my $user_b (@usernames) {<br />
	print "\t$user_b";<br />
}<br />
# now print the values, one row per user_a<br />
for my $user_a (@usernames) {<br />
    print "\n$user_a";<br />
    for my $user_b (@usernames) {<br />
		print "\t$matrix{$user_a}{$user_b}";<br />
    }<br />
}<br />
print "\n";<br />
[/code]</p>
<h4>Where next?</h4>
<p>Most of my thinking is included above in the code comments. An obvious mistake I&#8217;m making is checking to see whether, say, @mediaczar follows @mediaczar. That wastes <em>n</em> API calls per search. But a more serious mistake is <strong>not to be using the new <code>friendships/show</code> method</strong>. Because it tells you whether user A follows user B and whether user B follows user A at the same time, it would save me <em>lots</em> of API calls. How many lots? Well take a look at this.</p>
<p>This is what I&#8217;m doing at the moment &#8212; checking <em>each and every</em> cell in the matrix:</p>
<p><img align="center" src="http://img.skitch.com/20090714-njbkr7micbcgum5erj1dsxhc46.jpg" alt="clumsy API call matrix" /></p>
<p>This is what I&#8217;d be doing if I removed the diagonals:</p>
<p><img align="center" src="http://img.skitch.com/20090714-rk7r17nx1n491geim25meg1jb9.jpg" alt="Matrix with diagonals removed" /></p>
<p>And <em>this</em> is what I&#8217;d be doing if I used the newer API call:</p>
<p><img align="center" src="http://img.skitch.com/20090714-8rbx3dr3qe4ctm7ajp465ewj8w.jpg" alt="Matrix using the new API call" /></p>
<p>I had to look up <a href="http://www.curiousmath.com/index.php?name=News&#038;file=article&#038;sid=23">the formula</a> for working this out without colouring in little boxes. With a little tweaking (to prevent the diagonals from creeping back in), here it is:</p>
<p><code>((n-1)^2+n-1)/2</code>, which simplifies to <code>n(n-1)/2</code></p>
<p>So &#8212; for <a href="http://tweetcongress.org/parties">a list of congress people </a>(159 on twitter as at Tuesday July 14, 2009) that&#8217;d be <code>((156-1)^2-1+156)/2 = 12,090</code> API calls. Which is still a lot and will require some careful throttling, but (literally) not half as many as the 156^2 = 24,336 API calls that I&#8217;d need to run it as the script currently stands.</p>
<p>So &#8211; back to the drawing board for a while. I really can&#8217;t work out a programmatic way of doing this. Hmph.</p>
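<p>One possible programmatic way of visiting each pair only once, sketched in Python rather than perl, is to iterate over unordered pairs and fill in both directions from a single answer. Everything here is an assumption for illustration: <code>call_counts</code> and <code>fill_matrix</code> are invented names, and <code>show</code> is a stand-in for a <code>friendships/show</code> call that returns both directions at once.</p>

```python
from itertools import combinations


def call_counts(n):
    """API calls needed for n users under the three schemes pictured above."""
    return {
        "every_cell": n * n,
        "no_diagonal": n * (n - 1),
        "one_call_per_pair": n * (n - 1) // 2,
    }


def fill_matrix(usernames, show):
    """Visit each unordered pair once; show(a, b) returns
    (a_follows_b, b_follows_a), standing in for a friendships/show call."""
    matrix = {user: {} for user in usernames}
    for user_a, user_b in combinations(usernames, 2):
        a_follows_b, b_follows_a = show(user_a, user_b)
        matrix[user_a][user_b] = a_follows_b
        matrix[user_b][user_a] = b_follows_a
    return matrix


print(call_counts(156))
```

<p>For 156 users this gives 24,336 calls for every cell, 24,180 without the diagonal and 12,090 with one call per pair, matching the figures above.</p>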
]]></content:encoded>
			<wfw:commentRss>http://mediaczar.com/blog/2009/07/a-first-stab-at-a-perl-script-to-create-twitter-friendfollow-matrices/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Counting Twitter followers</title>
		<link>http://mediaczar.com/blog/2009/01/counting-twitter-followers/</link>
		<comments>http://mediaczar.com/blog/2009/01/counting-twitter-followers/#comments</comments>
		<pubDate>Sat, 24 Jan 2009 18:29:32 +0000</pubDate>
		<dc:creator>Mat Morrison</dc:creator>
				<category><![CDATA[hack]]></category>
		<category><![CDATA[pipes]]></category>
		<category><![CDATA[twitter]]></category>
		<category><![CDATA[google spreadsheets]]></category>
		<category><![CDATA[twittercounter]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://mediaczar.com/blog/?p=594</guid>
		<description><![CDATA[TwitterCounter, the service that tells you how many people followed a given Twitter user on a given date (among other things) has an API &#8211; so I thought I&#8217;d take a look at it to see whether I could create a quick automated table of rankings. Here&#8217;s the simplest way to query the API: [code] [...]]]></description>
			<content:encoded><![CDATA[
<p>TwitterCounter, the service that tells you how many people followed a given Twitter user on a given date (among other things) has an <a href="http://twittercounter.com/?inc=api">API</a> &#8211; so I thought I&#8217;d take a look at it to see whether I could create a quick automated table of rankings.</p>
<p>Here&#8217;s the simplest way to query the API:</p>
<p>[code]</p>
<p>http://twittercounter.com/api/?username=mediaczar&#038;output=xml</p>
<p>[/code]</p>
<p>Just cut and paste that into the address bar of your browser for example. Fairly simple. Change the username and you&#8217;ll get the data for a different user. Here&#8217;s what you get back from the API &#8212; an XML file with lots of rich meaty data:<br />
<span id="more-594"></span><br />
<textarea rows="10" cols="60">
<?xml version="1.0" encoding="UTF-8"?><twittercounter>
	<user_id>8206</user_id>
	<user_name>mediaczar</user_name>
	<followers_current>658</followers_current>
	<date_updated>2009-01-24</date_updated>
	<url>http://mediaczar.com</url>
	<avatar>69490155/q506317361_7156_normal.jpg</avatar>
	<follow_days>215</follow_days>
	<started_followers>169</started_followers>
	<growth_since>489</growth_since>
	<average_growth>2</average_growth>
	<tomorrow>660</tomorrow>
	<next_month>718</next_month>
	<followers_yesterday>652</followers_yesterday>
	<rank>16681</rank>
	<followers_2w_ago>169</followers_2w_ago>
	<growth_since_2w>489</growth_since_2w>
	<average_growth_2w>35</average_growth_2w>
	<tomorrow_2w>693</tomorrow_2w>
	<next_month_2w>1708</next_month_2w>
	<followersperdate>
	<date2009-01-10>480</date2009-01-10>
	<date2009-01-11>480</date2009-01-11>
	<date2009-01-12>481</date2009-01-12>
	<date2009-01-13>486</date2009-01-13>
	<date2009-01-14>498</date2009-01-14>
	<date2009-01-15>498</date2009-01-15>
	<date2009-01-16>514</date2009-01-16>
	<date2009-01-17>534</date2009-01-17>
	<date2009-01-20>594</date2009-01-20>
	<date2009-01-21>594</date2009-01-21>
	<date2009-01-22>616</date2009-01-22>
	<date2009-01-23>652</date2009-01-23>
	<date2009-01-24>658</date2009-01-24>
	</followersperdate>
</twittercounter>
</textarea></p>
<h3>First attempt: Google Spreadsheet&#8217;s importXML function</h3>
<p>As part of this project, I was particularly interested to explore Google Spreadsheet&#8217;s useful <strong>importXML</strong> function. Google lets you pull data out of an XML document anywhere on the web and put it into a spreadsheet cell. </p>
<p>There&#8217;s a pretty average bit of <a href="http://docs.google.com/support/bin/answer.py?answer=75507">support documentation</a> from Google on this function and a peculiarly hard-to-understand <a href="http://www.w3.org/TR/xpath">description of the XPATH</a> reference that you can read if you want to, but all you really need to know is that you can address any item in the document using a double slash like so: <em>//item_name</em>.</p>
<p>So if there&#8217;s an item called <em>followers_yesterday</em> in the document above and I want to access it:</p>
<p>[code lang="xml"]<followers_yesterday>652</followers_yesterday><br />
[/code]</p>
<p>I can access it like this:</p>
<p>[code]=ImportXML("http://twittercounter.com/api/?output=xml&#038;username=mediaczar","//followers_yesterday")[/code]</p>
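<p>To see what the double-slash lookup is doing, here is a small Python sketch using the standard library&#8217;s ElementTree on a trimmed copy of the response shown earlier. The helper name <code>first_match</code> is invented, and this is only an approximation of what the spreadsheet function does, not how Google implements it.</p>

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the TwitterCounter response shown earlier in the post.
SAMPLE = """<twittercounter>
  <user_name>mediaczar</user_name>
  <followers_current>658</followers_current>
  <followers_yesterday>652</followers_yesterday>
</twittercounter>"""


def first_match(xml_text, item_name):
    """Return the text of the first element called item_name, wherever it sits
    below the root, or None if there is no such element."""
    root = ET.fromstring(xml_text)
    node = root.find(".//" + item_name)
    return None if node is None else node.text


print(first_match(SAMPLE, "followers_yesterday"))  # prints 652
```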
<p>So with not much work, I can create a <a href="http://spreadsheets.google.com/ccc?key=p4QDp5UmTKxQVVA_RP4hvNw">Google spreadsheet</a> that will do this for a list of twitter usernames</p>
<p><iframe width='500' height='300' frameborder='0' src='http://spreadsheets.google.com/pub?key=p4QDp5UmTKxQVVA_RP4hvNw&#038;output=html&#038;gid=0&#038;single=true&#038;range=A1:B51'></iframe></p>
<p>Wahey! That&#8217;s great! But what&#8217;s this?</p>
<p><a href="http://mediaczar.com/blog/wp-content/uploads/2009/01/google-error.jpg"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/google-error.jpg" alt="Google doesn&#039;t let you make more than 50 importXML calls in a single spreadsheet" title="Google doesn&#039;t let you make more than 50 importXML calls in a single spreadsheet" width="484" height="198" class="aligncenter size-full wp-image-604" /></a></p>
<p>I&#8217;ve got more than 200 people on the list I want to rank. Foiled by this apparent petty-mindedness on Google&#8217;s part, I decided to try again, this time using Yahoo! Pipes.</p>
<h3>Second Attempt: Yahoo! Pipes</h3>
<p>First of all I make <a href="http://pipes.yahoo.com/mediaczar/twittercounterapicall">a simple pipe</a> to build the URL and make the API call for a single user (I&#8217;ve learned to follow this modular approach from looking at pipes built by the Open University&#8217;s <a href="http://ouseful.wordpress.com/">Tony Hirst</a>.)</p>
<p><a href="http://mediaczar.com/blog/wp-content/uploads/2009/01/twittercounter-api-call.jpg"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/twittercounter-api-call.jpg" alt="Using Yahoo! Pipes to call the TwitterCounter API" title="Using Yahoo! Pipes to call the TwitterCounter API" width="344" height="475" class="aligncenter size-full wp-image-609" /></a></p>
<p>Now all I need to do is plug this into another pipe. I&#8217;m pulling the data from a Google spreadsheet list of Twitter usernames <a href="http://spreadsheets.google.com/pub?key=p4QDp5UmTKxS65ROEO5ykZQ&#038;output=txt">published as a text file</a>. Whenever I update the spreadsheet, of course, the text file is updated dynamically.</p>
<p><a href="http://mediaczar.com/blog/wp-content/uploads/2009/01/building-the-pipe-stage-1.jpg"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/building-the-pipe-stage-1.jpg" alt="Adding the API call into a loop on a new pipe" title="Adding the API call into a loop on a new pipe" width="471" height="437" class="aligncenter size-full wp-image-610" /></a></p>
<p>The first FetchCSV module pulls that text file in and passes the data to the next module as &#8220;username.&#8221; The Loop module loops through each username on the list and passes it through the TwitterCounter API Call module I just built. All the XML data from the call will be passed on as &#8220;item.data.&#8221; Now all I need to do is select the bits I want and format the data for output. Should be simple.</p>
<p><a href="http://mediaczar.com/blog/wp-content/uploads/2009/01/rename-and-output.jpg"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/rename-and-output.jpg" alt="Rename the elements and output the data" title="Rename the elements and output the data" width="421" height="209" class="aligncenter size-full wp-image-612" /></a></p>
<p>The most important thing to know about pipes is that we&#8217;re usually working with RSS (there are ways around this.) That limits us to only a few fieldnames; generally title, link and description. </p>
<p>Here I&#8217;m using the Rename module to set <code>username</code> as the title, and <code>item.data.followers_yesterday</code> as the description. Let&#8217;s save the pipe and test it out. </p>
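<p>For anyone more comfortable reading code than pipes, the Loop and Rename steps amount to roughly the following Python. It is a sketch under assumptions: <code>to_feed_item</code> and <code>build_feed</code> are invented names, and the API responses are passed in as text rather than fetched.</p>

```python
import xml.etree.ElementTree as ET


def to_feed_item(username, xml_text):
    """Rename step: username becomes the title, followers_yesterday the description."""
    data = ET.fromstring(xml_text)
    return {
        "title": username,
        "description": data.findtext("followers_yesterday"),
    }


def build_feed(responses):
    """Loop step: responses maps each username to the XML its API call returned."""
    return [to_feed_item(user, xml_text) for user, xml_text in responses.items()]


sample = {
    "mediaczar": "<twittercounter><followers_yesterday>652</followers_yesterday></twittercounter>",
}
print(build_feed(sample))
```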
<p><a href="http://mediaczar.com/blog/wp-content/uploads/2009/01/running-the-pipe.jpg"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/running-the-pipe-300x192.jpg" alt="Running the Pipe. Success!" title="Running the Pipe. Success!" width="300" height="192" class="aligncenter size-medium wp-image-613" /></a></p>
<p>Success! Now I can take the <a href="http://pipes.yahoo.com/pipes/pipe.run?_id=46ab0eafe320dfdbf3736762be9797f9&#038;_render=rss">RSS feed</a> for this and (using Google&#8217;s ImportFeed function) drop it back into another spreadsheet.</p>
<p>Only right now, I can&#8217;t work out how to do that for more than twenty items. So I&#8217;ve actually got fewer results than I had before.</p>
<p>I&#8217;ll continue plugging away at this and post any solution I come to here  &#8211; but in the meantime if you&#8217;re reading this and know how to fix my issue with Google Spreadsheets, please help. There&#8217;s some hope held out by <a href="http://sphinn.com/story/50996">this article on Sphinn</a> which I&#8217;ll struggle along with.</p>
<p>In the meantime, I can output the results as a <a href="http://pipes.yahoo.com/pipes/pipe.run?_id=46ab0eafe320dfdbf3736762be9797f9&#038;_render=csv">CSV</a> so that&#8217;s OK.</p>
]]></content:encoded>
			<wfw:commentRss>http://mediaczar.com/blog/2009/01/counting-twitter-followers/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Automating Marshall Kirkpatrick&#8217;s &#8220;Social Media Cheatsheet&#8221; process with Yahoo! Pipes</title>
		<link>http://mediaczar.com/blog/2009/01/automating-marshall-kirkpatricks-social-media-cheetsheet-process-with-yahoo-pipes/</link>
		<comments>http://mediaczar.com/blog/2009/01/automating-marshall-kirkpatricks-social-media-cheetsheet-process-with-yahoo-pipes/#comments</comments>
		<pubDate>Sun, 11 Jan 2009 13:44:21 +0000</pubDate>
		<dc:creator>Mat Morrison</dc:creator>
				<category><![CDATA[hack]]></category>
		<category><![CDATA[how to]]></category>
		<category><![CDATA[pipes]]></category>
		<category><![CDATA[bloggers]]></category>
		<category><![CDATA[readwriteweb]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[yahoo]]></category>

		<guid isPermaLink="false">http://mediaczar.com/blog/?p=380</guid>
		<description><![CDATA[Marshall Kirkpatrick has published an excellent process for getting up to speed with what the big issues are in your market sector. Is there, he asks: any way to ramp up your knowledge of these fields, fast, other than the &#8220;Google and wander&#8221; method? He then outlines an almost perfect example of how to use [...]]]></description>
			<content:encoded><![CDATA[
<p><div id="attachment_381" class="wp-caption alignright" style="width: 182px"><a href="http://pipes.yahoo.com/mediaczar/socialmediacheatsheet"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/marshall-plan-pipe-172x1024.png" alt="Yahoo! Pipe for automating Marshall Kirkpatrick&#039;s Social Media CheatSheet process" title="Yahoo! Pipe for automating Marshall Kirkpatrick&#039;s Social Media CheatSheet process" width="172" height="1024" class="size-large wp-image-381" /></a><p class="wp-caption-text">Yahoo! Pipe for automating Marshall Kirkpatrick's Social Media CheatSheet process</p></div>Marshall Kirkpatrick has published <a href="http://www.readwriteweb.com/archives/how_to_build_a_social_media_cheat_sheet.php">an excellent process</a> for getting up to speed with what the big issues are in your market sector. Is there, he asks:</p>
<blockquote><p>any way to ramp up your knowledge of these fields, fast, other than the &#8220;Google and wander&#8221; method?</p></blockquote>
<p>He then outlines an almost perfect example of how to use social media to do this.</p>
<p>You should <a href="http://www.readwriteweb.com/archives/how_to_build_a_social_media_cheat_sheet.php">read his article</a> before reading any further. It&#8217;s short and punchy and won&#8217;t take much time.</p>
<p>Read it? Good. Now you may have noticed in the comments section that the first commenter doubts that you can:</p>
<blockquote><p>find one baker or candlestick maker that will go through all of that.</p></blockquote>
<p>So I thought I&#8217;d see if I can automate the process. The short answer is that I can and I can&#8217;t. I can&#8217;t yet automate one or two really important bits and pieces, notably:</p>
<ol>
<li>ranking delicious bookmarks by popularity, not recency</li>
<li>human editorial selection of bookmarks</li>
</ol>
<p>Perhaps someone could help me with this.<br />
But otherwise, I&#8217;ve published this Yahoo! Pipe, <a href="http://pipes.yahoo.com/mediaczar/socialmediacheatsheet">Automating Marshall Kirkpatrick&#8217;s Social Media Cheatsheet Process</a>, which automates 90% of the process and may make it easier for the bakers and candlestick makers.</p>
<p>All comments and &#8212; more importantly &#8212; suggestions and improvements gratefully received. </p>
<p><ins datetime="2009-01-12T00:26:41+00:00"><i>Monday, 12 Jan 2009 00:27: I&#8217;ve just added a bit to the pipe to list posts in descending order according to PostRank. Don&#8217;t know if this is useful</i></ins></p>
]]></content:encoded>
			<wfw:commentRss>http://mediaczar.com/blog/2009/01/automating-marshall-kirkpatricks-social-media-cheetsheet-process-with-yahoo-pipes/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>The Technorati Authority Yahoo! Pipe</title>
		<link>http://mediaczar.com/blog/2009/01/technorati-authority-yahoo-pipe/</link>
		<comments>http://mediaczar.com/blog/2009/01/technorati-authority-yahoo-pipe/#comments</comments>
		<pubDate>Wed, 07 Jan 2009 19:52:33 +0000</pubDate>
		<dc:creator>Mat Morrison</dc:creator>
				<category><![CDATA[blogger typology]]></category>
		<category><![CDATA[hack]]></category>
		<category><![CDATA[how to]]></category>
		<category><![CDATA[pipes]]></category>
		<category><![CDATA[google docs]]></category>
		<category><![CDATA[google spreadsheet]]></category>
		<category><![CDATA[technorati]]></category>
		<category><![CDATA[yahoo]]></category>

		<guid isPermaLink="false">http://mediaczar.com/blog/?p=319</guid>
		<description><![CDATA[Over the holidays, I started playing with a new Yahoo! pipe to pull information from Technorati into a spreadsheet. The reasons why I wanted to do this are covered in this post about the quantitative analysis of blogs, and my eventual perl-based solution to the problem is covered in this post. The problem with the [...]]]></description>
			<content:encoded><![CDATA[
<div id="attachment_365" class="wp-caption alignnone" style="width: 394px"><a href="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipe.png"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipe.png" alt="Yahoo! Pipe to pull Technorati API data for multiple blogs" title="Yahoo! Pipe to pull Technorati API data for multiple blogs" width="384" height="526" class="size-full wp-image-365" /></a><p class="wp-caption-text">Yahoo! Pipe to pull Technorati API data for multiple blogs</p></div><br />
<p>Over the holidays, I started playing with a new Yahoo! pipe to pull information from Technorati into a spreadsheet. The reasons why I wanted to do this are covered in this post about the <a href="http://mediaczar.com/blog/2009/01/blogger-typology-quantitative-analysis-step-1/">quantitative analysis of blogs</a>, and my eventual perl-based solution to the problem is covered in <a href="http://mediaczar.com/blog/2009/01/a-simple-perl-script-to-interrogate-the-technorati-api/comment-page-1/#comment-528">this post.</a></p>
<p>The problem with the perl-based approach is that it&#8217;s a little inaccessible to people who aren&#8217;t comfortable using a command line environment. So I really wanted to make something that more people would feel comfortable using, and perhaps play around with.</p>
<p>So, with some help and kind words from <a href="http://www.semdevel.com">Bob Briski</a>, one of whose pipes I&#8217;d stumbled across and <a href="http://delicious.com/mediaczar/pipes">bookmarked</a> during my research for this project, I decided to finish off the pipe and publish it so that others could use it, or (better still) improve upon it.<br />
<span id="more-319"></span></p>
<h3>What does it do?</h3>
<p>The pipe first pulls a list of blog URLs from a Google spreadsheet, then checks Technorati to get the &#8220;inbound blogs&#8221; count for each blog, then outputs the blog title, link and count to an RSS feed. </p>
<p>Technorati Authority (a useful metric when ranking a list of blogs) is calculated as &#8220;the unique blogs linking to a blog over the past six months&#8221; &#8212; in a way it&#8217;s a measure of how many readers may be influenced by a blogger <em>at one remove</em>.</p>
<h3>Before you start</h3>
<p>You&#8217;ll need:</p>
<ol>
<li>An API key from Technorati</li>
<li>A Yahoo! account</li>
<li>A Google account</li>
</ol>
<p>I&#8217;ve done all the rest, so you could simply clone my pipe, which <a href="http://pipes.yahoo.com/mediaczar/technorati_api_google_spreadsheet ">you&#8217;ll find here</a>. </p>
<h3>Publishing a Google spreadsheet as a CSV source</h3>
<p>In my first pass at this challenge, I used a simple text list of URLs stored on this server. That&#8217;s pretty easy if you have (a) a server, (b) a text editor, and (c) some way &#8212; like ftp or ssh &#8212; of getting the two to talk to each other. But lots of people don&#8217;t have servers these days, and still more probably have no idea what I mean by (c), so I decided to follow a new approach before writing this article. <em>Everyone</em>, I thought, must have access to Google Docs.</p>
<p>So here&#8217;s my test list as published on Google:<br />
<iframe width='500' height='150' frameborder='0' src='http://spreadsheets.google.com/pub?key=p4QDp5UmTKxTf0FacFsewJw&#038;output=html&#038;gid=0&#038;single=true&#038;range=A1:A7'></iframe></p>
<p>Now all I have to do is <strong>Share > Publish as a web page</strong>, then click <strong>More publishing options</strong> and select &#8216;CSV&#8217; or &#8216;TXT&#8217; (1) as the publishing option to create a simple text list.</p>
<div id="attachment_328" class="wp-caption alignnone" style="width: 494px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/publishing-google-spreadsheet-as-a-csv.jpg" alt="publishing google spreadsheet as a csv" title="publishing google spreadsheet as a csv" width="484" height="643" class="size-full wp-image-328" /><p class="wp-caption-text">publishing google spreadsheet as a csv</p></div>
<p>You can <a href="http://spreadsheets.google.com/pub?key=p4QDp5UmTKxTf0FacFsewJw&#038;output=csv">see the CSV file here</a>. You need to copy (or make a note of) the link (2) because we use it as the seed for our pipe.</p>
<p>One of the nice things about using Google Docs is that it&#8217;s easy to edit, view and share the list. If you&#8217;d rather not share your list of blogs, the original version of the pipe, which reads a more &#8216;private&#8217; server-based file, <a href="http://pipes.yahoo.com/mediaczar/technoratiapiauthorityquery">is still available</a>.</p>
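<p>If you wanted to read the same published list outside Pipes, the parsing side could look like this Python sketch. It assumes the CSV has one blog URL per row in the first column, as in my test list; <code>urls_from_csv</code> is an invented name, and downloading the file is left out so the example runs offline.</p>

```python
import csv
import io


def urls_from_csv(csv_text):
    """Return the first column of every non-empty row of the published CSV."""
    rows = csv.reader(io.StringIO(csv_text))
    return [row[0].strip() for row in rows if row and row[0].strip()]


# Made-up stand-in for the text of the published spreadsheet.
sample_csv = "http://blog-one.example/\nhttp://blog-two.example/\n\n"
print(urls_from_csv(sample_csv))
```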
<h3>Setting up the Pipe</h3>
<p>The first thing to do is to build a URL for each request I&#8217;m going to send to the API. Technorati takes requests in the following format:<br />
<code>http://api.technorati.com/bloginfo?key=[apikey]&#038;url=[blog url]</code></p>
<div id="attachment_322" class="wp-caption alignnone" style="width: 645px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes-set-up1.jpg" alt="setting up the pipe" title="setting up the pipe" width="635" height="511" class="size-full wp-image-322" /><p class="wp-caption-text">setting up the pipe</p></div>
<p>The pipe takes two variables that will need to be customized for your own version. One, rather obviously, is the link URL for the Google CSV that you copied or made a note of above (you <em>did</em> make a note of it, didn&#8217;t you?). The other is your <a href="http://technorati.com/developers/apikey.html">Technorati API key</a> for which you&#8217;ll probably need to <a href="http://technorati.com/developers/apikey.html">sign up</a>. It&#8217;s free, but is limited to 500 calls to Technorati&#8217;s database per day.</p>
<p>I&#8217;ve plugged the link into the <strong>Fetch CSV</strong> module at top left, and the API key into the <strong>Private String</strong> module. I use this module so that I can protect my API key &#8212; no-one else should be able to see it now.</p>
<p>The <strong>URL Builder</strong> module, placed within a <strong>Loop</strong> module, builds one request URL for each blog in the list. I&#8217;ve hooked the API key from the <strong>Private String</strong> into the parameter named &#8216;key&#8217;, and the output of the <strong>Fetch CSV</strong> module into the &#8216;url&#8217; parameter.</p>
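<p>For anyone who thinks better in code than in boxes and wires, here is a rough Python sketch of what the URL-building step amounts to. It is not part of the pipe: the function name <code>build_request_urls</code> and the placeholder key are my own inventions for illustration, and only the request format comes from the Technorati example above.</p>

```python
from urllib.parse import urlencode

# Request format quoted above: bloginfo?key=[apikey]&url=[blog url]
API_ENDPOINT = "http://api.technorati.com/bloginfo"


def build_request_urls(blog_urls, api_key):
    """Return one request URL per blog (hypothetical helper, not a Pipes API)."""
    return [
        # urlencode percent-escapes the blog URL so it survives as a parameter
        f"{API_ENDPOINT}?{urlencode({'key': api_key, 'url': blog_url})}"
        for blog_url in blog_urls
    ]


if __name__ == "__main__":
    # "YOUR_API_KEY" is a placeholder, not a real key
    for request_url in build_request_urls(["http://mediaczar.com/blog"], "YOUR_API_KEY"):
        print(request_url)
```

<p>The list comprehension plays the part of the <strong>Loop</strong> module, and <code>urlencode</code> plays the part of the <strong>URL Builder</strong>.</p>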
<h3>Getting the data from Technorati</h3>
<div id="attachment_323" class="wp-caption alignnone" style="width: 463px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes_-get-xml1.jpg" alt="getting the XML by calling technorati&#039;s API" title="getting the XML by calling technorati&#039;s API" width="453" height="291" class="size-full wp-image-323" /><p class="wp-caption-text">getting the XML by calling technorati's API</p></div>
<p>Now it&#8217;s simply a matter of looping through the requests that I built during the set-up stage above, and sending each one to Technorati so that I can fetch the data. Technorati responds with an XML file that looks like this:</p>
<pre>[code lang="xml"]
<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Technorati API version 1.0" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN"
"http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
    <result>
        <url>http://www.mediaczar.com/blog</url>
             <weblog>
                <name>Mediaczar</name>
                <url>http://mediaczar.com/blog</url>
                <rssurl></rssurl>
                <atomurl>http://feeds.feedburner.com/mediaczar/posts</atomurl>
                <inboundblogs>14</inboundblogs>
                <inboundlinks>21</inboundlinks>
                <lastupdate>2009-01-06 12:45:23 GMT</lastupdate>
                <rank>422833</rank>
                <authors>
                    <author>
                        <username>mediaczar</username>
                        <name>Mat Morrison</name>
                        <description>Mat Morrison is a digital marketing and
                        communications strategist with over a decade's
                        experience in online advertising, eCRM, and social
                        media.</description>
                        <url>http://technorati.com/people/technorati/mediaczar</url>
                        <photourl>http://static.technorati.com/progimages/photo.jpg?uid=139758</photourl>
                    </author>
                </authors>
             </weblog>
                <inboundblogs>14</inboundblogs>
                <inboundlinks>21</inboundlinks>
    </result>
</document>
</tapi>
[/code]
</pre>
<p>I can choose which bit of the XML file I want to receive by setting the &#8216;path to item list&#8217; parameter in the <strong>Fetch Data</strong> module. You&#8217;ll see that I&#8217;ve set it to &#8216;document.result&#8217; &#8212; compare that to the XML example above and I think you&#8217;ll see how this works: I&#8217;ve removed the wrapper information from the file and gone straight to the meat. </p>
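<p>If it helps, the &#8216;path to item list&#8217; idea can be sketched in Python using the standard library. This is only an illustration of the principle, not what Pipes does internally: the function name <code>select_items</code> is hypothetical, and the sample is a cut-down copy of the Technorati response shown above.</p>

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the Technorati response shown above
SAMPLE = """<tapi version="1.0">
<document>
    <result>
        <url>http://www.mediaczar.com/blog</url>
        <weblog>
            <name>Mediaczar</name>
            <inboundblogs>14</inboundblogs>
            <inboundlinks>21</inboundlinks>
            <rank>422833</rank>
        </weblog>
    </result>
</document>
</tapi>"""


def select_items(xml_text, path="document.result"):
    """Return the elements found at a dotted path below the root element."""
    root = ET.fromstring(xml_text)  # root is the <tapi> wrapper
    return root.findall(path.replace(".", "/"))


if __name__ == "__main__":
    for item in select_items(SAMPLE):
        # Everything outside <result> has been left behind
        print(item.findtext("weblog/name"), item.findtext("weblog/inboundblogs"))
```

<p>The dotted path skips the <code>tapi</code> and <code>document</code> wrappers, so each item that comes back is a single <code>result</code>.</p>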
<h3>Renaming the fields to suit RSS output</h3>
<p><div id="attachment_326" class="wp-caption alignnone" style="width: 657px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes_-rename-and-output.jpg" alt="renaming the fields to suit RSS output" title="renaming the fields to suit RSS output" width="647" height="398" class="size-full wp-image-326" /><p class="wp-caption-text">renaming the fields to suit RSS output</p></div><br />
Yahoo! Pipes output is limited to RSS, so we need to reformat the Technorati XML as RSS. Now Bob&#8217;s shown me how to do it, it&#8217;s pretty straightforward: RSS needs a <strong>title</strong>, <strong>link</strong>, and <strong>description</strong>. So I&#8217;m using the <strong>Rename</strong> module to do just that, and choosing the &#8216;item.inboundblogs&#8217; parameter as the description. I plug the results into the <strong>Pipe Output</strong>, and I&#8217;m ready to go. Save the pipe.</p>
<h3>Checking the output of the Pipe</h3>
<p>Now I&#8217;ve saved the pipe, I can run it to see if it&#8217;s working.<br />
<div id="attachment_325" class="wp-caption alignnone" style="width: 697px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes_-output.jpg" alt="checking the output of the pipe" title="checking the output of the pipe" width="687" height="420" class="size-full wp-image-325" /><p class="wp-caption-text">checking the output of the pipe</p></div><br />
Sure enough, there&#8217;s the Authority figure coming through nice and clearly (1). Success! But there&#8217;s something more we can do. Clicking on the <strong>More options</strong> link (2) gives us an opportunity to take the RSS feed back into other tools like a feedreader, or into NetVibes (I&#8217;m a big fan.) Or, given that we started in Google, we could take it back there&#8230;</p>
<h3>Importing the RSS feed back into the Google Spreadsheet</h3>
<p><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes_-importfeed1.jpg" alt="Google&#039;s &quot;importFeed&quot; function" title="Google&#039;s &quot;importFeed&quot; function" width="780" height="131" class="size-full wp-image-324" /><br />
Google Spreadsheets has some useful functions that you don&#8217;t find in Excel, and which are more geared towards the web. The only one that I&#8217;ve really played with is the <strong>importFeed</strong> function which &#8212; as you might expect &#8212; imports RSS feeds into your spreadsheet. So here, we paste the RSS link from the pipe into the sheet&#8230;<br />
<div id="attachment_337" class="wp-caption alignnone" style="width: 437px"><img src="http://mediaczar.com/blog/wp-content/uploads/2009/01/pipes_google-spreadsheet-w-data-1.jpg" alt="google spreadsheet showing output from importFeed function" title="google spreadsheet showing output from importFeed function" width="427" height="269" class="size-full wp-image-337" /><p class="wp-caption-text">google spreadsheet showing output from importFeed function</p></div><br />
&#8230; and it fills in our sheet for us. Now I can change the list of blogs on one sheet, and (as if by magic) the results will appear on the other, without my needing to go into Pipes. I&#8217;ve used Yahoo! Pipes to link one Google Spreadsheet to another by way of Technorati. Fun? I should say so. Useful? Most certainly.</p>
]]></content:encoded>
			<wfw:commentRss>http://mediaczar.com/blog/2009/01/technorati-authority-yahoo-pipe/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>A simple perl script to interrogate the Technorati API</title>
		<link>http://mediaczar.com/blog/2009/01/a-simple-perl-script-to-interrogate-the-technorati-api/</link>
		<comments>http://mediaczar.com/blog/2009/01/a-simple-perl-script-to-interrogate-the-technorati-api/#comments</comments>
		<pubDate>Sat, 03 Jan 2009 19:34:05 +0000</pubDate>
		<dc:creator>Mat Morrison</dc:creator>
				<category><![CDATA[blogger typology]]></category>
		<category><![CDATA[hack]]></category>
		<category><![CDATA[how to]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[bloggers]]></category>
		<category><![CDATA[blogs]]></category>
		<category><![CDATA[perl]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[technorati]]></category>

		<guid isPermaLink="false">http://mediaczar.com/blog/?p=220</guid>
		<description><![CDATA[Sometimes (for instance when I&#8217;m doing the research for the blogger typology) you need to get a whole load of Technorati data for a whole load of blogs. This research can (of course) be done by hand. And (of course) for a long list of blogs this would take a great deal of time. Handily, [...]]]></description>
			<content:encoded><![CDATA[<div class="tweetmeme_button" style="float: left; margin-right: 10px;">
			<a href="http://api.tweetmeme.com/share?url=http%3A%2F%2Fmediaczar.com%2Fblog%2F2009%2F01%2Fa-simple-perl-script-to-interrogate-the-technorati-api%2F"><br />
				<img src="http://api.tweetmeme.com/imagebutton.gif?url=http%3A%2F%2Fmediaczar.com%2Fblog%2F2009%2F01%2Fa-simple-perl-script-to-interrogate-the-technorati-api%2F&amp;source=mediaczar&amp;style=normal&amp;service=bit.ly&amp;b=2" height="61" width="50" /><br />
			</a>
		</div>
<p><a href="http://www.flickr.com/photos/porternovelli/3163377107/sizes/o/" title="Technorati API perl query in action by matmorrison, on Flickr"><img src="http://farm4.static.flickr.com/3092/3163377107_135a68325a.jpg" width="500" height="167" alt="Technorati API perl query in action" /></a></p>
<p>Sometimes (for instance when I&#8217;m doing the research for the <a href="http://mediaczar.com/blog/2008/12/your-help-needed-to-develop-blogger-typology/">blogger typology</a>) you need to get a whole load of Technorati data for a whole load of blogs.</p>
<p>This research can (of course) be done by hand. And (of course) for a long list of blogs this would take a great deal of time. Handily, Technorati provides developers with <a href="http://technorati.com/developers/api/">an API that lets you automate those queries</a>. An API (for those of you who don&#8217;t know) is an <em>Application Programming Interface</em> &#8211; a toolkit provided by a service or application (in this case by Technorati) that lets other computer applications ask it questions and use the answers for their own purposes. It may be helpful to think of APIs as being like the knobs on top of a Lego brick that let you stick other Lego on to it without in any way changing the nature of the brick itself. On the other hand it may not be so helpful after all.<br />
<span id="more-220"></span><br />
After much struggling with a Yahoo! Pipe to <a href="http://pipes.yahoo.com/mediaczar/technoratiapiauthorityquery">query the Technorati API for a list of blogs</a>, I was forced to abandon my attempt. I would have liked to have shared that Pipe with the world (if you&#8217;re good with Yahoo! Pipes, do please take a look at it and see if you can help me!) <ins datetime="2009-01-06T23:11:07+00:00">[Tuesday January 6, 2009: Thanks to help and encouragement from <a href="http://www.semdevel.com">Bob Briski</a>, this now looks like it's on its way to working!]</ins></p>
<p>Instead, I&#8217;ve written a perl script to do this. Perl isn&#8217;t as easy for people to use for themselves as Pipes, but if you are comfortable with a command prompt, then you&#8217;re half way there.</p>
<p>What this script does is take a list of blog urls, and for each item in the list queries Technorati for the following information:</p>
<ol>
<li>Blog title</li>
<li>Inbound blogs (the number of unique external blogs linking to the blog over the past six months; this is also known as &#8220;Technorati Authority&#8221;)</li>
<li>Inbound links (the total number of links into the site)</li>
<li><a href="http://technorati.com/help/faq.html#ranking">Technorati Rank</a> (a sort of overall score)</li>
</ol>
<h3>The script</h3>
<p>[code lang="perl"]#!/usr/bin/perl<br />
# use modules<br />
use LWP::Simple;<br />
use XML::Simple;<br />
# set up variables<br />
open(INFILE,  $ARGV[0]) or die "Can't open list of blogs to read: $!";<br />
$apikey='enter your Technorati API key here';<br />
# create object<br />
$xml = new XML::Simple;<br />
# read each line, and make the Technorati API call<br />
while (<INFILE>) {<br />
	chomp;<br />
	&#038;callTechnoratiAPI;<br />
}<br />
sub callTechnoratiAPI {<br />
	$url = 'http://api.technorati.com/bloginfo?format=xml&#038;key='.$apikey.'&#038;url='.$_;<br />
	# get XML file from Technorati<br />
	$content = get $url;<br />
	  die "Can't get $url" unless defined $content;<br />
	# read XML file<br />
	$data = $xml->XMLin($content);<br />
	# access XML data and print TSV to screen<br />
	# (you can fiddle with this as much<br />
	# or as little as you like)<br />
	print "\"$data->{document}->{result}->{weblog}->{name}\"\t";<br />
	print "$data->{document}->{result}->{url}\t";<br />
	print "$data->{document}->{result}->{weblog}->{inboundblogs}\t";<br />
	print "$data->{document}->{result}->{weblog}->{inboundlinks}\t";<br />
	print "$data->{document}->{result}->{weblog}->{rank}\n";<br />
}[/code]</p>
<h3>How to use it</h3>
<p>I can&#8217;t give you any real advice on how to run perl on your system. If you want to play around with it, Macs come with perl already installed; Windows users should download and install the free <a href="http://www.activestate.com/activeperl/">ActivePerl</a>. But you&#8217;ll need to install the perl module <a href="http://search.cpan.org/~grantm/XML-Simple-2.18/lib/XML/Simple.pm">XML::Simple</a>, and I don&#8217;t know where to begin telling you how to do that if you don&#8217;t already know how perl and CPAN work. You see why I wanted to use Yahoo! Pipes?</p>
<p>If all of that doesn&#8217;t bother you, you&#8217;ll also need to sign up for a Technorati account (if you&#8217;re into this sort of thing, you should <em>already</em> have an account), and get your <a href="http://technorati.com/developers/apikey.html">free API key</a>. This key will let you make 500 queries in a 24-hour period, so you&#8217;ll need to plan how you use it.</p>
<p>The script as it&#8217;s listed above outputs tab-separated values to screen like this:<br />
<code>matm% ./parse_technorati.pl bloglist.txt<br />
"Chris Gilmour's Diary Vol. 14"	http://www.illandancient.blogspot.com	6	10	861604<br />
"The Red Rocket: Technology, PR and social media marketing"	www.theredrocket.co.uk	15	29	397843<br />
"Going Underground's Blog"	http://london-underground.blogspot.com	254	467	13332<br />
</code><br />
The blog&#8217;s title and url are followed in order by the inbound blogs (authority) count, the inbound links count, and the Technorati rank. </p>
<p>I use tab-separated values because that makes it simple to cut-and-paste directly into Excel or Google Spreadsheets for further analysis.</p>
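<p>For comparison, here is a minimal Python sketch of the same &#8220;XML in, one tab-separated row out&#8221; step, using only the standard library. It is not a port of the script above: the names <code>FIELDS</code> and <code>tsv_row</code> are hypothetical, and it assumes a response shaped like the Technorati example, with the fields in the same order as the output shown.</p>

```python
import xml.etree.ElementTree as ET

# Paths below <result>, in the same order as the columns shown above
FIELDS = (
    "weblog/name",
    "url",
    "weblog/inboundblogs",
    "weblog/inboundlinks",
    "weblog/rank",
)


def tsv_row(xml_text):
    """Turn one bloginfo-style XML response into a tab-separated line."""
    result = ET.fromstring(xml_text).find("document/result")
    # findtext returns the default for a missing element and "" for an empty one
    values = [result.findtext(path, default="") for path in FIELDS]
    values[0] = f'"{values[0]}"'  # quote the blog title, as in the output above
    return "\t".join(values)


if __name__ == "__main__":
    sample = (
        "<tapi><document><result><url>http://www.mediaczar.com/blog</url>"
        "<weblog><name>Mediaczar</name><inboundblogs>14</inboundblogs>"
        "<inboundlinks>21</inboundlinks><rank>422833</rank></weblog>"
        "</result></document></tapi>"
    )
    print(tsv_row(sample))
```

<p>Pasting the printed line into a spreadsheet should split it into five columns, because the tabs are read as cell boundaries.</p>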
<h3>Known bugs</h3>
<p>Right now, the script occasionally throws out something like this:</p>
<p><code>matm% ./parse_technorati.pl bloglist.txt<br />
"Lytham Villa"	http://lythamvilla.blogspot.com/	<strong>HASH(0x8ff7a0)	HASH(0x8ff7f4)</strong>	4978471<br />
"KickTime || A Driftless Regional Webspace"	http://kicktime.org	<strong>HASH(0x908e0c)	HASH(0x908db8)</strong>	1951828</code></p>
<p>I&#8217;ll work on this, but if anyone can point me in the right direction, I&#8217;ll be most grateful.</p>
]]></content:encoded>
			<wfw:commentRss>http://mediaczar.com/blog/2009/01/a-simple-perl-script-to-interrogate-the-technorati-api/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Referring to &#8220;this cell&#8221; using Excel conditional formatting</title>
		<link>http://mediaczar.com/blog/2008/12/referring-to-this-cell-using-excel-conditional-formatting/</link>
		<comments>http://mediaczar.com/blog/2008/12/referring-to-this-cell-using-excel-conditional-formatting/#comments</comments>
		<pubDate>Sun, 28 Dec 2008 18:40:11 +0000</pubDate>
		<dc:creator>Mat Morrison</dc:creator>
				<category><![CDATA[hack]]></category>
		<category><![CDATA[how to]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[conditional formatting]]></category>
		<category><![CDATA[excel]]></category>

		<guid isPermaLink="false">http://mediaczar.com/blog/?p=202</guid>
		<description><![CDATA[Since writing this post, three simpler, better ways of solving the problem have been submitted in the comments section. Feel free to read this post, but look to the comments for the solution! If you already know about conditional formatting and navigated here via Google, please jump straight to the hack. If not, I hope [...]]]></description>
			<content:encoded><![CDATA[<div class="tweetmeme_button" style="float: left; margin-right: 10px;">
			<a href="http://api.tweetmeme.com/share?url=http%3A%2F%2Fmediaczar.com%2Fblog%2F2008%2F12%2Freferring-to-this-cell-using-excel-conditional-formatting%2F"><br />
				<img src="http://api.tweetmeme.com/imagebutton.gif?url=http%3A%2F%2Fmediaczar.com%2Fblog%2F2008%2F12%2Freferring-to-this-cell-using-excel-conditional-formatting%2F&amp;source=mediaczar&amp;style=normal&amp;service=bit.ly&amp;b=2" height="61" width="50" /><br />
			</a>
		</div>
<p><ins datetime="2009-04-23T00:21:16+00:00"><strong><em>Since writing this post, three simpler, better ways of solving the problem have been submitted in the comments section. Feel free to read this post, but look to the comments for the solution!</em></strong></ins></p>
<p><strong><i>If you already know about conditional formatting and navigated here via Google, please <a href="#excel_hack">jump straight to the hack</a>. If not, I hope the following introduction is useful. You might also like to check out the <a href="http://www.wikihow.com/Apply-Conditional-Formatting-in-Excel">WikiHow</a> introduction to conditional formatting in Excel. This post is actually concerned with an interesting hack that lets you reference the value of a cell itself when setting up a formula-based conditional formatting rule.</i></strong></p>
<h3>Conditional Formatting</h3>
<p>Excel&#8217;s <em>conditional formatting</em> feature is a boon to heavy spreadsheet users like me. It is a flexible and powerful tool that (among other things) lets me highlight data according to a set of rules so that I can easily spot the interesting bits in what would otherwise be an almost impossibly dense and meaningless cloud of numbers. Here&#8217;s an example; a table of the correlations between 32 different statements (taken from some ongoing work looking at a simple <a href="http://mediaczar.com/blog/2008/12/your-help-needed-to-develop-blogger-typology/">blogger typology</a>.)</p>
<p><a href="http://www.flickr.com/photos/porternovelli/3143864593/sizes/o/" title="Table of pairwise correlations between 32 statements by matmorrison, on Flickr"><img src="http://farm4.static.flickr.com/3109/3143864593_c1841d6743.jpg" width="500" height="196" alt="Table of pairwise correlations between 32 statements" /></a><br />
<span id="more-202"></span><br />
I&#8217;m sure you&#8217;d agree that your eyes would go squiggly if you looked for interesting data points in that mess (and that&#8217;s the <em>processed</em> table!) Instead, I use Excel&#8217;s conditional formatting to look through it for me. Here&#8217;s the same table with interesting points coloured red, orange, and yellow  (in decreasing order of interest.)</p>
<p><a href="http://www.flickr.com/photos/porternovelli/3143883761/sizes/o/" title="Table of pairwise correlation between 32 statements (with conditional formatting) by matmorrison, on Flickr"><img src="http://farm4.static.flickr.com/3242/3143883761_197d22f844.jpg" width="500" height="199" alt="Table of pairwise correlation between 32 statements (with conditional formatting)" /></a></p>
<p>Useful, eh? However, there are some very tight restrictions. Among them are:</p>
<ol>
<li>The rule must evaluate to TRUE or FALSE. Either the cell is more than x, for example, or it isn&#8217;t.</li>
<li>Like wishes, you can only apply <em>three</em> conditional-formatting rules to any cell.</li>
<li>Once Excel has evaluated a condition as TRUE it stops processing any further rules. It evaluates Condition 1, and if that&#8217;s FALSE will move on to Condition 2 and if that&#8217;s TRUE will never evaluate Condition 3 (however interesting Condition 3 may be.)</li>
</ol>
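<p>The third restriction is the one that trips people up, so here is a small Python sketch of the &#8220;first TRUE rule wins&#8221; behaviour described in the list. It models the idea only and is not Excel&#8217;s actual implementation: the function name, the thresholds and the colours are illustrative assumptions.</p>

```python
def first_matching_format(value, rules):
    """Return the format of the first rule that is TRUE, or None if none are."""
    for condition, cell_format in rules:
        if condition(value):
            return cell_format  # later rules are never evaluated
    return None


# At most three rules per cell, checked in order (illustrative values)
RULES = [
    (lambda v: v >= 0.7, "red"),
    (lambda v: v >= 0.6, "orange"),
    (lambda v: v >= 0.5, "yellow"),
]

if __name__ == "__main__":
    for value in (0.9, 0.65, 0.2):
        print(value, first_matching_format(value, RULES))
```

<p>Because evaluation stops at the first TRUE, the order matters: put the yellow rule first and nothing would ever turn orange or red.</p>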
<h3>The Problem</h3>
<p>Here&#8217;s a much reduced version of the table of correlations example I used above. I still wouldn&#8217;t want to look through a table like this very often, but it&#8217;s easier to make my point when you can read the numbers.</p>
<p><a href="http://www.flickr.com/photos/porternovelli/3143771631/sizes/o/" title="Normal Conditional Formatting by matmorrison, on Flickr"><img src="http://farm4.static.flickr.com/3091/3143771631_d0daf4f0c9.jpg" width="482" height="223" alt="Normal Conditional Formatting" /></a></p>
<p>I did this by applying the following conditional formatting rules.</p>
<p><a href="http://www.flickr.com/photos/porternovelli/3144612914/sizes/o/" title="Normal Conditional Formatting by matmorrison, on Flickr"><img src="http://farm4.static.flickr.com/3105/3144612914_9a72425994.jpg" width="500" height="367" alt="Normal Conditional Formatting" /></a></p>
<p>For the sake of this analysis, any correlation of 0.5 or over is seen to be &#8220;possibly interesting&#8221;, 0.6 or over &#8220;interesting&#8221;, and 0.7 or over &#8220;really quite interesting indeed.&#8221; (I should probably come up with a better scale before I share this stuff in future.) But as it happens, correlations can go two ways, <em>positive</em> (&#8220;people who agree with this statement tend to agree with that statement&#8221;) or <em>negative</em> (&#8220;people who agree with this statement tend to disagree with that statement&#8221;.) So I also want to highlight any correlation <em>lower than -0.5, -0.6, or -0.7</em>. You can see from the following screen grab that this isn&#8217;t working.</p>
<p><a href="http://www.flickr.com/photos/porternovelli/3144603574/sizes/o/" title="Normal Excel Conditional Formatting Showing &amp;quot;Omissions&amp;quot; by matmorrison, on Flickr"><img src="http://farm4.static.flickr.com/3196/3144603574_367c5656f4.jpg" width="482" height="223" alt="Normal Excel Conditional Formatting Showing &amp;quot;Omissions&amp;quot;" /></a></p>
<p>But I only have three rules. What am I to do?</p>
<p>As it happens, Excel lets one specify a formula as a rule rather than simply a cell value. So, for example, I could check to see if:</p>
<p><code>=OR(A1<=-0.5, A1>=0.5)</code></p>
<p>But I&#8217;d have to code the conditions for each separate cell in the table. I am notoriously lazy and therefore unlikely to do something like this, particularly for the 32&#215;32 matrix in the first example.</p>
<p>What I need to do is refer to <em>&#8216;this cell&#8217;</em> in the formula. Now in normal Excel use this would lead to recursion, a circular reference, and therefore an error. For this logical reason, Excel&#8217;s developers never saw the need to let a cell refer to itself. <em>But that&#8217;s exactly what I need to do!</em></p>
<h3><a name="excel_hack"></a>The Excel Conditional Formatting Hack</h3>
<p>After much faffing, I finally worked out a way around the problem.</p>
<p><code>=ABS(INDIRECT("R"&#038;ROW()&#038;"C"&#038;COLUMN(),FALSE))>=0.5</code></p>
<p><a href="http://www.flickr.com/photos/porternovelli/3143783915/sizes/o/" title="Hacked Conditional Formatting by matmorrison, on Flickr"><img src="http://farm4.static.flickr.com/3082/3143783915_7aab162c4a.jpg" width="500" height="369" alt="Hacked Conditional Formatting" /></a></p>
<p>Which you will see works very nicely.</p>
<p><a href="http://www.flickr.com/photos/porternovelli/3144593352/sizes/o/" title="Hacked Conditional Formatting in Excel by matmorrison, on Flickr"><img src="http://farm4.static.flickr.com/3107/3144593352_83f4b14baa.jpg" width="477" height="222" alt="Hacked Conditional Formatting in Excel" /></a></p>
<p>Let&#8217;s look at how it works (I&#8217;ve put the bit that matters in bold face.)</p>
<p><code>=ABS<strong>(INDIRECT("R"&#038;ROW()&#038;"C"&#038;COLUMN(),FALSE))</strong>>=0.5</code></p>
<p>That probably doesn&#8217;t help all that much. On the assumption that for many of you it&#8217;s the first time you&#8217;ve seen these functions, let&#8217;s pull it apart a bit.</p>
<p><code>INDIRECT</code> is a way of linking to a cell using a &#8220;reference text&#8221; &#8211; that is, a text string that <em>looks like</em> a cell reference (&#8220;A1&#8221; for example, or &#8220;R1C1&#8221;, which is the same thing &#8211; pointing to Row 1 Column 1).</p>
<p>We actually employ it like this:<br />
<code>INDIRECT(ref_text,a1)</code><br />
Where <em>ref_text</em> is the &#8220;reference text&#8221; and <em>a1</em> is a TRUE or FALSE value that tells Excel whether to expect an A1-style reference (TRUE or omitted) or an R1C1-style reference (FALSE).</p>
<p><code>ROW()</code> tells Excel to insert the value for the current Row, and&#8230;<br />
<code>COLUMN()</code> tells Excel to insert the value for the current Column, so:</p>
<p><code>INDIRECT("R"&#038;ROW()&#038;"C"&#038;COLUMN(),FALSE)</code> gives us the R1C1-style reference to the current cell. <em>This</em> cell! Hooray!</p>
<h3>These didn&#8217;t work&#8230;</h3>
<p>Incidentally, I also tried these which I think <em>should</em> work but don’t. I include them for completeness’s sake.</p>
<p><code>=ADDRESS(ROW(),COLUMN())</code></p>
<p><code>=INDIRECT(ADDRESS(ROW(),COLUMN()),FALSE)>=0.5</code></p>
<p><code>=INDIRECT(TEXT(ADDRESS(ROW(),COLUMN())),FALSE)>=0.5</code></p>
]]></content:encoded>
			<wfw:commentRss>http://mediaczar.com/blog/2008/12/referring-to-this-cell-using-excel-conditional-formatting/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>

