<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Saturnboy &#187; flexmonkey</title>
	<atom:link href="http://saturnboy.com/tag/flexmonkey/feed/" rel="self" type="application/rss+xml" />
	<link>http://saturnboy.com</link>
	<description>Code, Work, and Life</description>
	<lastBuildDate>Thu, 01 Mar 2012 22:35:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Metrics and the AIR Install Badge</title>
		<link>http://saturnboy.com/2010/05/metrics-air-install-badge/</link>
		<comments>http://saturnboy.com/2010/05/metrics-air-install-badge/#comments</comments>
		<pubDate>Wed, 19 May 2010 15:21:01 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[AIR]]></category>
		<category><![CDATA[flash]]></category>
		<category><![CDATA[flexmonkey]]></category>

		<guid isPermaLink="false">http://saturnboy.com/?p=1405</guid>
		<description><![CDATA[The AIR Install Badge is a very handy little flash application for delivering AIR applications to your users via the web. The badge allows your users to download and install both your application and the Adobe AIR runtime. Additionally, the install badge will automatically prompt users to upgrade if a previously installed version is detected. [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.adobe.com/devnet/air/articles/badge_for_air.html">AIR Install Badge</a> is a very handy little flash application for delivering AIR applications to your users via the web.  The badge allows your users to download and install both your application and the Adobe AIR runtime.  Additionally, the install badge will automatically prompt users to upgrade if a previously installed version is detected.  At <a href="http://www.gorillalogic.com/">Gorilla Logic</a>, we use the AIR Install Badge on the <a href="http://www.gorillalogic.com/flexmonkey/download">FlexMonkey download page</a> (free registration required).</p>
<p>Alas, flash is opaque to analytics.  We have no idea what our users are doing inside the AIR Install Badge application.  Are they installing? Or upgrading?  No problem, we just need to write some code&#8230;</p>
<h3>The Code</h3>
<p>Using flash&#8217;s <a href="http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/external/ExternalInterface.html">ExternalInterface</a>, we can manually push the data out of flash and into javascript.  Once we have the data in javascript, we have total control.  One option is to <a href="http://blog.creoff.net/using-google-analytics-with-adobe-air-install-badge/">use google analytics to store our badge data</a>.  In the case of FlexMonkey, we send the badge data along with the user&#8217;s credentials to our CRM platform, <a href="http://www.salesforce.com/">SalesForce.com</a>.</p>
<p class="bottom"><b>Step 1:</b> First, open <code>AIRInstallBadge.as</code> and add this to the top:</p>

<div class="wp_syntax"><div class="code"><pre class="actionscript" style="font-family:monospace;"><span style="color: #0066CC;">import</span> flash.<span style="color: #006600;">external</span>.<span style="color: #006600;">ExternalInterface</span>;</pre></div></div>

<p class="bottom"><b>Step 2:</b> Next, add the <code>ExternalInterface</code> call to the top of the <code>handleActionClick()</code> function in <code>AIRInstallBadge.as</code>:</p>

<div class="wp_syntax"><div class="code"><pre class="actionscript" style="font-family:monospace;">protected <span style="color: #000000; font-weight: bold;">function</span> handleActionClick<span style="color: #66cc66;">&#40;</span>evt:MouseEvent<span style="color: #66cc66;">&#41;</span>:<span style="color: #0066CC;">void</span> <span style="color: #66cc66;">&#123;</span>
    <span style="color: #b1b100;">if</span> <span style="color: #66cc66;">&#40;</span>action == <span style="color: #ff0000;">'install'</span> <span style="color: #66cc66;">||</span> action == <span style="color: #ff0000;">'upgrade'</span><span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
        <span style="color: #808080; font-style: italic;">//send data to js</span>
        ExternalInterface.<span style="color: #0066CC;">call</span><span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'badgeJS'</span>,action<span style="color: #66cc66;">&#41;</span>;
    <span style="color: #66cc66;">&#125;</span>
    ...
<span style="color: #66cc66;">&#125;</span></pre></div></div>

<p>Since I only care about the <code>install</code> or <code>upgrade</code> actions, I&#8217;ll only send those out to javascript.  Re-compile the badge and deploy.</p>
<p class="bottom"><b>Step 3:</b> Last, add the <code>badgeJS()</code> javascript callback to the page containing the badge and do whatever you want with the incoming badge data:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #003366; font-weight: bold;">function</span> badgeJS<span style="color: #009900;">&#40;</span>action<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #006600; font-style: italic;">//do metrics here...</span>
    <span style="color: #000066;">alert</span><span style="color: #009900;">&#40;</span><span style="color: #3366CC;">'badge action='</span> <span style="color: #339933;">+</span> action<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>
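
<p class="bottom">As a rough sketch of the google analytics option, and assuming the classic <code>ga.js</code> async snippet (which defines the <code>_gaq</code> queue) is already on the page, the callback might look something like this.  The <code>BADGE_ACTIONS</code> whitelist and the <code>'AIR Install Badge'</code> event category are made-up names for illustration, not part of the badge:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;">//assumption: the ga.js async snippet is on the page; until it loads,
//_gaq is just a plain array that queues up commands
var _gaq = _gaq || [];
//hypothetical whitelist of the actions the badge sends out
var BADGE_ACTIONS = { install: true, upgrade: true };
function badgeJS(action) {
    //ignore anything that is not a known badge action
    if (BADGE_ACTIONS[action] !== true) {
        return false;
    }
    //queue a google analytics event: category first, then action
    _gaq.push(['_trackEvent', 'AIR Install Badge', action]);
    return true;
}</pre></div></div>

<p>The return value is optional.  <code>ExternalInterface.call()</code> hands it back to flash, but the badge code above simply ignores it.</p>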

<h3>Conclusion</h3>
<p>With an hour of effort, and a very small amount of code, we&#8217;ve managed to get the useful metrics of installs and upgrades out of the AIR Install Badge and into our analytics engine of choice.  A job well done.</p>
]]></content:encoded>
			<wfw:commentRss>http://saturnboy.com/2010/05/metrics-air-install-badge/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Scraping Google Groups</title>
		<link>http://saturnboy.com/2010/03/scraping-google-groups/</link>
		<comments>http://saturnboy.com/2010/03/scraping-google-groups/#comments</comments>
		<pubDate>Mon, 29 Mar 2010 03:31:31 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[flexmonkey]]></category>
		<category><![CDATA[php]]></category>

		<guid isPermaLink="false">http://saturnboy.com/?p=1105</guid>
		<description><![CDATA[When we launched the new and improved Gorilla Logic website, we decided to bring all our open source projects together under one roof. In order to migrate all things FlexMonkey back to our website, we need to get our forum data migrated out of Google Groups. Alas, Google doesn&#8217;t provide any way to export data [...]]]></description>
			<content:encoded><![CDATA[<p>When we launched the new and improved <a href="http://www.gorillalogic.com/">Gorilla Logic</a> website, we decided to bring all our open source projects together under one roof.  In order to migrate all things <a href="http://www.gorillalogic.com/flexmonkey">FlexMonkey</a> back to our website, we needed to migrate our forum data out of Google Groups.  Alas, Google doesn&#8217;t provide any way to export data from Google Groups.  The only way to preserve the amazing contributions from the FlexMonkey community was to scrape Google Groups.  So that&#8217;s just what we did.</p>
<p>With a very minimal amount of PHP, I was able to walk the entire FlexMonkey Google Group and scrape all the topics (aka threads) and all the posts inside each thread.  The first step was to build a generic scraper class that grabs an html page (using <a href="http://curl.haxx.se/">cURL</a>) and parses out all unique outbound links.</p>
<p class="bottom">Here&#8217;s the code for the <code>Scraper</code> class:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">class</span> Scraper <span style="color: #009900;">&#123;</span>
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000088;">$url</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000088;">$html</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000088;">$links</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> __construct<span style="color: #009900;">&#40;</span><span style="color: #000088;">$url</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">url</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$url</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> run<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">html</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">''</span><span style="color: #339933;">;</span>
        <span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">links</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #666666; font-style: italic;">//scrape url &amp; store html</span>
        <span style="color: #000088;">$ch</span> <span style="color: #339933;">=</span> <span style="color: #990000;">curl_init</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #990000;">curl_setopt</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$ch</span><span style="color: #339933;">,</span> CURLOPT_URL<span style="color: #339933;">,</span> <span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">url</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #990000;">curl_setopt</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$ch</span><span style="color: #339933;">,</span> CURLOPT_HEADER<span style="color: #339933;">,</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #990000;">curl_setopt</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$ch</span><span style="color: #339933;">,</span> CURLOPT_RETURNTRANSFER<span style="color: #339933;">,</span> <span style="color: #009900; font-weight: bold;">true</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">html</span> <span style="color: #339933;">=</span> <span style="color: #990000;">curl_exec</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$ch</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #990000;">curl_close</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$ch</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #666666; font-style: italic;">//parse html for all links</span>
        <span style="color: #000088;">$matches</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #990000;">preg_match_all</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'#&lt;a.*?href\s*=\s*&quot;(.*?)&quot;.*?&gt;(.*?)&lt;/a&gt;#i'</span><span style="color: #339933;">,</span> <span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">html</span><span style="color: #339933;">,</span> <span style="color: #000088;">$matches</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
        <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$matches</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">==</span> <span style="color: #cc66cc;">3</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #b1b100;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000088;">$i</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">0</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span> <span style="color: #339933;">&lt;</span> <span style="color: #990000;">count</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$matches</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> <span style="color: #000088;">$i</span><span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                <span style="color: #000088;">$href</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$matches</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
                <span style="color: #000088;">$val</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$matches</span><span style="color: #009900;">&#91;</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$i</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
&nbsp;
                <span style="color: #666666; font-style: italic;">//unique links</span>
                <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #339933;">!</span><span style="color: #990000;">array_key_exists</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$href</span><span style="color: #339933;">,</span> <span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">links</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
                    <span style="color: #000088;">$this</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">links</span><span style="color: #009900;">&#91;</span><span style="color: #000088;">$href</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$val</span><span style="color: #339933;">;</span>
                <span style="color: #009900;">&#125;</span>
            <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>In the <code>run()</code> method, cURL is used to grab the html.  Next, a regular expression is used to match all outbound links.  The links are stored in a hash keyed by url, which keeps only the first link seen for each unique url.</p>
<p class="bottom">Built on top of the generic <code>Scraper</code> class is a specialized Google Groups scraper class, aptly named <code>GoogleGroupsScraper</code>.  For a given Google Group, the url of the main page (containing a list of most recent topics) is:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">http://groups.google.com/group/[GROUP]/topics</pre></div></div>

<p class="bottom">And the url of a single topic (aka thread) is:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">http://groups.google.com/group/[GROUP]/browse_thread/thread/[THREAD_ID]#</pre></div></div>

<p>Where <code>[GROUP]</code> is the name of the Google Group, and <code>[THREAD_ID]</code> is some alphanumeric id.  Most importantly, at the bottom of the main page is an <u>Older &raquo;</u> link that points to the next page of topics.  The <code>GoogleGroupsScraper</code> exploits this to spider the entire group, recording topic title and topic url as it walks each page.</p>
<p>Next, each individual topic page is scraped by the <code>GoogleGroupsTopicScraper</code> class and parsed into a list of posts with author name, date, timestamp, etc.  The topic scraper uses various regular expressions to massage the html and extract the different parts of each post.  In particular, the post body needs a lot of work to strip out any Google Groups specific links and code.</p>
<p>Lastly, the topics and their posts are assembled into an XML document with a nice big CDATA block around the post body to preserve the html content.</p>
<p class="bottom">Here&#8217;s some sample output from the scraper:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;?xml</span> <span style="color: #000066;">version</span>=<span style="color: #ff0000;">&quot;1.0&quot;</span> <span style="color: #000066;">encoding</span>=<span style="color: #ff0000;">&quot;UTF-8&quot;</span><span style="color: #000000; font-weight: bold;">?&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;scrape</span> <span style="color: #000066;">group</span>=<span style="color: #ff0000;">&quot;flexmonkey&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;topic<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;title<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>FlexMonkey User Group is now located at www.gorillalogic.com/flexmonkey!<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/title<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;link<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>http://groups.google.com/group/flexmonkey/browse_thread/thread/fe9ed66bf56db88e#<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/link<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;posts<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;post</span> <span style="color: #000066;">idx</span>=<span style="color: #ff0000;">&quot;0&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;author<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Stu<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/author<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;email<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>stu.st...@gorillalogic.com<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/email<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;date<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>February 10, 2010 21:17:52 UTC<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/date<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;timestamp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>1265836672<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/timestamp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;body<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #339933;">&lt;![CDATA[</span>
<span style="color: #339933;">&lt;p&gt;People of FlexMonkey, &lt;p&gt;We have migrated the FlexMonkey discussion forum to &lt;a href=&quot;http://www.gorillalogic.com/flexmonkey&quot;&gt;http://www.gorillalogic.com/flexmonkey&lt;/a&gt;. Please note that you will need to re-subscribe to the new forum to continue receiving FlexMonkey discussion messages. &lt;p&gt;-Stu &lt;br&gt;</span>
<span style="color: #339933;">]]&gt;</span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/body<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/post<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/posts<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/topic<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;topic<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;title<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Record button clicks based on Ids instead of names?<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/title<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;link<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>http://groups.google.com/group/flexmonkey/browse_thread/thread/4f079b1959374f53#<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/link<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;posts<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;post</span> <span style="color: #000066;">idx</span>=<span style="color: #ff0000;">&quot;0&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;author<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Shilpa<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/author<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;email<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>shilpa.g...@gmail.com<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/email<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;date<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>February 9, 2010 23:44:44 UTC<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/date<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;timestamp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>1265759084<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/timestamp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;body<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>...<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/body<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/post<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;post</span> <span style="color: #000066;">idx</span>=<span style="color: #ff0000;">&quot;1&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;author<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Shilpa<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/author<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;email<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>shilpa.g...@gmail.com<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/email<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;date<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>February 10, 2010 00:05:44 UTC<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/date<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;timestamp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>1265760344<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/timestamp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;body<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>...<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/body<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/post<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;post</span> <span style="color: #000066;">idx</span>=<span style="color: #ff0000;">&quot;2&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;author<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Gokuldas K Pillai<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/author<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;email<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>gokul...@gmail.com<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/email<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;date<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>February 10, 2010 00:16:34 UTC<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/date<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;timestamp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>1265760994<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/timestamp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;body<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>...<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/body<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/post<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;post</span> <span style="color: #000066;">idx</span>=<span style="color: #ff0000;">&quot;3&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;author<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Shilpa<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/author<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;email<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>shilpa.g...@gmail.com<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/email<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;date<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>February 10, 2010 01:18:42 UTC<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/date<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;timestamp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>1265764722<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/timestamp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;body<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>...<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/body<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/post<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
...</pre></div></div>

<p class="bottom">Finally, there is a very simple PHP driver for the scraper that runs the scraping process:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #b1b100;">require_once</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'GoogleGroupsScraper.class.php'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$scraper</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> GoogleGroupsScraper<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'[GROUP]'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$scraper</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">run</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #b1b100;">print</span> <span style="color: #000088;">$scraper</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">getXML</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p class="bottom">And you run it as usual:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">php scrape.php &gt; output.xml</pre></div></div>

<p>Just enter the name of the Google Group you wish to scrape, and away you go.  Here are a couple of notes to help you along:</p>
<ol>
<li><code>[GROUP]</code> is the group name as it appears in the url, so no spaces, etc.</li>
<li>It&#8217;s not fast, so be patient, or modify the scraper code to generate some intermediate output.</li>
<li>Via a browser, Google Groups displays 30 topics per page, but via PHP &amp; cURL you only get 10.  Probably some Cookie or User Agent magic going on.</li>
<li>Not much error handling.   The error handling that exists isn&#8217;t very good. It will break.</li>
<li>Good luck!</li>
</ol>
<p>Please download the code and use it however you wish.  Hopefully, putting the code online and writing this post will save someone else some time when migrating data off Google Groups.</p>
<h5>Files</h5>
<ul>
<li><a href="http://saturnboy.com/proj/gorilla/scraper/GoogleGroupsScraper.tgz">GoogleGroupsScraper.tgz</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://saturnboy.com/2010/03/scraping-google-groups/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Testing async services with Fluint</title>
		<link>http://saturnboy.com/2009/03/fluint-async-testing/</link>
		<comments>http://saturnboy.com/2009/03/fluint-async-testing/#comments</comments>
		<pubDate>Mon, 23 Mar 2009 04:58:45 +0000</pubDate>
		<dc:creator>justin</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[async]]></category>
		<category><![CDATA[flex]]></category>
		<category><![CDATA[flexmonkey]]></category>
		<category><![CDATA[fluint]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://saturnboy.com/?p=307</guid>
		<description><![CDATA[We&#8217;re pretty big on testing at Gorilla Logic, and in the world of Flex that usually means using FlexMonkey to test the UI and using FlexUnit to test the code. Alas, it is a huge pain in the ass to correctly test the many async objects and services inherent in any Flex app with FlexUnit. [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;re pretty big on testing at <a href="http://www.gorillalogic.com/">Gorilla Logic</a>, and in the world of Flex that usually means using <a href="http://code.google.com/p/flexmonkey/">FlexMonkey</a> to test the UI and using <a href="http://opensource.adobe.com/wiki/display/flexunit/Flexunit">FlexUnit</a> to test the code.  Alas, it is a huge pain in the ass to correctly test the many async objects and services inherent in any Flex app with FlexUnit.  Enter <a href="http://code.google.com/p/fluint/">Fluint</a>, a superior Flex unit testing framework by the cool guys at <a href="http://www.digitalprimates.net/">digital primates</a> (no relation).  Fluint is the heir apparent to take over the unit testing crown from the venerable FlexUnit.  So let&#8217;s take Fluint and its enhanced async testing support for a spin.</p>
<h5>Service Layer</h5>
<p class="bottom">First, assume we have a nice service layer in Flex that talks asynchronously to our backend.  Just something simple to start:</p>

<div class="wp_syntax"><div class="code"><pre class="actionscript3" style="font-family:monospace;"><span style="color: #0033ff; font-weight: bold;">public</span> <span style="color: #9900cc; font-weight: bold;">class</span> MyService <span style="color: #000000;">&#123;</span>
    <span style="color: #0033ff; font-weight: bold;">public</span> <span style="color: #339966; font-weight: bold;">function</span> getSomething<span style="color: #000000;">&#40;</span>result<span style="color: #000066; font-weight: bold;">:</span><span style="color: #004993;">Function</span><span style="color: #000066; font-weight: bold;">,</span> fault<span style="color: #000066; font-weight: bold;">:</span><span style="color: #004993;">Function</span><span style="color: #000000;">&#41;</span><span style="color: #000066; font-weight: bold;">:</span>AsyncToken <span style="color: #000000;">&#123;</span>
        <span style="color: #009900; font-style: italic;">//call the backend</span>
        <span style="color: #6699cc; font-weight: bold;">var</span> token<span style="color: #000066; font-weight: bold;">:</span>AsyncToken = backend<span style="color: #000066; font-weight: bold;">.</span>getSomething<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><span style="color: #000066; font-weight: bold;">;</span>
&nbsp;
        <span style="color: #009900; font-style: italic;">//wire the callbacks to the result</span>
        token<span style="color: #000066; font-weight: bold;">.</span>addResponder<span style="color: #000000;">&#40;</span><span style="color: #0033ff; font-weight: bold;">new</span> AsyncResponder<span style="color: #000000;">&#40;</span>result<span style="color: #000066; font-weight: bold;">,</span> fault<span style="color: #000066; font-weight: bold;">,</span> token<span style="color: #000000;">&#41;</span><span style="color: #000000;">&#41;</span><span style="color: #000066; font-weight: bold;">;</span>
&nbsp;
        <span style="color: #0033ff; font-weight: bold;">return</span> token<span style="color: #000066; font-weight: bold;">;</span>
    <span style="color: #000000;">&#125;</span>
<span style="color: #000000;">&#125;</span></pre></div></div>

<p>In this example, our service only has one method, <code>getSomething()</code>, that takes two callback functions.  It simply calls the backend method, wires up the callbacks (which get called when the backend method returns a result or a fault), and returns the token.  It is <b>absolutely critical</b> that our callback-powered service method return the <code>AsyncToken</code>.  The reason for this will become apparent.</p>
<p class="bottom">We might use our service like this:</p>

<div class="wp_syntax"><div class="code"><pre class="mxml" style="font-family:monospace;"><span style="color: #000000;">&lt;?xml version=<span style="color: #ff0000;">&quot;1.0&quot;</span> encoding=<span style="color: #ff0000;">&quot;utf-8&quot;</span>?<span style="color: #7400FF;">&gt;</span></span>
<span style="color: #000000;"><span style="color: #7400FF;">&lt;mx:Application</span></span>
<span style="color: #000000;">        xmlns:mx=<span style="color: #ff0000;">&quot;http://www.adobe.com/2006/mxml&quot;</span></span>
<span style="color: #000000;">        creationComplete=<span style="color: #ff0000;">&quot;complete()&quot;</span><span style="color: #7400FF;">&gt;</span></span>
&nbsp;
    <span style="color: #339933;">&lt;mx:Script&gt;</span>
<span style="color: #339933;">    &lt;![CDATA[</span>
<span style="color: #339933;">        import com.saturnboy.services.MyService;</span>
<span style="color: #339933;">        private var service:MyService;</span>
&nbsp;
<span style="color: #339933;">        private function complete():void {</span>
<span style="color: #339933;">            service = new MyService();</span>
<span style="color: #339933;">            service.getSomething(resultHandler, faultHandler);</span>
<span style="color: #339933;">        }</span>
&nbsp;
<span style="color: #339933;">        private function resultHandler(result:Object, token:Object=null):void {</span>
<span style="color: #339933;">            lbl.text = result.result.name;</span>
<span style="color: #339933;">        }</span>
&nbsp;
<span style="color: #339933;">        public function faultHandler(error:Object, token:Object=null):void {</span>
<span style="color: #339933;">            lbl.text = 'fault';</span>
<span style="color: #339933;">        }</span>
<span style="color: #339933;">    ]]&gt;</span>
<span style="color: #339933;">    &lt;/mx:Script&gt;</span>
&nbsp;
    <span style="color: #000000;"><span style="color: #7400FF;">&lt;mx:Label</span> id=<span style="color: #ff0000;">&quot;lbl&quot;</span> text=<span style="color: #ff0000;">&quot;initial&quot;</span> <span style="color: #7400FF;">/&gt;</span></span>
<span style="color: #000000;"><span style="color: #7400FF;">&lt;/mx:Application</span><span style="color: #7400FF;">&gt;</span></span></pre></div></div>

<p>We make a call to our service, and then use the callbacks to alter the UI however we want depending on the result.  In common usage, the fact that our service returns an <code>AsyncToken</code> is worthless; it might as well return <code>void</code>.  So, why did I say this is critical?  Throw Fluint testing into the mix and it&#8217;s &#8220;Show em what&#8217;s behind door number 2, Johnny!&#8221;</p>
<h5>Fluint Testing</h5>
<p class="bottom">Fluint provides two different async wrapper methods: <code>asyncHandler</code> and <code>asyncResponder</code>.  The first allows a test to be wired to an async method by events, the second allows a test to be wired to an async method by a responder.  Since the service method we&#8217;re trying to test doesn&#8217;t dispatch any events, we&#8217;ll need to use the latter.  So inside a Fluint test case, we have our test method:</p>

<div class="wp_syntax"><div class="code"><pre class="actionscript3" style="font-family:monospace;"><span style="color: #0033ff; font-weight: bold;">public</span> <span style="color: #339966; font-weight: bold;">function</span> testGetSomething<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><span style="color: #000066; font-weight: bold;">:</span><span style="color: #0033ff; font-weight: bold;">void</span> <span style="color: #000000;">&#123;</span>
    <span style="color: #009900; font-style: italic;">//call service with dummy callback</span>
    <span style="color: #6699cc; font-weight: bold;">var</span> token<span style="color: #000066; font-weight: bold;">:</span>AsyncToken = service<span style="color: #000066; font-weight: bold;">.</span>getSomething<span style="color: #000000;">&#40;</span>dummyResult<span style="color: #000066; font-weight: bold;">,</span> dummyFault<span style="color: #000000;">&#41;</span><span style="color: #000066; font-weight: bold;">;</span>
&nbsp;
    <span style="color: #009900; font-style: italic;">//create async test responder</span>
    <span style="color: #6699cc; font-weight: bold;">var</span> responder<span style="color: #000066; font-weight: bold;">:</span>IResponder = asyncResponder<span style="color: #000000;">&#40;</span>
            <span style="color: #0033ff; font-weight: bold;">new</span> TestResponder<span style="color: #000000;">&#40;</span>testHandler<span style="color: #000066; font-weight: bold;">,</span> faultHandler<span style="color: #000000;">&#41;</span><span style="color: #000066; font-weight: bold;">,</span> <span style="color: #000000; font-weight:bold;">1000</span><span style="color: #000066; font-weight: bold;">,</span> token<span style="color: #000000;">&#41;</span><span style="color: #000066; font-weight: bold;">;</span>
&nbsp;
    <span style="color: #009900; font-style: italic;">//wire test responder as 2nd callback</span>
    token<span style="color: #000066; font-weight: bold;">.</span>addResponder<span style="color: #000000;">&#40;</span>responder<span style="color: #000000;">&#41;</span><span style="color: #000066; font-weight: bold;">;</span>
<span style="color: #000000;">&#125;</span>
<span style="color: #0033ff; font-weight: bold;">private</span> <span style="color: #339966; font-weight: bold;">function</span> testHandler<span style="color: #000000;">&#40;</span>result<span style="color: #000066; font-weight: bold;">:</span><span style="color: #004993;">Object</span><span style="color: #000066; font-weight: bold;">,</span> passThroughData<span style="color: #000066; font-weight: bold;">:</span><span style="color: #004993;">Object</span><span style="color: #000000;">&#41;</span><span style="color: #000066; font-weight: bold;">:</span><span style="color: #0033ff; font-weight: bold;">void</span> <span style="color: #000000;">&#123;</span>
    assertEquals<span style="color: #000000;">&#40;</span><span style="color: #990000;">'something'</span><span style="color: #000066; font-weight: bold;">,</span> result<span style="color: #000066; font-weight: bold;">.</span>result<span style="color: #000066; font-weight: bold;">.</span><span style="color: #004993;">name</span><span style="color: #000000;">&#41;</span><span style="color: #000066; font-weight: bold;">;</span>
<span style="color: #000000;">&#125;</span></pre></div></div>

<p>The trick is to wire a second callback via Fluint&#8217;s <code>asyncResponder</code> helper that actually does the testing, and just give the original service call some dummy callbacks.  Note that if the service method didn&#8217;t return its <code>AsyncToken</code> there would be no way to wire a second callback.  The Fluint async helpers do two important operations: they handle the event or call the callback AND they mark the test method as an async method so the result is correctly reported by the test harness.  You can read more about <a href="http://code.google.com/p/fluint/wiki/AsyncTest">Async Testing</a> in Fluint&#8217;s wiki.  The rest of Fluint is your standard chain of crap borrowed from JUnit: test runner, test suites, and test cases.</p>
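<p class="bottom">For comparison, here is a minimal sketch of the event-based wrapper, <code>asyncHandler</code>, which the service above doesn&#8217;t need.  The <code>TimerTest</code> class, its timer, and the handler names are made up for illustration, and the argument order (handler, timeout in milliseconds, pass-through data, timeout handler) is assumed from the wiki page, so check it against your Fluint version:</p>

<div class="wp_syntax"><div class="code"><pre class="actionscript3" style="font-family:monospace;">package {
    import flash.events.TimerEvent;
    import flash.utils.Timer;
    import net.digitalprimates.fluint.tests.TestCase;
&nbsp;
    //hypothetical test case, not part of the original project
    public class TimerTest extends TestCase {
        private var timer:Timer;
&nbsp;
        public function testTimerCompletes():void {
            //fire once after 100ms
            timer = new Timer(100, 1);
&nbsp;
            //wrap the real handler so Fluint waits up to 500ms for the event
            timer.addEventListener(TimerEvent.TIMER_COMPLETE,
                    asyncHandler(timerHandler, 500, null, timeoutHandler));
            timer.start();
        }
&nbsp;
        //called by Fluint when the event arrives in time
        private function timerHandler(event:TimerEvent, passThroughData:Object):void {
            assertEquals(1, timer.currentCount);
        }
&nbsp;
        //called by Fluint if the event never arrives
        private function timeoutHandler(passThroughData:Object):void {
            fail('timer did not complete before the timeout');
        }
    }
}</pre></div></div>

<p>The shape is the same as the responder version: the test method returns right away, and Fluint holds the test open until the wrapped handler runs or the timeout expires.</p>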
<blockquote class="deeper"><p><b>Digging Deeper:</b> It is equally critical to pass dummy callbacks to the original service method call and keep the assertions in the Fluint-wrapped responder.  An assertion that fails inside the original callbacks will cause Flash Player to error out instead of being caught by Fluint and reported as a test failure.</p></blockquote>
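<p class="bottom">The test method references <code>service</code>, <code>dummyResult</code>, <code>dummyFault</code>, and <code>faultHandler</code> without showing them.  Here is a minimal sketch of what the surrounding test case might look like.  The class name <code>MyServiceTest</code> and the method bodies are assumptions for illustration, not necessarily what is in the repository:</p>

<div class="wp_syntax"><div class="code"><pre class="actionscript3" style="font-family:monospace;">package {
    import com.saturnboy.services.MyService;
    import net.digitalprimates.fluint.tests.TestCase;
&nbsp;
    //hypothetical test case wrapping the test method from the post
    public class MyServiceTest extends TestCase {
        private var service:MyService;
&nbsp;
        //fresh service for every test method
        override protected function setUp():void {
            service = new MyService();
        }
&nbsp;
        //testGetSomething() and testHandler() from the post go here
&nbsp;
        //dummy callbacks for the service call: deliberately empty, no assertions
        private function dummyResult(result:Object, token:Object=null):void {}
        private function dummyFault(error:Object, token:Object=null):void {}
&nbsp;
        //fault side of the TestResponder: report the fault as a test failure
        private function faultHandler(error:Object, passThroughData:Object):void {
            fail('getSomething() returned a fault');
        }
    }
}</pre></div></div>

<p>Because only <code>testHandler</code> and <code>faultHandler</code> are wrapped by <code>asyncResponder</code>, any assertion or <code>fail()</code> call lives where Fluint can catch it.</p>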
<h5>Files</h5>
<p>The complete code is up on <a href="http://www.github.com/">GitHub</a> here: <a href="http://github.com/saturnboy/test_fluint_async/tree/master">test_fluint_async</a>.  The code is MIT licensed and includes a working fluint.swc (see below) plus a mock async backend (so timeouts and faults are easily testable).</p>
<p>Alas, Fluint v1.1.0 was built incorrectly and is missing the <code>TestResponder</code> class (see <a href="http://code.google.com/p/fluint/issues/detail?id=35&#038;can=1">issue 35</a>).  So if you want to try out Fluint in your project, I recommend you grab it from svn and build the swc yourself.  Hopefully, this will all be fixed in the next release.</p>
<p><strong>UPDATE:</strong> Fluint v1.1.1 was released on May 1, 2009 and fixes this issue and a few others.  Download it <a href="http://code.google.com/p/fluint/downloads/list">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://saturnboy.com/2009/03/fluint-async-testing/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

