<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
  <title>Kevin McFadden on scriptogr.am</title>
  <link>http://resullus.org</link>
  <description>The writings of an imperfect perfectionist</description>
  <pubDate>2013</pubDate>
 
  <item>
    <title>Learnings from Week 13 and 14 of 2012</title>
    <pubDate>Mon, 26 Mar 2012 00:00:00 -0400</pubDate>
    <link>http://resullus.org/post/learnings-from-week-13-and-14-of-2012</link>
    <guid>http://resullus.org/post/learnings-from-week-13-and-14-of-2012</guid>     
    <description><![CDATA[<h2>Learnings</h2>

<ul>
<li><a href="http://stackoverflow.com/a/8293786/144594">Problem using OpenStruct with ERB</a> - Ruby 1.9 ERB handles binding variables slightly differently.  Compare:

<ul>
<li>(1.9) <code>ERB.new(tmpl).result(vars.instance_eval {binding})</code></li>
<li>(1.8) <code>ERB.new(tmpl).result(vars.send(:binding))</code></li>
</ul></li>
<li><a href="https://github.com/marcel/aws-s3">marcel/aws-s3</a> - Excellent Ruby library for interacting with Amazon's S3.</li>
<li><a href="http://speakerdeck.com/u/old_sound/p/messaging-patterns-with-rabbitmq">Messaging Patterns by Álvaro Videla</a>: Informative overview of RabbitMQ.</li>
</ul>

<h2>Processing Map Reduce Output by Line</h2>

<p>I needed to consume Amazon Elastic Map Reduce (EMR) output and turn it into a format compatible with mysqlimport -- pipe-delimited in my case.  What started out as a quick hack turned into something kind of nice.  The key benefit of this script is that it will process your EMR result files by streaming directly from S3, saving you the hassle of copying, processing, and purging!  With a decent network connection (~800KB/s) I was able to parse 26 files of ~7.8MB each in a few minutes.</p>

<p>Note: the script probably requires MRI 1.9.3 or a compatible Ruby.</p>

<script src="https://gist.github.com/2305909.js"> </script>

<h2>Definitions: RabbitMQ</h2>

<p>One of the hardest parts of learning a new domain is learning the new language.  By writing these up, I hope they will stick in my head longer...</p>

<ul>
<li>producer: creates messages 

<ul>
<li>messages are objects</li>
</ul></li>
<li>consumer: receives and processes messages

<ul>
<li>since messages are objects, it's up to you to do the right thing</li>
</ul></li>
<li>queue: holding cell for messages waiting to be processed

<ul>
<li>FIFO</li>
<li>named, e.g., "test-queue"</li>
</ul></li>
<li>exchange: receives messages from producers and routes them to queues

<ul>
<li>You can have a named exchange, or use a default one specified by an empty string.</li>
<li>types

<ul>
<li>fanout: sends the message to all queues registered w/ the exchange</li>
<li>direct: sends the message to the queues whose binding key exactly matches the message's routing key</li>
<li>topic: sends the message to the queues whose binding pattern, which may include wildcards, matches the message's routing key</li>
</ul></li>
</ul></li>
</ul>
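
<p>To make the topic wildcards concrete, here is a minimal Ruby sketch of the matching rule.  It assumes the AMQP convention that <code>*</code> stands for exactly one dot-separated word and <code>#</code> for zero or more words.  <code>topic_match?</code> and <code>match_words</code> are hypothetical helpers for illustration, not part of RabbitMQ or any client library.</p>

<pre class="prettyprint"><code># Returns true when a topic binding pattern matches a routing key.
# Assumes AMQP-style wildcards: '*' is one word, '#' is zero or more words.
def topic_match?(pattern, routing_key)
  match_words(pattern.split('.'), routing_key.split('.'))
end

# Compares the pattern and the routing key one dot-separated word at a time.
def match_words(pattern, words)
  return words.empty? if pattern.empty?

  head, *rest = pattern
  if head == '#'
    # '#' may absorb zero or more words, so try every possible split.
    return (0..words.length).any? { |n| match_words(rest, words.drop(n)) }
  end
  return false if words.empty?
  return false unless head == '*' || head == words.first

  match_words(rest, words.drop(1))
end
</code></pre>

<p>With this sketch, a pattern such as <code>logs.*</code> matches <code>logs.error</code> but not <code>logs.error.db</code>, while <code>logs.#</code> matches both.</p>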
]]></description>
  </item>
 
  <item>
    <title>Learnings from Week 12 of 2012</title>
    <pubDate>Mon, 19 Mar 2012 00:00:00 -0400</pubDate>
    <link>http://resullus.org/post/learnings-from-week-12-of-2012</link>
    <guid>http://resullus.org/post/learnings-from-week-12-of-2012</guid>     
    <description><![CDATA[<ul>
<li><a href="http://www.cloudera.com/blog/2011/01/map-reduce-with-ruby-using-apache-hadoop/">Map-Reduce With Ruby Using Apache Hadoop</a> - Good place to start kicking the tires, but start about halfway down the page w/ the Ruby related content.</li>
<li><a href="http://help.papertrailapp.com/kb/analytics/log-analytics-with-hadoop-and-hive">Log analytics with Hadoop and Hive</a></li>
<li><a href="http://www.kickasslabs.com/2009/01/04/hadoop-streaming-for-rapid-prototyping-of-distributed-algorithms/">Hadoop Streaming for Rapid Prototyping of Distributed Algorithms</a> - Oldie, but goodie.</li>
<li><a href="http://stackoverflow.com/questions/273262/best-practices-with-stdin-in-ruby">Best Practices with STDIN in Ruby?</a> - Use ARGF instead of STDIN because ARGF will handle both STDIN and named files.</li>
</ul>
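
<p>A minimal sketch of the ARGF advice from the last link, assuming tab-separated input with a number in the second column.  <code>sum_second_column</code> is a hypothetical helper name; it accepts anything that responds to <code>each_line</code>, which is what makes it easy to feed it ARGF.</p>

<pre class="prettyprint"><code># Sums the second tab-separated column of every line in the given input.
# Works with ARGF, an open File, or anything else that responds to each_line.
def sum_second_column(input)
  input.each_line.sum { |line| line.chomp.split("\t")[1].to_f }
end

# In a script, ARGF reads the files named on the command line,
# or STDIN when no file names are given:
#   puts sum_second_column(ARGF)
</code></pre>

<p>Because the helper never mentions STDIN directly, the same script can be run against a named file or at the end of a pipe.</p>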

<h3>Things I should remember by now</h3>

<ul>
<li><p>Sorting by a column in Unix AND specifying a tab as the column separator:</p>

<pre class="prettyprint"><code>sort -t"$(printf '\t')" -k2n /tmp/file
</code></pre></li>
<li><p><a href="http://stackoverflow.com/questions/295781/shortest-command-to-calculate-the-sum-of-a-column-of-output-on-unix">Using awk to sum a column of numbers</a> - Again, with the tab separator.</p>

<pre class="prettyprint"><code>awk -F'\t' '{ sum += $2 } END { print sum }'
</code></pre></li>
</ul>

<h3>Installing Hadoop on OS X with Homebrew</h3>

<p>If you are using OS X and <a href="https://github.com/mxcl/homebrew/">Homebrew</a>, Hadoop can be installed with a simple:</p>

<pre class="prettyprint"><code>brew install hadoop
</code></pre>

<p>However, if you want to use a version compatible with AWS, specifically 0.20.205.0, you need to hack the brew formula.</p>

<p>Before:</p>

<pre class="prettyprint"><code>url 'http://www.apache.org/dyn/closer.cgi?path=hadoop/core/hadoop-1.0.1/hadoop-1.0.1.tar.gz'
md5 'e627d9b688c4de03cba8313bd0bba148'
</code></pre>

<p>After:</p>

<pre class="prettyprint"><code>url 'http://www.apache.org/dyn/closer.cgi?path=hadoop/core/hadoop-0.20.205.0/hadoop-0.20.205.0.tar.gz'
md5 '8016D8A2A50CB2BEB17F2F45A1EA28DA'
</code></pre>

<p>Last I checked, this was the only way to do it w/o forking the project.  Before running <code>brew update</code> you should remember to <code>cd /usr/local/ &amp;&amp; git stash</code>.  Afterwards, run <code>git stash pop</code> to re-apply your formula changes.</p>

<p><strong>Note:</strong> If your map or reduce methods catch exceptions, make sure they don't hide problems.  You may end up with a successful run, but output is empty.</p>
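
<p>One way to keep a rescue from hiding problems is to count and report every line you skip.  The sketch below is only an illustration under assumptions: <code>map_line</code> and <code>run_mapper</code> are hypothetical names, and the input is assumed to be tab-separated with the key in the second column.</p>

<pre class="prettyprint"><code># Hypothetical mapper step: turns a tab-separated line into "key\t1",
# keyed on the second column.
def map_line(line)
  fields = line.chomp.split("\t")
  key = fields.fetch(1) # raises IndexError when the column is missing
  "#{key}\t1"
end

# Runs the mapper over the input, reporting bad lines instead of
# silently dropping them. Returns the number of skipped lines.
def run_mapper(input, output, errors)
  skipped = 0
  input.each_line do |line|
    begin
      output.puts map_line(line)
    rescue IndexError
      skipped += 1
      errors.puts "skipping malformed line: #{line.inspect}"
    end
  end
  skipped
end
</code></pre>

<p>In a streaming script this could be called as <code>run_mapper(ARGF, $stdout, $stderr)</code>; a skipped count equal to the number of input lines is a strong hint as to why the output is empty.</p>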
]]></description>
  </item>
 
  <item>
    <title>Learnings from Week 10 of 2012</title>
    <pubDate>Mon, 12 Mar 2012 00:00:00 -0400</pubDate>
    <link>http://resullus.org/post/learnings-from-week-10-of-2012</link>
    <guid>http://resullus.org/post/learnings-from-week-10-of-2012</guid>     
    <description><![CDATA[<ul>
<li><p><a href="http://en.kioskea.net/forum/affich-73393-change-mac-admin-password-without-the-disk">Change Mac admin password without the disk</a> Very useful when employees leave and their password doesn't appear to work.</p></li>
<li><p><a href="http://dev.mysql.com/doc/refman/5.5/en/group-by-functions.html#function_group-concat">MySQL's Group Concat Function</a> I love this function! If you ever need to pull back a list of anything, e.g. table ids, this will put them all into one column, separated by a comma by default or by whatever you specify with <code>SEPARATOR</code>.</p>

<pre class="prettyprint"><code>SELECT GROUP_CONCAT(id) FROM authors;
</code></pre></li>
</ul>
]]></description>
  </item>
 
  <item>
    <title>Useful Links from Week 9 of 2012</title>
    <pubDate>Mon, 05 Mar 2012 00:00:00 -0500</pubDate>
    <link>http://resullus.org/post/useful-links-from-week-9-of-2012</link>
    <guid>http://resullus.org/post/useful-links-from-week-9-of-2012</guid>     
    <description><![CDATA[<p>This is the inaugural post.  I hope to capture all of the truly useful links I referenced for work and play in the previous week.  This week will be a little short since I'm starting it on a Sunday…</p>

<ul>
<li><a href="http://aws.amazon.com/articles/1636185810492479">Best Practices in Evaluating Elastic Load Balancing</a> Great information now that I'm relying on AWS's elastic load balancers.</li>
<li><a href="http://ranjib.posterous.com/infrastructure-tooling-patterns">Infrastructure tooling patterns</a> A good reminder of the various sysadmin/devops pain points and the tools that mitigate them.</li>
<li>Writing and documenting APIs:

<ul>
<li>Dropbox: <a href="https://www.dropbox.com/developers/reference/api">REST API</a> </li>
<li>Parse: <a href="http://blog.parse.com/2012/01/11/designing-great-api-docs/">Designing Great API Docs</a> </li>
</ul></li>
<li><a href="http://technicalpickles.com/posts/using-method_missing-and-respond_to-to-create-dynamic-methods">Using method_missing and respond_to? to create dynamic methods</a> method_missing, especially from deep in a long ancestor tree, can be slow.</li>
</ul>
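
<p>For reference, a minimal sketch of the <code>method_missing</code> pattern from the last link.  The <code>Settings</code> class is hypothetical, and the sketch uses <code>respond_to_missing?</code> (the hook available since Ruby 1.9.2) rather than overriding <code>respond_to?</code> directly as the article does.</p>

<pre class="prettyprint"><code># Hypothetical settings object that answers any reader named after a stored key.
class Settings
  def initialize(values)
    @values = values
  end

  # Called only after normal method lookup has walked the whole ancestor
  # chain and found nothing, which is where the extra cost comes from.
  def method_missing(name, *args)
    @values.fetch(name) { super }
  end

  # Keeps respond_to? consistent with what method_missing can answer.
  def respond_to_missing?(name, include_private = false)
    @values.key?(name) || super
  end
end
</code></pre>

<p>With this in place, <code>Settings.new(host: 'example.org').host</code> returns the stored value, while an unknown name still raises <code>NoMethodError</code>.</p>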

<p>The most useful article for me was Jay Fields' <a href="http://blog.jayfields.com/2008/04/alternatives-for-redefining-methods.html">Alternatives for Redefining Methods</a>.  More about why it was so useful once my code clears a third party sanity check!</p>
]]></description>
  </item>
 
  <item>
    <title>This is your blog, delivered by scriptogr.am</title>
    <pubDate>Sat, 31 Dec 2011 00:00:00 -0500</pubDate>
    <link>http://resullus.org/post/this-is-your-blog-delivered-by-scriptogr.am</link>
    <guid>http://resullus.org/post/this-is-your-blog-delivered-by-scriptogr.am</guid>     
    <description><![CDATA[<p>Thank you for using scriptogr.am. While we’re still in early beta development, we think you’ll enjoy the app. It’s designed to be fast, simple and to get the most creativity out of you.</p>

<p>scriptogr.am uses <a href="http://daringfireball.net/projects/markdown/" title="Markdown">Markdown</a>, a lightweight markup language, originally created by <a href="http://daringfireball.net/" title="Daring Fireball">John Gruber</a> and <a href="http://www.aaronsw.com/" title="Aaron Swartz">Aaron Swartz</a>. Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML). See the <a href="http://daringfireball.net/projects/markdown/syntax" title="Markdown syntax">Syntax</a> page for details pertaining to Markdown’s formatting syntax. You can try it out, right now, using the online <a href="http://daringfireball.net/projects/markdown/dingus" title="Dingus">Dingus</a>.</p>

<h1>Getting started</h1>

<p>After connecting your Dropbox account to scriptogr.am, some necessary files and folders are added to your Dropbox at <code>Apps/scriptogram</code>. First, the <em>GET_STARTED.txt</em> text file, which explains pretty much the same thing you’re reading now. Next, we’ve added a <code>posts</code> folder. This is where you add your blog <em>post</em> (&amp; <em>page</em>) files. These files are plain text files, but they need to be saved with the .md (markdown) extension, like this: <em>yourfile.md</em></p>

<p><img src="http://dl.dropbox.com/u/35476/_scriptogram/folder.png" alt="" /></p>

<p>We’ve added a post example page (this file) there for you to get familiar with.</p>

<h2>The template data</h2>

<p>All files need to contain a "front block". The front block must be the first thing in the file and takes the form of:</p>

<pre class="prettyprint"><code>---
Date: 2012-04-17
Title: My first post
---
</code></pre>

<p>Between the triple-dashed lines, you can set any of the predefined variables (see below for a reference). But, the <em>Title</em> is required. Without the title, the system will fail.</p>
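
<p>To illustrate the format, here is a minimal Ruby sketch of how such a front block could be read.  This is an assumption for illustration only, not how scriptogr.am actually parses files; <code>parse_front_block</code> is a hypothetical helper.</p>

<pre class="prettyprint"><code># Hypothetical parser: returns [variables, body] for the text of a post file.
def parse_front_block(text)
  lines = text.split("\n")
  raise ArgumentError, 'front block must start the file' unless lines.first == '---'

  # Position of the closing '---', counted from the line after the opening one.
  closing = lines.drop(1).index('---')
  raise ArgumentError, 'front block is not closed' if closing.nil?

  vars = {}
  lines[1, closing].each do |line|
    next if line.strip.empty?
    key, value = line.split(':', 2)
    vars[key.strip] = value.to_s.strip
  end
  raise ArgumentError, 'Title is required' unless vars.key?('Title')

  body = lines.drop(closing + 2).join("\n")
  [vars, body]
end
</code></pre>

<p>Variable names are kept exactly as written, so <code>Title</code> and <code>title</code> would be treated as different keys.</p>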

<h3>Predefined global variables</h3>

<p><em>All the variable names below are case-sensitive:</em></p>

<p><code>Required:</code></p>

<pre class="prettyprint"><code>Title
</code></pre>

<p>The title of your post (or page)</p>

<p><code>Not required, but close to:</code></p>

<pre class="prettyprint"><code>Date
</code></pre>

<p>The following date format is the correct one to use: <em>2001-12-18</em>.
(The <em>Date</em> variable can be used to ensure correct sorting of posts.)</p>

<p><code>Optional:</code></p>

<pre class="prettyprint"><code>Published
</code></pre>

<p>Set to ’false’ if you don’t want a post to show up when the site is generated.</p>

<pre class="prettyprint"><code>Type
</code></pre>

<p>Set to ’page’ if you want the post to act as a ’page’ instead of a ’post’.</p>

<pre class="prettyprint"><code>Excerpt
</code></pre>

<p>Add an excerpt<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup> to your post or page.</p>

<h2>Difference between ’posts’ and ’pages’</h2>

<p>A <code>post</code> is a blog post.</p>

<p>A <code>page</code> is similar to a <em>post</em>, but generates a link visible in the <em>menu</em> on your site that will lead to a page permalink.</p>

<h1>Publishing your posts</h1>

<p>This is simple. Just head to your admin panel and hit the ”Synchronize” button. When logged in to scriptogr.am and visiting your own page, you’ll see the scriptogr.am logotype symbol on the top right of the browser window. This is the link that leads to your admin panel.</p>

<h2>Published vs unpublished</h2>

<p>The total counts of published and unpublished posts (&amp; pages) are visible next to the ”Synchronize” button. ”Unpublished” means that you either removed a post text file from your Dropbox, that something went wrong while trying to sync your Dropbox with scriptogr.am, or that you’ve marked the post with the <code>Published: false</code> variable.</p>

<p><strong>Finally,</strong> happy posting. If you have any questions, suggestions or thoughts just drop us an <a href="mailto:info@scriptogr.am">e-mail</a> at any time.</p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:1">
<p>An <a href="http://en.wikipedia.org/wiki/Excerpt" title="Excerpt on Wikipedia">excerpt</a> is a relatively small sample passage from a longer work, such as a book or article.&#160;<a href="#fnref:1" rev="footnote">&#8617;</a></p>
</li>

</ol>
</div>
]]></description>
  </item>
 
  <item>
    <title>Capturing Client IP Address in Apache when Using AWS Elastic Load Balancing</title>
    <pubDate>Sun, 17 Apr 2011 00:00:00 -0400</pubDate>
    <link>http://resullus.org/post/capturing-client-ip-address-in-apache-when-using-aws-elastic-load-balancing</link>
    <guid>http://resullus.org/post/capturing-client-ip-address-in-apache-when-using-aws-elastic-load-balancing</guid>     
    <description><![CDATA[<p>UPDATE: Minor rewrite for the new blog hosting.  Should have no incorrect info.</p>

<p>When using Amazon's Elastic Load Balancer, and probably any load balancer, you lose normal access to the requestor's IP address. ELB appends the missing IP address to the X-FORWARDED-FOR header, so if your application uses the client IP you will need to read it from that header.  Header values are spoofable, but the solution is fairly simple.  This solution is presented for Apache HTTPD.</p>

<p><a href="http://en.wikipedia.org/wiki/X-Forwarded-For#X-Forwarded-For_for_Web_server_logs">X-FORWARDED-FOR</a> is an HTTP header field for recording the originating IP address as a browser request passes through HTTP proxy and load balancer servers.</p>

<p>After enabling mod_headers (this may no longer be necessary w/ SetEnvIf), add the following line to your site's Apache configuration:</p>

<pre class="prettyprint"><code>SetEnvIf X-FORWARDED-FOR (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s*$ UpstreamIpAddress=$1
</code></pre>

<p>This will set the UpstreamIpAddress variable to the last IP address in the list, which will always be the one accessing the load balancer.  Obviously, any proxy servers, firewalls, or other servers that change the IP address before reaching the load balancer will conceal the true client IP address.</p>
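
<p>To see what the pattern captures, here is the same regular expression applied in Ruby.  This is just a sketch for checking the expression against sample header values; <code>upstream_ip</code> is a hypothetical helper and the addresses in the note below are documentation examples.</p>

<pre class="prettyprint"><code># Returns the last dotted-quad address in an X-Forwarded-For value,
# or nil when the value does not end with one.
def upstream_ip(forwarded_for)
  forwarded_for.to_s[/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s*$/, 1]
end
</code></pre>

<p>For a header value of <code>198.51.100.4, 203.0.113.7</code> this picks <code>203.0.113.7</code>, the entry added last.</p>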

<p>If the IP address is important to you, you'll also want to update your access logger format:</p>

<pre class="prettyprint"><code>LogFormat "%{UpstreamIpAddress}e %l %u %t \"%r\" %&gt;s %b \"%{Referer}i\" \"%{User-agent}i\"" combined_using_upstream_ip_address
</code></pre>

<p>If you need this value in your application, you can look for it in the request environment variables:</p>

<pre class="prettyprint"><code>Ruby on Rails: request.env
PHP: $_SERVER
</code></pre>
]]></description>
  </item>
 
  <item>
    <title>Zed in 60 Seconds</title>
    <pubDate>Thu, 17 Jun 2010 00:00:00 -0400</pubDate>
    <link>http://resullus.org/post/zed-in-60-seconds</link>
    <guid>http://resullus.org/post/zed-in-60-seconds</guid>     
    <description><![CDATA[<p>UPDATED: While porting this article, it took me a while to remember why I wrote this.  In order to remember the inspiration, I've fixed the reference links.</p>

<p>In a world of Whys and Zeds, I'm sure most people would choose Why. Both are probably certifiable, but one is gruffly entertaining and the other is obliquely clever. I've never met either of them, but I think I'd like to work with Zed. He'll tell it like it is and require your A game. He's a breath of fresh air in a world of posers (which isn't saying Why was a poser, but there are more self promoters than actual rock stars.)</p>

<p>Anyone who thinks marketing isn't lying, isn't living in the real world. Marketing is about selling something to people who don't know they want it. Most products are crap -- it doesn't matter what the market is. Marketers need to use creative words and imagery to distinguish their product from the others floating in the cesspool. Have you ever noticed that every product or company is a "market leader"? That's not even creative. Do you really think Bud is noticeably better than Miller? Neither is even a contender, except in the marketing world.</p>

<p>Marketing truth would be awesome, if only it worked! If you are totally truthful, you'll be admitting your faults, which is equivalent to marketing your competitors because they won't be telling their whole story. How many politicians tell the whole truth? How about CEOs of global corporations sitting before Congress? If you are only telling half the story, that's half way to lying, and I don't think most marketing comes close to speaking sooth. Caveat Emptor.</p>

<ul>
<li><a href="http://gilesbowkett.blogspot.com/2010/06/zed-marketing-isnt-lying-maybe-its-good.htm">Zed: Marketing Isn't Lying; Maybe It's Good You Didn't Get A Puppy</a></li>
<li><a href="http://oppugn.us/posts/1276150607.html">Assholes Code Like Assholes (and Like Giles)</a></li>
</ul>
]]></description>
  </item>
    
</channel>
</rss>