Three RubyConf Surprises

This year RubyConf was an inspiring, enthusiastic and fun experience. If you weren't lucky enough to be able to attend in person, I would highly recommend taking a look at http://rubyconf13.multifaceted.io, a multimedia site about RubyConf put together by Ninefold, a Sydney Rails hosting firm. They collected an amazing set of speaker interviews, slides, talk summaries, and tweets from the conference. And just this week Confreaks has started posting the session videos online; there are many sessions I missed in person that I can’t wait to see.

Today, however, I’d like to pass along a few tidbits of technical knowledge I learned during the conference that surprised me. Aside from all the great people I met during hallway, dinner and lunch conversations, the surprising bits of knowledge are what stand out in my memory as I write this, three weeks after the conference ended.

Empirical Data Supporting the Weak Generational Hypothesis

If you’ve ever studied the computer science theory behind garbage collection, you’ve probably heard of the weak generational hypothesis. This is the assumption that most objects die young: that your application will use most of the objects it creates for only a relatively short time. For example, suppose I create a new Ruby object:

obj = Object.new

Chances are, I will only use this object within the scope of a single method or as an intermediate value in some calculation. As soon as that method returns, the reference to the object is popped off the stack, and the garbage collector is free to reclaim its memory. Conversely, my code will use only a few objects for a longer period of time; these are the objects that are likely to live a long life.

The hypothesis allows language implementors to build garbage collection systems that handle new and old objects differently. For example, the JVM (used by JRuby along with many other languages) manages the memory for young objects very differently than for old, or mature, objects. This is known as generational garbage collection. Rubinius also uses this idea.

Computer scientists call this a hypothesis since there’s no easy way to prove it is true. We have to rely on empirical evidence. Of course, developers are free to create and use objects for whatever purpose they would like. The hypothesis simply describes how developers typically use objects in most applications.

During his excellent RubyConf talk, Object management on Ruby 2.1, Koichi Sasada (@_ko1) described how the upcoming release of MRI Ruby, version 2.1, will now also contain a generational garbage collection algorithm. But what most impressed me, what surprised me, was this slide:

Here you can see Koichi actually provides the empirical evidence behind the weak generational hypothesis. What he’s done here is measure the lifetimes of different Ruby objects created when running the RDoc utility. Notice that most of the objects created by RDoc die very quickly, after only one or two garbage collections. This is the spike at the left side of the chart. Meanwhile on the right side of the chart, a modest number of objects survive for much longer; using the legend at the bottom you can see these are often classes, modules and included classes (internal copies of modules).

The most interesting and surprising thing about this chart is that there are very few objects in the middle. After the initial spike on the left, there are almost no objects displayed at all until you reach about 70 GC runs, where a small spike occurs, and later at about 100 GC runs, where a larger set of mature objects was measured.

I found this experiment to be fascinating! It’s one thing to read about the weak generational hypothesis in CS textbooks; it’s quite another to see actual data. Looking at this chart, it’s easy to see why Koichi and the Ruby core team are working so hard to add generational GC to Ruby 2.1: once an object becomes mature, once it lives past the first or second garbage collection, there’s no need to worry about it again. According to the chart mature objects will likely live on for a very long time. In Ruby 2.1 the mark and sweep garbage collection code will focus only on the young objects, which appear at the left side of the chart.
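
To get a rough feel for this on your own machine, here is a minimal sketch, assuming MRI Ruby 2.1 or later, where GC.stat reports separate counts for minor collections (young objects only) and major collections (all objects). The gc_counts helper and the loop size are my own illustration, not something from Koichi’s talk, and the numbers you see will vary from run to run.

```ruby
# Assumes MRI (CRuby) 2.1 or later, where GC.stat includes
# :minor_gc_count and :major_gc_count.
# gc_counts is a hypothetical helper written for this sketch.
def gc_counts
  stat = GC.stat
  { minor: stat[:minor_gc_count], major: stat[:major_gc_count] }
end

before = gc_counts

# Allocate many short-lived objects: each string becomes garbage
# as soon as its block iteration finishes.
200_000.times { |i| "temporary string #{i}" }

after = gc_counts

minor_runs = after[:minor] - before[:minor]
major_runs = after[:major] - before[:major]

puts "minor GC runs: #{minor_runs}"
puts "major GC runs: #{major_runs}"
```

If the hypothesis holds for this workload, I would expect most of the collections triggered by the loop to be minor ones, but that is something to observe rather than assume.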

How to Look Behind in a Regex

I was surprised again just a couple of hours after Koichi’s presentation. In the same room, Nell Shamrell (@nellshamrell) gave an amazing performance, teaching the lucky audience about regular expressions in Ruby. Nell’s theatrical training shone through; she was very confident speaking in public and it was clear she had rehearsed the presentation just as a Shakespearian actress would prepare for a role in Romeo and Juliet. She not only explained how to use regexes in Ruby, but even dove into how they are implemented under the hood inside Ruby.

I thought I knew a lot about regular expressions, but Nell proved me wrong. She explained and used one detail of regex syntax I had never heard of before, known as look behind. Here’s a slide from Nell’s presentation where she uses this obscure but powerful feature.

This is part of a longer example Nell took us through which converts snake case to camel case (originally inspired by a tweet from Avdi). I won’t repeat all the details here, except to point out the confusing and unfamiliar syntax in the middle of the regex:

(?<=_)[a-z]

The [a-z] part is simple enough. This means: match any character between a and z or, in other words, any lowercase letter. But what about the (?<=_) part? It turns out this is a “look behind” clause. It means look behind the current potential match and return it as an actual match only if the preceding text satisfies the given regex clause. ?<= means “look behind” and (?<=_) means “look behind and only match things that are preceded by a _”.

For example, this regex matches the first lowercase letter in the target string, t:

str = "this is a long string"
puts /[a-z]/.match(str)
=> t

But this second regex only matches the single lowercase letter that follows an underscore, l:
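
Here is a minimal sketch of that behavior, assuming a target string that contains an underscore. Both strings below are my own illustration, and the snake case conversion at the end is a simplified guess at the idea, not Nell’s actual code.

```ruby
# Hypothetical target string containing an underscore.
str = "this is a_long string"

# (?<=_) is a look behind: the [a-z] only matches when the character
# immediately before it is an underscore. The underscore itself is
# not part of the match.
match = /(?<=_)[a-z]/.match(str)
puts match[0]
# => l

# The same look behind can drive a simple snake case to camel case
# conversion: upcase each letter that follows an underscore, then
# remove the underscores.
camel = "snake_case_name".gsub(/(?<=_)[a-z]/) { |letter| letter.upcase }.delete("_")
puts camel
# => snakeCaseName
```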

How Many Method Calls to Display a Page using Rails?

The last surprise I’ll talk about today was discovered by Mark Bates during his work with the TracePoint feature. As Mark described, the new TracePoint API allows your Ruby code to be notified by the interpreter whenever a certain type of event occurs: calling a method, executing a line of code, defining a class or module, etc.

Mark showed the audience how to use TracePoint with a few interesting examples, including how to create abstract classes and interfaces as you would using Java. His usual entertaining self, Mark gave us an interesting peek into Ruby internals using the TracePoint API.

But one experiment Mark performed using TracePoint surprised me! He created a simple, “hello world” Rails web application containing a single controller, route and view and then measured how many method calls were issued by the Rails framework while generating the page. As Mark explained, this was fairly easy to do using TracePoint since you can have Ruby call you each time a method call occurs.
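
The counting idea can be sketched roughly like this, assuming Ruby 2.0 or later, where the TracePoint class is available. The count_method_calls helper and the greet method are my own stand-ins for illustration; this is not Mark’s code, and a real Rails request would of course replace the block.

```ruby
# Hypothetical helper: counts the method calls (Ruby-level :call and
# C-level :c_call events) that fire while the given block runs.
def count_method_calls
  count = 0
  trace = TracePoint.new(:call, :c_call) { count += 1 }
  trace.enable
  begin
    yield
  ensure
    # The total may also include bookkeeping calls such as this one.
    trace.disable
  end
  count
end

# A tiny stand-in for "render a page".
def greet(name)
  "Hello, #{name.capitalize}!"
end

calls = count_method_calls { greet("world") }
puts "method calls observed: #{calls}"
```

Tracing only :call would count methods written in Ruby; adding :c_call also counts methods implemented in C, such as String#capitalize. Which events Mark counted is not something I can confirm.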

Of course, Mark asked the audience to guess how many method calls we thought Rails would require to show the page. People guessed 10,000, 20,000 or even 50,000. But no one expected the actual answer:

I found this to be astounding! How can it be possible so many method calls are required? Rails is a complex framework, but what could all of these method calls be used for? I wonder what these methods actually are, and how smaller frameworks such as Sinatra or even Rack would compare to Rails.

And So Much More!

These three surprises are just a tiny bit of the knowledge and conversation shared at RubyConf. It was one of the most fun and interesting conferences I’ve ever attended. Here are a few more photos from my own camera… be sure to check out the Confreaks videos, as well as the content on http://rubyconf13.multifaceted.io.

See you next year in San Diego for RubyConf 2014!

12-year-old Katie Hagerty not only gave a keynote address, she kept the audience enthralled for 45 minutes!

In what has to be the most amazing demo I’ve ever seen, Ron Evans used Ruby to control flying robots, all captured on video!

Pat Shaughnessy writes a blog about Ruby development and recently self-published an eBook called Ruby Under a Microscope. When he's not at the keyboard, Pat enjoys spending time with his wife and two kids. Pat is also a fluent Spanish speaker and travels frequently to Spain to visit his wife's family.