Fiddling with Ruby’s Fiddle

I enjoy playing around with Ruby’s internals so I can see how things really work under the hood. One of the great GREAT (did I mention great?) features of Ruby is the Fiddle standard library extension. It lets you use Ruby itself to dig into the internals of Ruby object structures, which means you can use irb to do some super cool stuff. I have to thank Aaron Patterson (@tenderlove) for giving an excellent talk at a Ruby meetup in Vancouver and showing us some pretty cool stuff with Fiddle, which gave me the inspiration to play around with it.

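While playing around with Fiddle, one of the things I learned is that you can get the actual pointer value of an object by taking its object id and shifting it one bit to the left. This gives you the pointer (the memory location) of the Ruby object. Note that this trick relies on MRI deriving object ids from addresses, which it did before Ruby 2.7; newer versions generate object ids differently.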
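As a rough sketch of the idea (not the original snippet from this post), `Fiddle.dlwrap` hands back the address of an object directly, so we can compare it against the shifted object id. The variable names are just for illustration, and I am assuming MRI here.

```ruby
require 'fiddle'

str = "fiddle"

# Fiddle.dlwrap returns the address of the object as an Integer.
address = Fiddle.dlwrap(str)
puts "address: 0x#{address.to_s(16)}"

# Whether the shifted object id matches depends on the Ruby version:
# it matches where object ids are derived from addresses, and not otherwise.
shifted_matches = (str.object_id << 1) == address
puts "object_id << 1 matches: #{shifted_matches}"

# Fiddle.dlunwrap goes the other way, from address back to the object.
puts Fiddle.dlunwrap(address).equal?(str)
```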

Given that we can get actual pointer values, Fiddle gives us some nice ways of using pointers so that we can peek inside the raw data of an object. One of the classes we will use is Fiddle::Pointer. We can show that the above is true by doing this:
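Something along these lines should do it. This is a minimal sketch, assuming MRI, and it uses `Fiddle.dlwrap` to obtain the address so that it does not depend on how object ids are generated. If the pointer really refers to our string, casting it back with `to_value` should give us the very same object.

```ruby
require 'fiddle'

str = "fiddle"

# Wrap the object's address in a Fiddle::Pointer.
ptr = Fiddle::Pointer.new(Fiddle.dlwrap(str))
puts "pointer: 0x#{ptr.to_i.to_s(16)}"

# to_value casts the address back into a Ruby object.
same = ptr.to_value
puts same.equal?(str) # identity check, not just equal contents
puts same
```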

Now that we have an actual pointer object, we can grab blocks of memory and examine the data. If you take a look at the C struct definition for RString, you will notice that (like every Ruby object) it begins with a struct RBasic. RBasic contains flags and also a pointer to the class of the object (which of course is an object itself). We can get both the flags and the pointer value to the class by doing:
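Here is a hedged sketch of that read, assuming MRI, where RBasic starts with the flags word followed by the class pointer. I use the pointer-sized `J` unpack directive and `Fiddle::SIZEOF_VOIDP` as the width of a VALUE; on common 64-bit platforms that is the same size as a long.

```ruby
require 'fiddle'

str = "fiddle"
ptr = Fiddle::Pointer.new(Fiddle.dlwrap(str))

# Assumption: a VALUE is one pointer-sized word.
word = Fiddle::SIZEOF_VOIDP

# Read the first two words of the object (RBasic) and unpack them
# as two native unsigned pointer-sized integers.
flags, klass_addr = ptr[0, word * 2].unpack("J2")

puts "flags: 0x#{flags.to_s(16)}"

# The second word is the address of the class object; turn it back
# into a Ruby object to see which class it is.
klass = Fiddle.dlunwrap(klass_addr)
puts klass
```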

The above code essentially takes the data that fills the size of two longs (since a VALUE is an unsigned long on most platforms) and unpacks it as two longs. The first value is the flags (which I’m ignoring), and the second is the pointer to the class object, which predictably turns out to be String.

If you take another look at RString, you’ll see that the next block of data is a union. Depending on the size of the string, this union represents either another struct that contains the length of the string and a pointer to the string value itself, or an embedded character array containing the string value. What I’ve learned while using Fiddle with strings is that short strings are stored directly within the string object itself, with the size calculated accordingly. Let’s test this:
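The exact layout of that union differs between Ruby versions, so rather than hard-coding offsets, this sketch checks where the character data lives. `Fiddle::Pointer[str]` points at a string's bytes; if that address falls inside the object's own slot, the string is embedded. The `embedded?` helper and the slot size of five words are my own assumptions for illustration (five words is the smallest object slot in MRI as far as I know).

```ruby
require 'fiddle'

# Assumption: the smallest MRI object slot is five pointer-sized words.
SLOT_BYTES = 5 * Fiddle::SIZEOF_VOIDP

# Hypothetical helper: true when the string's bytes start inside
# the object slot itself, i.e. the string is embedded.
def embedded?(str)
  obj_addr  = Fiddle.dlwrap(str)          # address of the RString
  data_addr = Fiddle::Pointer[str].to_i   # address of the character data
  data_addr >= obj_addr && data_addr < obj_addr + SLOT_BYTES
end

short = "fiddle"
long  = "x" * 10_000

puts "short string embedded? #{embedded?(short)}"
puts "long string embedded?  #{embedded?(long)}"
```

The short string should report as embedded, while the long one keeps its bytes in a separately allocated buffer that the object only points to.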