How Ruby Objects Are Implemented

I’m currently reading Pat Shaughnessy’s excellent book Ruby Under a Microscope, and these are my notes on the chapters as I work through them. This post and the next cover Chapter 6, Objects and Classes. The content can be a bit confusing when described purely in words, but the book itself contains many helpful diagrams, so pick it up if you’re interested!

RObject

Every Ruby object is the combination of a class pointer and an array of instance variables.
- Pat Shaughnessy

A user-defined Ruby object is represented by a structure called an RObject, and is referred to by a pointer called VALUE.

Inside RObject is another structure called RBasic, which every structure that represents a Ruby object (RObject, RString, RArray and so on) also contains.

Aside from the RBasic structure, RObject also contains three fields: numiv, a count of how many instance variables the object has; ivptr, a pointer to an array holding the values of those instance variables; and iv_index_tbl, a pointer to a hash table stored in the object’s associated RClass structure. That table maps the name/identity of each instance variable to its position in the ivptr array.
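To make the layout more concrete, here is a toy model in plain Ruby. It is only a sketch: ToyClass, ToyObject, ivar_set and ivar_get are names I made up, and the real structures are C structs inside the interpreter. Only the field names numiv, ivptr, iv_index_tbl and klass come from the description above.

```ruby
# Toy model of RClass/RObject in plain Ruby. NOT the real interpreter code:
# ToyClass, ToyObject, ivar_set and ivar_get are hypothetical names.

# Stands in for RClass: owns the table mapping ivar names to slots.
class ToyClass
  attr_reader :name, :iv_index_tbl

  def initialize(name)
    @name = name
    @iv_index_tbl = {} # ivar name => index into each object's ivptr
  end
end

# Stands in for RObject: a class pointer plus an array of ivar values.
class ToyObject
  attr_reader :klass, :ivptr

  def initialize(klass)
    @klass = klass # in the real structure this pointer lives in RBasic
    @ivptr = []    # values only; the names live in the class's table
  end

  # In this toy, numiv is simply the length of the values array.
  def numiv
    @ivptr.length
  end

  def ivar_set(name, value)
    tbl = @klass.iv_index_tbl
    index = (tbl[name] ||= tbl.size) # first use of a name claims the next slot
    @ivptr[index] = value
  end

  def ivar_get(name)
    index = @klass.iv_index_tbl[name]
    index && @ivptr[index]
  end
end

toy_class = ToyClass.new("Point")
a = ToyObject.new(toy_class)
b = ToyObject.new(toy_class)
a.ivar_set(:@x, 1)
a.ivar_set(:@y, 2)
b.ivar_set(:@x, 10)

puts a.ivar_get(:@y)
puts b.numiv
```

The point of the shared table is that the names are stored once per class, while each object only carries its own array of values.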

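For example, suppose a Fruit class has two instances: apple, with two instance variables set to red and sweet, and orange, with a single instance variable set to sour. The RObject representing apple will have a numiv of 2, and its ivptr will point to an array containing the values red and sweet. The RObject representing orange will have a numiv of 1, and its ivptr will point to an array containing just sour.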

Both apple and orange will have an RBasic structure whose klass pointer references the same Fruit RClass structure.
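A Ruby snippet along these lines would produce two such objects. The instance variable names @color and @taste are my own assumption for illustration. The structures themselves are not visible from Ruby, but instance_variables and class give a rough view of what numiv and klass would hold.

```ruby
# Hypothetical Fruit class; @color and @taste are illustrative names.
class Fruit
  def initialize(color: nil, taste: nil)
    # Only set the instance variables that were actually given, so that
    # two objects of the same class can hold different numbers of them.
    @color = color if color
    @taste = taste if taste
  end
end

apple  = Fruit.new(color: "red", taste: "sweet")
orange = Fruit.new(taste: "sour")

p apple.instance_variables          # the ivars set on apple
p orange.instance_variables         # the ivars set on orange
p apple.class.equal?(orange.class)  # both refer to the same class object
```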

Generic Objects (RString, RArray…)

Generic objects such as strings and arrays are represented by more specialized structures called RString, RArray, etc., rather than by RObject. Their internal representations are optimized for the kind of values they store. An example of this optimization is the presence of ary, an array of a certain fixed size inside the structure. If the data is small enough (a short string’s characters, or a small array’s elements), it is stored in ary within the structure itself, instead of in a separately allocated external array that the structure points to.
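One way to get a feel for this from Ruby is ObjectSpace.memsize_of from the objspace standard library, which reports roughly how much memory an object occupies. This is only a sketch: the exact numbers depend on the Ruby version and platform, so the code prints them without assuming particular values.

```ruby
require "objspace"

short_string = "abc"
long_string  = "a" * 10_000

short_array = [1, 2]
long_array  = Array.new(10_000, 0)

# Small values should fit inside the structure itself, so their reported
# size stays close to the size of a single object slot. Large values need
# an external buffer, which is counted on top of that.
puts ObjectSpace.memsize_of(short_string)
puts ObjectSpace.memsize_of(long_string)
puts ObjectSpace.memsize_of(short_array)
puts ObjectSpace.memsize_of(long_array)
```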

They also contain the RBasic structure.

RBasic

RBasic contains a few internally-used flags and a pointer to its associated class, called klass. Classes are represented by an RClass structure, which is discussed in the next post.

Simple Values

Simple values like (small) integers, nil, true and false do not have an associated RObject-like structure. Instead, their value is stored directly in VALUE itself. The identity of these values is indicated by different flags in VALUE. Do not confuse these flags with those in RBasic; they are different.

For example, if the FIXNUM_FLAG bit is 1, then Ruby knows to interpret the rest of VALUE as an integer value instead of a pointer to an associated RObject (or RString etc.) structure.
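The idea can be sketched in plain Ruby as a toy model. The ToyValue module and its helpers are hypothetical names of mine, and I am assuming the flag is the lowest bit of VALUE with the integer stored in the remaining bits.

```ruby
# Toy model of integer tagging. ToyValue and its methods are hypothetical
# helpers for illustration, not part of Ruby.
module ToyValue
  FIXNUM_FLAG = 0x01 # assumed here to be the lowest bit of VALUE

  # Shift the integer left by one bit and set the lowest bit as the tag.
  def self.encode_fixnum(n)
    (n << 1) | FIXNUM_FLAG
  end

  # A VALUE whose lowest bit is 1 holds an integer, not a pointer.
  def self.fixnum?(value)
    (value & FIXNUM_FLAG) == FIXNUM_FLAG
  end

  # Drop the tag bit to recover the original integer.
  def self.decode_fixnum(value)
    value >> 1
  end
end

value = ToyValue.encode_fixnum(7)
puts value                          # 15
puts ToyValue.fixnum?(value)        # true
puts ToyValue.decode_fixnum(value)  # 7
puts ToyValue.decode_fixnum(ToyValue.encode_fixnum(-3)) # -3
```

A scheme like this can work because object structures are aligned in memory, so a genuine pointer never has its lowest bit set. On CRuby, the object_id of a small integer has historically exposed this encoding (1.object_id is 3), though that is an implementation detail rather than something to rely on.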