Now, subclassing represents an “is a” relationship. This means that our OrderedList should be a List in every respect, but with some added behavior. The Liskov Substitution Principle is one formulation of this idea.

One way to think about the difference between core and the standard library is that core is written in C, while the standard library is written mostly in Ruby. Core consists of the classes that are used the most, so they’re implemented in as low-level a fashion as possible. They’ll be in every single Ruby program, so we might as well make them fast! The standard library only gets pulled in bit by bit. Another way of thinking about the difference is that you need to require everything in the standard library, but nothing in core.
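Here is a small sketch of that require rule. Shellwords is just one standard library module I picked for illustration; any other would do.

```ruby
# Core classes such as Array are available with no require at all.
numbers = Array.new(3) { |i| i * 2 }
puts numbers.inspect

# Shellwords lives in the standard library, so it has to be required first.
require "shellwords"
words = Shellwords.split("one 'two three'")
puts words.inspect
```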

What do you think this code should do?

class List < Array
end
puts List.new.to_a.class

If you said “it prints Array,” you’d be right. This behavior really confuses me, though, because List is already an Array; in my mind, this operation shouldn’t suddenly change the class.

Why does this happen? Let’s check out the implementation of Array#to_a:

If the class is not Array (represented by rb_cArray), then we make a new array of the same length, call replace on it, and then return the new array. If this C scares you, here’s a direct port to pure Ruby:
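A minimal sketch of such a port, following the steps described above. The helper name ported_to_a is hypothetical, and the real Array#to_a is implemented in C, so treat this as an approximation of the logic rather than the actual source.

```ruby
class List < Array
end

# Hypothetical helper mirroring the logic described above.
def ported_to_a(ary)
  # instance_of? is true only for exactly Array, not for a subclass.
  return ary if ary.instance_of?(Array)

  copy = Array.new(ary.length) # new plain Array of the same length
  copy.replace(ary)            # copy the elements over
  copy
end

list  = List.new([1, 2, 3])
plain = [1, 2, 3]

puts ported_to_a(list).class          # a plain Array, not a List
puts ported_to_a(plain).equal?(plain) # a plain Array is returned as-is
puts list.to_a.class                  # the built-in method agrees
```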

ObjectSpace allows you to inspect all of the objects that exist in the system. Here’s the output:
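One way to get a count like this, assuming the goal is simply to see how many arrays are alive in a fresh interpreter. The exact number depends on your Ruby version and on what has been loaded, so I won't quote one here.

```ruby
# Count every live Array instance the interpreter currently knows about.
array_count = ObjectSpace.each_object(Array).count
puts "Live Array instances: #{array_count}"
```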

That’s a lot of arrays! This kind of shortcut is generally worth it: 99.99% of the time, this code is perfect.

That last 0.01% is the problem. If you don’t know exactly how these classes operate at the C level, you’re gonna have a bad time. In this case, this behavior is odd enough that someone was kind enough to document it.

We get the length of the array, make a new blank array of the same length, then do some pointer stuff to copy everything over, and return the new copy. Unlike #to_a, this behavior is not currently documented.

Now: you could make the case that this behavior is expected, in both cases: after all, the point of the non-bang methods is to make a copy. However, there’s a difference to me between “make a new array with this stuff in it” and “make a new copy with this stuff in it”. Most of the time, I get the same class back, so I expect the same class back in these circumstances.
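To make the bang versus non-bang distinction concrete, here is a sketch that uses reverse as one illustrative non-bang method. That choice is my assumption for the example; other methods may behave differently, and some of this behavior has changed between Ruby versions.

```ruby
class List < Array
end

list = List.new([1, 2, 3])

puts list.class         # List
puts list.reverse.class # the copy comes back as a plain Array

list.reverse!           # the bang version mutates the receiver in place
puts list.class         # still List
puts list.inspect
```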

Let’s talk about a more pernicious issue: Strings.

As you know, the difference between interpolation and concatenation is that interpolation calls #to_s implicitly on the object it’s interpolating:
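A minimal sketch of that difference, using a hypothetical Celsius class that exists only for this example:

```ruby
# Hypothetical class used only for illustration.
class Celsius
  def initialize(degrees)
    @degrees = degrees
  end

  def to_s
    "#{@degrees} C"
  end
end

temp = Celsius.new(20)

# Interpolation calls to_s for us.
interpolated = "Temp: #{temp}"
puts interpolated

# Concatenation does not: String#+ wants a String (or something with to_str).
error = nil
begin
  broken = "Temp: " + temp
rescue TypeError => e
  error = e
  puts "concatenation failed: #{e.class}"
end

# An explicit conversion makes concatenation work.
concatenated = "Temp: " + temp.to_s
puts concatenated
```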

You can see that with a string, the bytecode actually puts the final concatenated string. But with an object, it ends up calling tostring and then concatstrings.
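If you want to look at the bytecode yourself, something like the following should work on MRI. RubyVM is specific to CRuby, and instruction names and layout differ between Ruby versions, so treat the output as illustrative only.

```ruby
# Compile two snippets without running them, then dump their bytecode.
literal_only = RubyVM::InstructionSequence.compile('"foo#{"bar"}"')
with_object  = RubyVM::InstructionSequence.compile('obj = Object.new; "foo#{obj}"')

puts literal_only.disasm
puts with_object.disasm
```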

Again, 99% of the time, this is totally fine, and much faster. But if you don’t know this trivia, you’re going to get bit.
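Here is a sketch of how you can get bit. LoudString is a hypothetical subclass made up for this example. Because the object is already a String as far as MRI is concerned, interpolation uses its contents directly and never calls the overridden to_s.

```ruby
# Hypothetical String subclass that overrides to_s.
class LoudString < String
  def to_s
    "overridden"
  end
end

s = LoudString.new("Hey")

explicit     = s.to_s # calls our override
interpolated = "#{s}" # already a String, so to_s is skipped

puts explicit
puts interpolated
```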

Here is an example from an older version of Rails. Yes, you might think “Hey idiot, there’s no way it will store your custom String class,” but the whole idea of subclassing is that it’s a drop-in replacement.

I know that there’s some case where Ruby will not call your own implementation of #initialize on a custom subclass of String, but I can’t find it right now. This is why this problem is so tricky: most of the time, things are fine, but then occasionally, something strange happens and you wonder what’s wrong. I don’t know about you, but my brain needs to focus on more important things than the details of the implementation.

Since I first wrote this post, James Edward Gray II helped me remember what this example is. One of the early exercises in http://exercism.io/ is based on making a DNA type, and then doing some substitution operations on it. Many people inherited from String when doing their answers, and while the simple case that passes the tests works, this case won’t:
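A rough sketch of that kind of case, not the exact code from the exercise. The Dna class and its call counter are my own hypothetical reconstruction. The class of the result depends on your Ruby version, but either way the new string never goes through Dna#initialize.

```ruby
# Hypothetical String subclass that counts how often its initialize runs.
class Dna < String
  @init_calls = 0

  class << self
    attr_accessor :init_calls
  end

  def initialize(sequence)
    super(sequence)
    self.class.init_calls += 1
  end
end

result = Dna.new("CATG").tr(Dna.new("T"), Dna.new("U"))

puts result.inspect
puts result.class   # depends on the Ruby version
puts Dna.init_calls # only the three explicit Dna.new calls are counted
```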