Typing (the programming kind)

published: Mon, 19-Dec-2005 | updated: Mon, 19-Dec-2005

There was a time a couple of years ago when I thought generics were
the best thing since sliced bread. They still are, and the ability of
Anders and the C# team to build something like LINQ and lambda
expressions for the C# 3.0 preview on this foundation is pretty
amazing. But, in a sense, all these new experimental features give us
is the ability to fake dynamic typing on a static typing foundation.
And that leads me to dynamically-typed languages.

You see, last week, Ruby on Rails 1.0 was released. And I finished a
book called "Beyond Java" by Bruce Tate. (This article is about
neither, though.) And I got to thinking about languages and language
design and typing.

Think of it this way. When we talk about typing we're thinking along
two main axes. There's the strong typing versus weak typing axis, and
then there's the dynamic typing versus static typing axis.

C# and Java and Delphi for .NET are statically- and strongly-typed
languages. When you want to use a variable you have to declare that
variable first, and when you declare a variable, you give it a type.
From then on that variable's type is fixed and unchanging, all the way
to the end of its lifetime. You can sometimes refer to the variable by
an ancestor type, all the way to "object" (the base class, whatever it
may be called in your neck of the language woods), but it still
retains its declared type.

Easy-peasy stuff. The compiler does a lot of work making sure that you
use your variables in a manner consistent with their types (throwing
off errors if you don't) and everything's hunky dory. You can't cast
a variable to a totally different type (say from a boolean to a
string) because the compiler won't let you. Type-safety means it's
harder to shoot yourself in the foot, but there are still myriad other
ways to do so.

Weakly-typed languages, on the other hand, are fraught with danger.
The best example of a language of this type is C. You can change the
type of a variable really easily: just coerce it to a pointer and then
the world's your oyster. This flexibility does give you great power
and expressivity, but, boy, pay attention or you can mess things up
drastically. Buffer overflows, trashing memory, double-deallocations,
the whole nine yards. C++ is a lot like this too, especially if you
forsake the safe type coercions. So is 32-bit Delphi. Essentially the
compiler assumes that you know what you're doing when you cast a Foo
as a Bar and merrily lets you do it.

All the language examples I gave (C, C++ and Delphi) can be viewed as
weakly- but statically-typed languages. You have to declare a
variable's type when you declare the variable but you can munge (a
technical term, that) the variable to one of another type through the
magic of non-typesafe casts.

The languages of interest these days are the dynamically-typed
languages, like Ruby, Python and Perl. Most people view them as being
weakly-typed (they conflate dynamically-typed with weakly-typed), but
in reality they're not.

Let's take Ruby since I'm in the process of teaching myself it. In
Ruby you can write this:

x = 10
y = x * x
x = "hello"

Looking at this, people will say that this example definitely shows
that Ruby is weakly typed. Not true. It may be dynamically typed
(at first x is a Fixnum, but from the third line on x is a String),
but it is most certainly not weakly typed. Once x is reassigned it
remembers nothing of its previous existence.
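A small sketch of that reassignment, asking x for its class before and after. The method name classes_of_x is made up for illustration. One assumption worth flagging: on Ruby 2.4 and later the first class prints as Integer, because Fixnum was folded into Integer; on older Rubies it prints as Fixnum.

```ruby
# Returns the class of x before and after it is reassigned.
# (classes_of_x is a hypothetical helper, not part of any library.)
def classes_of_x
  x = 10
  before = x.class      # Integer on Ruby 2.4+, Fixnum on older Rubies
  x = "hello"
  after = x.class       # String: x now names a different object
  [before, after]
end

before, after = classes_of_x
puts "x started as #{before} and ended as #{after}"
```

The variable itself carries no type; each object it refers to does, and that object's type never changes.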

Here's the result of multiplying the second instance of x
by itself (in case you didn't know, irb is the
interactive Ruby shell):
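A minimal sketch of that experiment, written as a script rather than an irb transcript. The helper try_square is invented for illustration, and the exact wording of the error message differs between Ruby versions, so only the error class is relied on here.

```ruby
# Tries x * x and reports either the product or the error Ruby raises.
# (try_square is a hypothetical helper for this sketch.)
def try_square(x)
  "#{x.inspect} * #{x.inspect} => #{(x * x).inspect}"
rescue TypeError => e
  "#{x.inspect} * #{x.inspect} raised #{e.class}: #{e.message}"
end

puts try_square(10)        # the first x: multiplication works
puts try_square("hello")   # the second x: Ruby refuses with a TypeError
```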

So as you can see Ruby is strongly typed: x is a String and there is
no way to "multiply" a string by another, nor will Ruby implicitly
convert a string to an integer to make such a multiplication possible.
So the statement fails, bearing all the hallmarks of a strongly-typed
language.

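Contrast that with a weakly-typed language such as 32-bit Delphi, where
you can cast two string variables to integers and multiply them. Yep,
you would just have multiplied two pointers to a string together by
inappropriately casting them to integers. That's weak typing for you:
great for raw speed, but awful if you like walking on your feet.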

Now, just because the first example shows some syntactically valid
Ruby code doesn't mean you should code like this, just as the Delphi
example above is not something you'd usually write. Give me a break.
That's not what it's all about. I kind of like not having to
explicitly declare that x is of this type or another. And
it's also true that dynamic typing means that you have to be careful
of misspellings, just in case.
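To make the misspelling risk concrete, here is a rough sketch with two deliberate typos. The Counter class and the total method are invented for this example. The misspelled instance variable quietly evaluates to nil, while the misspelled local name only fails when that line actually runs.

```ruby
class Counter
  def initialize
    @count = 0
  end

  def increment
    @count += 1
  end

  # Typo: @cuont was never assigned. Ruby does not object; an unset
  # instance variable simply evaluates to nil.
  def report
    "count is #{@cuont.inspect}"
  end
end

# Typo in a local name: nothing complains until this line is executed.
def total(prices)
  sum = 0
  prices.each { |price| sum += price }
  summ
end

counter = Counter.new
counter.increment
puts counter.report            # prints "count is nil", not "count is 1"

begin
  total([1, 2, 3])
rescue NameError => e
  puts "#{e.class}: #{e.message}"
end
```

Neither mistake is caught when the file is loaded, which is exactly why the tests discussed next matter.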

However, I would venture that just because the compiler does a whole
lot of type-checking in statically-typed languages it doesn't mean that
every syntactically correct application in such a language is correct.
No way. You still have to test the application, the assembly, the
module, or whatever. It's the same in Ruby as well. You can't just
write a whole bunch of dynamically typed code and hope that it works.
Nope: you test it.
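As a rough sketch of what "you test it" looks like, here is the multiplication from the irb experiment wrapped in a method and exercised by two checks. Both square and check are invented for illustration; check is a deliberately tiny stand-in for a unit-test assertion, and a real project would use a proper test framework instead.

```ruby
# Hypothetical code under test: the irb experiment wrapped in a method.
def square(x)
  x * x
end

# A tiny stand-in for a unit-test assertion: runs the block and reports
# PASS, FAIL, or ERROR (if the block raised an exception).
def check(description)
  passed = yield
  "#{passed ? 'PASS' : 'FAIL'}: #{description}"
rescue StandardError => e
  "ERROR: #{description} (#{e.class})"
end

puts check("square(10) is 100") { square(10) == 100 }
puts check("square('hello') gives a result") { square("hello") }
```

The second check reports an ERROR, so the bad assumption about x is caught the first time the tests run rather than in production.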

If you're test-infected (that is, you always write unit test code for
your production code), you will naturally write Ruby code that can be
shown (that is, tested) to be correct. If you do write code like I did
in the irb session, then running your tests will show you that it is
wrong. That's not very different from shipping, untested, a C# fragment
that adds a Foo to an ArrayList and then casts the first element of
the list to a Bar.

Here the first element of the list is actually a Foo, but
the ArrayList indexer was written to return a bare
object. The compiler knows nothing more than this, so it will allow a
cast to a Bar, even though flow analysis would show that
the assumption is wrong. The interesting thing is that the above
fragment is one justification for implementing generics in C# 2.0: to
improve the type information for the compiler so that you don't shoot
yourself in the foot. But, in general, good developers don't make this
kind of mistake in the first place.
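For comparison, a rough Ruby sketch of the same wrong assumption. Foo and Bar are the placeholder names used above; the methods foo_only, bar_only, and use_first_as_bar are invented for illustration. Ruby has no cast to get wrong, so the mistake surfaces when a Bar method is called on what is really a Foo.

```ruby
class Foo
  def foo_only
    "I am a Foo"
  end
end

class Bar
  def bar_only
    "I am a Bar"
  end
end

# Treats the first element as a Bar without checking, much as a cast would.
# (use_first_as_bar is a hypothetical helper for this sketch.)
def use_first_as_bar(list)
  list.first.bar_only
rescue NoMethodError => e
  "#{e.class}: first element is a #{list.first.class}, not a Bar"
end

puts use_first_as_bar([Foo.new])   # the wrong assumption fails at run time
puts use_first_as_bar([Bar.new])   # the right one works
```

Either way the error only shows up when the code runs, so the remedy is the same in both languages: a test that exercises the line.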

Ah, the mythical good developer. I've heard the argument made that
developers, in general, are not "good developers" and therefore they
need all the support they can get from the compiler, hence strong
typing will avoid lots of mistakes. Yet these same developers are the
ones that don't run FxCop as a rule either: the compiler can't catch
everything. A bad developer is going to do the minimum needed to get
the job done (or at least the "nominal" job done) and whether the
language he uses is strongly-typed or not will make not the blindest
bit of difference.

If you, like me, prefer testing your code thoroughly, dynamically
typed languages should hold no fear for you. You already know that
testing is the best way of proving your application to be correct: the
compiler only catches some obvious mistakes. But to catch these
mistakes you must run the compiler, and let it work out the dependencies and
track the variables and report back with its (hopefully empty) list of
warnings and errors. Now imagine that you don't have to run a compiler
in order to run your tests (dynamically-typed languages tend to be
interpreted): how much time would you save?

Now, before you launch your email app to blast me to smithereens, rest
assured that languages like Ruby are not going to solve every problem.
But, heck, neither are languages like assembly, C, C++, Java, Delphi,
Haskell, C#, VB, or Smalltalk. At the moment, for attaching a database
to a web browser application, Ruby on Rails seems the way to go. For
other types of application, other languages may be better. (And
"better" in what sense? Faster to write and test? Faster to execute?
A smaller memory footprint? Better able to utilize the L1 cache?
All of the above? Which business driver is driving you? Don't be
pushed by business drivers that aren't important.)

I've been dabbling in Ruby too little over too long a timeframe. Time
to twist the knob to the max and do some real work with it to discover
what it can be like to write Ruby code. Stay tuned.