The worst mistake of computer science

August 31, 2015 | Paul Draper | 87 Comments

Uglier than a Windows backslash, odder than ===, more common than PHP, more unfortunate than CORS, more disappointing than Java generics, more inconsistent than XMLHttpRequest, more confusing than a C preprocessor, flakier than MongoDB, and more regrettable than UTF-16, the worst mistake in computer science was introduced in 1965.

I call it my billion-dollar mistake…At that time, I was designing the first comprehensive type system for references in an object-oriented language. My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
– Tony Hoare, co-designer of ALGOL W.

In commemoration of the 50th anniversary of Sir Tony Hoare’s null, this article explains what null is, why it is so terrible, and how to avoid it.

What is wrong with NULL?

The short answer: NULL is a value that is not a value. And that’s a problem.

It has festered in the most popular languages of all time and is now known by many names: NULL, nil, null, None, Nothing, Nil, nullptr. Each language has its own nuances.

Some of the problems caused by NULL apply only to a particular language, while others are universal; a few are simply different facets of a single issue.

NULL…

subverts types

is sloppy

is a special case

makes poor APIs

exacerbates poor language decisions

is difficult to debug

is non-composable

1. NULL subverts types

Statically typed languages check the types used in a program without actually executing it, providing certain guarantees about program behavior.

For example, in Java, if I write x.toUpperCase(), the compiler will inspect the type of x. If x is known to be a String, the type check succeeds; if x is known to be a Socket, the type check fails.

Static type checking is a powerful aid in writing large, complex software. But for Java, these wonderful compile-time checks suffer from a fatal flaw: any reference can be null, and calling a method on null produces a NullPointerException. Thus,

toUpperCase() can be safely called on any String…unless the String is null.

read() can be called on any InputStream…unless the InputStream is null.

toString() can be called on any Object…unless the Object is null.
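A minimal Java illustration of the gap (the class name is just for the example): the program below type-checks without complaint, because null inhabits the String type, and only fails when run.

public class NullSubvertsTypes {
    public static void main(String[] args) {
        String name = null;                     // fine with the compiler: null "is" a String
        System.out.println(name.toUpperCase()); // compiles, then throws NullPointerException
    }
}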

Java is not the only culprit; many other type systems have the same flaw, including, of course, ALGOL W.

In these languages, NULL is above type checks. It slips through them silently, waiting until runtime to finally burst free in a shower of errors. NULL is the nothing that is simultaneously everything.

2. NULL is sloppy

There are many times when it doesn’t make sense to have a null. Unfortunately, if the language permits anything to be null, well, anything can be null.

Java programmers risk carpal tunnel from writing

if (str == null || str.equals("")) {
}

It’s such a common idiom that C# adds String.IsNullOrEmpty

if (string.IsNullOrEmpty(str)) {
}

Abhorrent.

Every time you write code that conflates null strings and empty strings, the Guava team weeps.
– Google Guava

Well said. But when your type system (e.g. Java, or C#) allows NULL everywhere, you cannot reliably exclude the possibility of NULL, and it’s nearly inevitable it will wind up conflated somewhere.

The ubiquitous possibility of null posed such a problem that Java 8 added type annotations (JSR 308), letting tools like the Checker Framework retroactively patch this flaw in the type system with @NonNull checks.
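A sketch of what that looks like, assuming the Checker Framework’s nullness checker is on the build path (the class and method names here are made up):

import org.checkerframework.checker.nullness.qual.NonNull;

public class Greeter {
    // With the nullness checker enabled, passing a possibly-null argument here
    // is rejected at compile time instead of blowing up with a NullPointerException.
    static String shout(@NonNull String s) {
        return s.toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(shout("hello")); // fine
        // shout(null); // plain javac accepts this; the checker reports an error
    }
}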

3. NULL is a special case

Given that NULL functions as a value that is not a value, NULL naturally becomes the subject of various forms of special treatment.

Pointers

For example, consider this C++:

char c = 'A';
char *myChar = &c;
std::cout << *myChar << std::endl;

myChar is a char *, meaning that it is a pointer to a char, i.e. it holds the memory address of a char. The compiler verifies this. Therefore, the following is invalid:

char *myChar = 123; // compile error
std::cout << *myChar << std::endl;

Since 123 is not guaranteed to be the address of a char, compilation fails. However, if we change the number to 0 (which is NULL in C++), the compiler accepts it:

char *myChar = 0;
std::cout << *myChar << std::endl; // runtime error

As with 123, NULL is not actually the address of a char. Yet this time the compiler permits it, because 0 (NULL) is a special case.

Strings

Yet another special case happens with C’s null-terminated strings. This is a bit different than the other examples, as there are no pointers or references. But the idea of a value that is not a value is still present, in the form of a char that is not a char.

A C-string is a sequence of bytes, whose end is marked by the NUL (0) byte.

Thus, each character of a C-string can be any of the 256 possible byte values, except 0 (the NUL character). Not only does this make string length a linear-time operation; even worse, it means that C-strings cannot be used for ASCII or extended ASCII. Instead, they can only be used for the unusual ASCIIZ.

This exception for a singular NUL character has caused innumerable errors: API weirdness, security vulnerabilities, and buffer overflows.

Double trouble

JavaScript has this same issue, but with every single object.
If a property of an object doesn’t exist, JS returns a value to indicate the absence. The designers of JavaScript could have chosen this value to be null.

But instead they worried about cases where the property exists and is set to the value null. In a stroke of ungenius, JavaScript added undefined to distinguish a null property from a non-existent one.

But what if the property exists, and is set to the value undefined? Oddly, JavaScript stops here, and there is no uberundefined.

Thus JavaScript wound up with not only one, but two forms of NULL.

5. NULL exacerbates poor language decisions

Java silently converts between primitive types and their boxed reference types (autoboxing and unboxing). Add in null, and things get even weirder.

For example, this does not compile:

int x = null; // compile error

This does compile:

Integer i = null;
int x = i; // runtime error

though it throws a NullPointerException when run.

It’s bad enough that member methods can be called on null; it’s even worse when you never even see the method being called.
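The hidden call is the unboxing conversion: the compiler turns int x = i; into roughly the following, which is where the exception actually comes from (the class name is illustrative).

public class UnboxingNpe {
    public static void main(String[] args) {
        Integer i = null;
        int x = i.intValue(); // what `int x = i;` effectively compiles to: NullPointerException here
        System.out.println(x);
    }
}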

6. NULL is difficult to debug

C++ is a great example of how troublesome NULL can be. Calling member functions on a NULL pointer won’t necessarily crash the program. It’s much worse: it might crash the program.

Suppose Foo has a non-virtual member function bar() that never touches any member data, and a member function baz() that reads a member field. When I compile a program that calls both through a NULL Foo * with gcc, the call to bar() succeeds and the call to baz() fails.

Why? The non-virtual call foo->bar() is resolved at compile time, so the compiler skips the runtime vtable lookup and transforms it into a plain function call like Foo_bar(foo), with this as the first argument. Since bar doesn’t dereference that NULL pointer, it succeeds. But baz does, which causes a segmentation fault.

But suppose instead we had made bar virtual. This means that its implementation may be overridden by a subclass.

...
virtual void bar() {
...

As a virtual function, foo->bar() does a vtable lookup for the runtime type of foo, in case bar() has been overridden. Since foo is NULL, there is no object and no vtable pointer to consult, so the program now crashes at foo->bar() instead, all because we made a function virtual.

int main() {
    Foo *foo = NULL;
    foo->bar(); // crash
    foo->baz();
}

NULL has made debugging this code extraordinarily difficult and unintuitive for the programmer of main.

Granted, dereferencing NULL is undefined by the C++ standard, so technically we shouldn’t be surprised by whatever happened. Still, this is a non-pathological, common, very simple, real-world example of one of the many ways NULL can be capricious in practice.

7. NULL is non-composable

Programming languages are built around composability: the ability to apply one abstraction to another abstraction. This is perhaps the single most important feature of any language, library, framework, paradigm, API, or design pattern: the ability to be used orthogonally with other features.

In fact, composability is really the fundamental issue behind many of these problems. For example, the Store API returning nil for non-existent values was not composable with storing nil for non-existent phone numbers.

C# addresses some problems of NULL with Nullable<T>. You can include the optionality (nullability) in the type.

But it suffers from a critical flaw: Nullable<T> cannot be applied to just any T. It can only be applied to non-nullable value types. For example, it doesn’t make the Store problem any better.

string is nullable to begin with; you cannot make a non-nullable string

Even if string were non-nullable, thus making string? possible, you still wouldn’t be able to disambiguate the situation. There isn’t a string??

The solution

NULL has become so pervasive that many just assume that it’s necessary. We’ve had it for so long in so many low- and high-level languages, it seems essential, like integer arithmetic or I/O.

Not so! You can have an entire programming language without NULL. The problem with NULL is that it is a non-value value, a sentinel, a special case that was lumped in with everything else.

Instead, we need an entity that contains information about (1) whether it contains a value and (2) the contained value, if it exists. And it should be able to “contain” any type. This is the idea of Haskell’s Maybe, Java’s Optional, Swift’s Optional, etc.

For example, in Scala, Some[T] holds a value of type T. None holds no value. These are the two subtypes of Option[T], which may or may not hold a value.

The reader unfamiliar with Maybes/Options may think we have substituted one form of absence (NULL) for another form of absence (None). But there is a difference — subtle, but crucially important.

In a statically typed language, you cannot bypass the type system by substituting a None for any value. A None can only be used where we expected an Option. Optionality is explicitly represented in the type.

And in dynamically typed languages, you cannot confuse the usage of Maybes/Options and the contained values.

Let’s revisit the earlier Store, but this time using ruby-possibly. The Store class returns Some with the value if it exists, and a None if it does not. And for phone numbers, Some is for a phone number, and None is for no phone number. Thus there are two levels of existence/non-existence: the outer Maybe indicates presence in the Store; the inner Maybe indicates the presence of the phone number for that name. We have successfully composed the Maybes, something we could not do with nil.
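The same two-level composition can be sketched in Java with java.util.Optional; the store contents and names here are hypothetical.

import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class PhoneStore {
    public static void main(String[] args) {
        // Hypothetical store: the value is Optional.empty() when the person has no phone number.
        Map<String, Optional<String>> store = new HashMap<>();
        store.put("Alice", Optional.of("555-1234"));
        store.put("Bob", Optional.empty());          // Bob is in the store, but has no number

        // Outer Optional: is the name in the store at all?
        // Inner Optional: does that person have a phone number?
        Optional<Optional<String>> alice = Optional.ofNullable(store.get("Alice"));
        Optional<Optional<String>> carol = Optional.ofNullable(store.get("Carol"));

        alice.ifPresent(number ->
            System.out.println("Alice: " + number.orElse("no phone number")));
        System.out.println("Carol in store? " + carol.isPresent()); // false
    }
}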

Not only is the functional way more succinct, it is also a little safer. Remember that option.get() will produce an error if the value is not present. In the earlier example, the get() was guarded by an if. In this example, ifPresent() obviates the need for get() at all. It makes there obviously be no bug, rather than no obvious bugs.

Options can be thought of as a collection with a maximum size of 1. For example, we can double the value if it exists, or leave it empty otherwise.

option.map(x -> 2 * x)

We can optionally perform an operation that returns an optional value, and “flatten” the result.

option.flatMap(x -> methodReturningOptional(x))

We can provide a default value if none exists:

option.orElse(5)
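A rough, runnable recap of the three snippets above (methodReturningOptional is a hypothetical stand-in):

import java.util.Optional;

public class OptionalOps {
    // Hypothetical operation that itself returns an optional result.
    static Optional<Integer> methodReturningOptional(int x) {
        return (x % 2 == 0) ? Optional.of(x / 2) : Optional.empty();
    }

    public static void main(String[] args) {
        Optional<Integer> option = Optional.of(3);
        System.out.println(option.map(x -> 2 * x));                           // Optional[6]
        System.out.println(option.flatMap(x -> methodReturningOptional(x)));  // Optional.empty (3 is odd)
        System.out.println(option.orElse(5));                                 // 3
        System.out.println(Optional.<Integer>empty().orElse(5));              // 5
    }
}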

In summary, the real value of Maybe/Option is

reducing unsafe assumptions about which values “exist” and which do not

When is NULL okay?

It deserves mention that a special value of the same size as the data, like 0 or NULL, can be useful when shaving CPU cycles, trading code quality for performance. This is handy in low-level languages like C, when it really matters, but it should be left there.

The REAL problem

The more general issue with NULL is that of sentinel values: values that are treated the same as others, but which have entirely different semantics. Returning either an integer index or the integer -1 from indexOf is a good example. NUL-terminated strings are another. This post focuses mostly on NULL, given its ubiquity and real-world effects, but just as Sauron is a mere servant of Morgoth, so too is NULL a mere manifestation of the underlying problem of sentinels.
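Java’s own String.indexOf is a handy concrete taste of the sentinel problem:

public class SentinelDemo {
    public static void main(String[] args) {
        int i = "hello".indexOf('z');
        // -1 is an int like any other index, but it means "not found":
        // the same value-that-is-not-a-value trick as NULL.
        System.out.println(i); // -1
    }
}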


87 Comments

It sounds like the problem is not with NULL but with the C/C++/Java-style implementation of it.

For example, Common Lisp has NIL (not “nil”), which you call CL’s NULL value, but it’s very different from the similarly named C/C++/Java concept. #3 and #5 don’t even make sense in CL. I don’t think #6 or #7 apply, either. #1 and #2 don’t really mean much, either, since CL doesn’t make any promises about types, unless you explicitly assert them (in which case you can easily assert non-NIL-ness, too).

What’s left? “NULL makes poor APIs.” Hmm, maybe. As a Lisp programmer, though, it doesn’t seem so bad, since none of those other problems really exist for us. Besides, we have multiple return values, which helps avoid many of the API problems you cite. Just as NULL “exacerbates poor language decisions”, its trouble can be muted by *good* language design decisions.

In philosophical terms, NULL is a weird case, but in practical terms, it’s not really a problem for me (as long as I stay outside of the C/C++/Java world). On a daily basis, I’m much more troubled that, say, exception objects aren’t class instances, or that defstruct pollutes my namespace.

doesn’t do what you think it does. In fact, I don’t think that second clause can ever be true b/c it’s testing if it’s the same object as the empty string object you’re just now instantiating inline. Needs to be:

if (str == null || str.equals(“”)) {

}

But the “str == null” does what you want! Which I guess proves the main point about null muddying the waters yet again. (In Java, null works for physical equality in places where you’re trying to use equality by value.)

Great article, but by your standards, Haskell should also be four stars, because it has Foreign.Ptr.nullPtr, which is basically like Rust’s std::ptr::null, and basically just used for FFI bindings. So either Rust should be 5 stars, or Haskell should be 4.

I can’t say for certain about the other languages, but the Python “None” value is NOT the same as null. None has a value (None), it has a type (NoneType), it can be introspected, it can be composed, and so on.

Of course, it doesn’t do very much, but it’s semantically similar to lots of other small classes that don’t do much.

I think you’re being too harsh on Objective-C there. nil, Nil, and NULL are all the same null (differentiated solely for contextual readability), and NSNull isn’t a null at all. Messages to nil/Nil/NULL are no-ops if they don’t return anything, and return nil/Nil/NULL, 0, 0.0(f) or zeroed structs if they do.

Behaviorally, NSNull is more similar to classical trapping nulls (you get a runtime exception trying to message it anything it doesn’t respond to, which is better defined than in C), but it’s used where a nil/Nil/NULL can’t be used. In the Store example above, Bob’s phone number could be set to [NSNull null] rather than nil. And it is composable in the other direction that you can have an NSNull property whose value is nil. Given that Objective-C is a dynamic language that barely pretends to be static (replace all your object pointer types by id, and everything will still compile and run fine, with maybe a few more warnings), that’s pretty good.

Peter: you’re wrong about that string comparison. String literals in java are automatically interned, so eg “”==”” will work – you don’t get two separate instances. str==”” will work if str is an interned string (ie it was explicitly interned, or was assigned from a literal). JLS 3.10.5:

Moreover, a string literal always refers to the same instance of class String. This is because string literals – or, more generally, strings that are the values of constant expressions (§15.28) – are “interned” so as to share unique instances, using the method String.intern.

The problem is that null will never go away as an implementation detail, at least in my world of high performance coding. In C/C++, there’s no good way to implement optional types without RTTI (which is a terrible idea) or carrying around extra state in a tagged union or special type. With null pointers, an address is the same size as a register and can be tested with simple instructions. The need for simplicity only gets worse as you move down to assembly.

Although that being said, we still use things that look awfully like optional types in high level APIs. In a project I’m working on, expected errors (we dont use exceptions) get returned in a struct that looks like this:

FYI NULL in “C” is not a value but a character it’s from the ASCII character set. When you hit the “NULL” key the terminal sends:
START BIT
THE VALUE 0
STOP BIT

In comparison if you hit the “ZERO” key the terminal sends:
START BIT
THE VALUE 48
STOP BIT

Because the NULL character has a Boolean value of false it is very easy to detect the end of the string:
if(!string(x)) // indicate end of string

also CR (\r) and LF (\n) are often used as string terminators; they are not. They can be used anywhere within a string; they are simply controls for the display or printer to move the cursor. { note: the \n should read “next line” and not “new line”}

In “C” the compiler uses the “;” as the end of string, this allows multiple string: per physical line:
ex- a=1; b=3; c=a+b; if(c==3) c=0;

or you can have a string spanning many physical line:
ex- typedef structure NewType
{
int alpha, beta,
long foo,
floating bar
};

bottom line is it’s up to the developer to decide what he will use as an end of line/string character.

I wonder if behind the scenes the option concept is implemented in is_some (or whatever) by testing the this pointer against null?

Two points of substance:

1. The repetitive stress disorder problem is not solved. We still have to test against is_some to decide how to process an instance.

2. The phone book illustration suffers from another implementation error (of which null is often used in special-case work-arounds) and that is to INTERPRET AN ASSIGNED VALUE AS A RESULT CODE. This is a far more frequent abuse of type safety than Hoare’s NULL, and in any non-trivial environment makes exception handling a nightmare. In most of the languages listed, proper practice is to pass the target by reference (i.e. as an OUT parameter) and return a specific result code. A failed method should never modify the caller’s context.

Yves:
Not exactly. “NUL” (one L) is the ASCII character, while “NULL” (two Ls) isn’t actually defined IN C (at least not in C89 — I haven’t tracked it lately), but is generally defined by implementations as a pointer (16- or 32-bits) with the binary value of zero.

A vitally important part of nearly any programming language. While computer programs normally think in terms of particular values, sometimes you need to express the lack of a value.
This sort of effect can be kludged via magic numbers like -1 or other arbitrary values, but it’s better if your language of choice does it internally, eliminating any possible ambiguity.

I agree completely with that. My point is that NULL is also such a “kludgy” value, that causes semantics to contradict types (whether dynamic or static). Maybes/Options are the “non-kludgy” solution to optionality.

The last is still needed, but needs a different syntax that forces the programmer to write something for “If it is NULL then …”

The better solution(s) are

1) To have Non Nullable types as primitive and add the ability to overload as you suggest

2) To have Nullable types and add the ability to overload for a Non Nullable

Sadly in most languages the primitive types do not have a NULL, for a very good reason, and do not allow the needed overloading of primitive types (for no good reason that I can see).

I could also get into the argument of whether type safety is always a productivity gain or not. My opinion/experience is that type safety is useful sometimes and is not useful sometimes. I get tired of type safety being intrinsic to a language/compiler and assumed by people to always be a win.

my biggest c# pain is when you use entity framework to retrieve a single object from the datastore. Consider this

var x = myentity.FirstOrDefault(e => e.key == someval);

now, after this is called you have to do a

if (x != null) {
}

every single time, otherwise any call to check a property causes null exceptions. If you accidentally miss this, boom, your program crashes. While the C# 6 ?. notation will help, it’s still not a great way to deal with it. Call me naive, but life would be easier if there was a fallback where, if the lambda doesn’t match any records, it returns an object that gets its values from the default constructor. If a string is null, then default the string to string.Empty to avoid a null exception when you try to trim it.

Interesting article, with a good overview of lot of popular languages…

You didn’t point out that most modern languages, like Rust, Ceylon, and Kotlin, are aware of the issue and try, in various ways, to address it.

Rust eliminated null entirely, although I suppose it has to deal with it when interoperating with C; I guess it is restricted in an unsafe layer.
JVM or JavaScript based languages have to deal with it, for interoperability reasons.
But null isn’t so bad if it is tamed, it can still be used to signal absence of value.
Ceylon solves the issue quite elegantly, putting it in the Null type, and forcing you to declare explicitly whether a type is nullable, by using union types: Integer cannot be null, and the compiler will complain if you try to put a null there; and variables must be explicitly initialized.
Now, you can declare a return type, a parameter type or a variable as Integer | Null (abbreviated as Integer? since it is a common case, and it shows clearly the optional nature of the value), allowing to set it to null to mark absence of value (eg. when parsing an incorrect string value; better than throwing an exception, too verbose and performance hit).
The nice touch is that the compiler is aware of such type, and forces you to test if a value is not null before allowing to use it. So, no NPE for Ceylon!
Even nicer, only one test is needed in the code path following the test: the compiler knows the value is not null and therefore won’t force you to do the check on each usage.
I find this solution very elegant, and since it is built in the language from the start, there is no risk of forgetting to wrap a value in an Option type…

In addition to seconding Dan D’s comment about Objective-C not being as bad as it looks, I’d like to add that the coverage of Swift is still a little bit confused.

The ! (unwrapping) operator is basically equivalent to the .get() method of Java 8’s Optional, only more compact and usable as an lvalue. unsafeUnwrap() is equivalent to ! except that it bypasses the actual nil check and simply assumes it has been passed an Optional.Some; you would only use it in extremely performance-sensitive code. (Hence the “unsafe” in the name.) Both of these are used with the Optional type (and its cousin, the slightly more dangerous ImplicitlyUnwrappedOptional).

The various species of UnsafePointer can also be nil, but this has nothing to do with unwrapping, and UnsafePointer is a low-level feature that shouldn’t be used very often, only a step or two above the FFI stuff you excuse in other languages. They also allow pointer arithmetic, accessing unallocated memory, leaking allocated memory, and other dangerous shenanigans. Like unsafeUnwrap(), they’re explicitly marked “unsafe” for a good reason.

Null, as you describe it, is a value that is not a value. So the C pointer value NULL[1] is not a pointer, and that’s a good example of what you’re on about.

But a char is not a pointer. A char is basically an integer type[2] of a particular size. For char values from 0 to 31, the values have names in the ASCII set. The value 0 is called nul – and it should be spelt so, with one ‘l’. But it’s just one normal value out of the set of values that a char can have. The originators of the C runtime library chose the character value 0 as the sentinel value for strings. They could have chosen any other value (but 0 was the best choice for many reasons). But the nul char is definitely not “a value that is not a value”. It’s just a value.

Flick

[1] For pointers: NULL and 0 may be interchangeable as pointer values in the C source code, but the compiler will have to do some fancy footwork if the address 0 could actually be a valid pointer value in a normal program… fortunately, in most architectures, it’s used by the hardware for vectors or whatever, and cannot be a valid address for data. So, yes, the value 0 is a null value for a pointer to char. But for the char itself, 0 is just another value. Pointers are not chars, and chars are not pointers.

[2] and yes, I know that people sometimes have to put the (non-zero) integer value of an address into a pointer. But integer types are not pointers. it’s just that, on most architectures, such casting is possible.

I enjoyed reading this, perhaps because I agree with the position. I would love for him to give UTF-16 the same send-up as NULL. I’m skeptical that I would agree on that one, but it would be a good read nonetheless.

We could write examples in any language that would fail in some way, and testing a pointer before using it is C/C++ 101 – no self-respecting developer I know would admit to writing such poor code. Further – I would actually want an exception to be thrown in that case so that I could identify the root cause of misbehaving software, which is clearly defective code.

C++ is a powerful language because of its versatility, which means its not for everyone. For everyone else there’s VBA.

The BIGGEST mistake in CS is the modern movement of assuming it’s a good idea to automate memory management to accommodate mediocre computer science graduates who have never acquired the discipline to manually manage memory but instead rely on expensive and problem-prone devices like automatic garbage collection and automatic reference counting. The current spate of high-level languages that attempt to hide how the von-Neumann architecture actually works might make it easier for novices to enter, but ends up making everyone jump through ridiculously unnecessary hoops when trying to leverage more efficient binaries written in “legacy” C or C++ simply because NO ONE EVEN KNOWS WHAT A POINTER IS ANYMORE. A zero is a zero, whether you call it a NULL, a nil, or a nullptr. If you can’t keep track of a pointer, you have no business calling yourself a programmer.



You should learn to read more carefully. Tony Hoare did not call “null” a mistake, he called *null references* a mistake. Null as a concept predates ALGOL W and is quite useful when called for, that is, when you need a value that is not a value. This much is clearly demonstrated by type systems that provide for optional values. Speaking of which, you’ve left out C#’s Nullable on your chart.

The NULL reference problem is an instance of a more general problem: static types cannot adequately capture all types we really want to use.

Our computations can be regarded as functions, taking values and producing values.
Static typing is supposed to guarantee that functions aren’t called on invalid values.
Ideally, for every function we ever use, we’d have a type for both its domain and its range.

But functions aren’t always surjective or injective. The NULL problem is the case where a function produces an object of a certain type but not always – hence, it can produce NULL – and another function taking the result. Option/Maybe elegantly addresses that. But there are other cases. Many operations on numbers, for instance, are undefined in some cases (e.g. division by zero). So in programs that divide, we should really have a “nonzero integer” or “nonzero floating point number” type. But overflow cannot be dealt with in that way. So it is not possible to completely eliminate the problem in general.

Julia is another new language, which does not have an overarching NULL.
It does have a type, `Void`, whose only possible value is `Void()` (also called `nothing`)
You can also create things of the `Nullable{T}` type, which can hold a value of type T, or nothing,
in a type-stable fashion.

C# having Nullable is kind of irrelevant to its score, because it does nothing to save you from the worst thing about c# – the Null Reference Exception – because the type system has no way of expressing (or proving) that a reference cannot be NULL.

You do get NULL propagation, but again the compiler can’t enforce that you use it, or that you use it correctly, or whether or not an expression using it will evaluate to a non NULL value.



Ironically, while documents for Go decry the use of -1 as an error indicator in C, the recommended use of panic() / recover() is to check if the panic value is nil. So, if an error that leads to a panic also fails to set the panic value…
To fix:
func f() {
    var nopanic = false
    defer func() {
        if !nopanic {
            panic_value = recover()
            // ...
        }
    }()
    // do stuff that might panic
    nopanic = true
    // return return_value
}

Just be certain that return_value is not a computation that can panic!

—

The intersection of database NULL and language nil is a rich source of surprising behavior. Just don’t.

The example on Ruby is unfair. Certainly, bad programmers will do as indicated, but Hash#has_key? is the way to check that a key exists. But also be aware that Hash.new permits the creation of default values, and also the use of a block to compute the value of a missing key. Ruby does not give you enough rope to hang yourself–it sets you down by yourself in the middle of a rope warehouse.

Hilariously missing the point. Recommend using Optional while completely missing that the Option class stores a null value internally. Optional wouldn’t be possible without nulls, unless you hackily exploit an empty list (which is still very deep down a null terminated array). Safely wrapping nulls doesn’t remove them from existence, it just makes it not your problem.

I’m all for better null safety, much in the way Kotlin and Swift provides, but they still HAVE nulls. Without nulls, optional fields are impossible to represent via object oriented programming. Even wrapping your objects in an Optional class still relies on the existence of the null value internally…

I really don’t get why people hate null so much. Just get better at writing unit tests, or use one of the many languages that provide better null safety.

In well-written, idiomatic C++, NULL is really rare, needed only when we use dynamic allocation. Objects are stack-allocated by default, and references cannot refer to null. And of course, you can and should use Optional, or smart pointers with good exceptions for dereferencing null (not the standard ones).

Nulls are perfectly fine and asking for them to be removed from a language or have a language designed such that it does not have the concept of null is both ridiculous and impossible. This is because all languages that “don’t” have a null have the Option or Maybe type and they are just another incarnation of NULL. To put it another way, There is no difference between the Option/Maybe monad and NULL!

Additionally, maybe is a subclass of Either, specifically Either. A Maybe instance can Either be a value OR it can be null. All variables in the “{}” type languages (C,Java, etc.) don’t have the concept of a pure reference, instead every variable is a maybe. The reference can either be an object/value OR it can be null.

So please, don’t hate null. Null is extremely important. Hate the people that don’t understand that when you declare and int*, you are really declaring Either. If your compiler understood this distinction then you would never get a NullPointerException.

Sir C.A.R. Hoare, a pioneer in computer science and co-researcher of Edsger Dijkstra (ALGOL) and Ole-Johan Dahl (Simula), had the courage to admit it – a billion-dollar mistake. Though the null reference was first proposed for object-oriented language implementations, it first became more visible in data processing when relational databases supported the concept of NULL. Until then, database implementations (IMS/IDMS etc.) were happy with 0 or spaces. Though philosophically intuitive and elegant, NULL in database implementations did not serve any real practical purpose at all. In fact, just like object-oriented language implementations, it gave rise to a variety of anomalies in SQL. The purpose of databases is to keep concrete information and not to play around with philosophy! In SQL, 1+0 = 1 but 1+NULL = NULL, and AVG(1,0,NULL) is 0.5! MIN(1,0,NULL) is 0, and so on. And the NULLs in databases, propagated to Java, gave rise to further troubles in programming.

A seldom-discussed issue is how databases implement NULL behind the scenes! Many think that NULL does not take any space. In fact each nullable field has an extra hidden field in the database system which keeps the null information! When you use embedded SQL, you need an extra null-indicator byte for every nullable field, and programmatic checking! In most relational database implementations, NULL is the default and you need to specify NOT NULL in the definition to override it. I have seen many databases with nullable fields containing NULL without the application people really being aware of it, or of the dire consequences!

So 100% agree with Prof Hoare and the Young Author here – Paul Draper – that NULL indeed was a mistake. Huge mistake.


Reinier Post has the correct answer, however few people know those mathematical terms. This blog post is promoting a so-called solution which actually is not the correct solution, and means lots of extra unnecessary work. The underlying problem is that most languages follow too closely the computer hardware’s limitations. An integer can hold only a numerical value, when in practice this is insufficient. There are simply too many instances where you want to store a numeric value and some other answers. Like when asking for someone’s age, you could have a number, or you could have “decline to state” as a valid answer. Most languages have null or undefined and a value as the only two kinds of numeric values. This is often not enough, and things get funky trying to reserve values or use -1 as meaning something special. Once you do some arithmetic on that -1 you suddenly change “decline to state” into a 3 year old…. Beads is the only language I know of besides Mathematica that can extend arithmetic, so that division by zero, infinity divided by infinity, etc. are all defined, not to mention all the special cases which are so nasty to program. None of the languages mentioned above, Kotlin, Go, Swift, Ruby, etc., do it correctly, and it is the slavish following of tradition that is holding us back. Time for the liberation of arithmetic from the ancient hardware limitations.

Arguably, Go should have one star since the typed nil interface nonsense generates infinitely many nils*. Typed nils go so far as to break transitivity of equality: https://play.golang.org/p/nRH6yJV0d6e



we replace it with Option in Rust… now instead of people having crashes everywhere due to null pointers, they have crashes everywhere due to people writing Option unwrap() everywhere. it has basically gone from having optional ‘if null’ guards at the beginning of every C++ function that takes a pointer, to “assert(null)” enforced by the compiler. that’s basically what unwrap() is, assert on null.

in the end the invention of null was as important as the invention of 0. it is different but it fits in to our existing system of data and if we got rid of it completely we couldn’t do anything. it is how we deal with it that is important.

and as with 0 it will take us a while as a civilization to “get there”. we still have disagreements about implementations of “division by 0” even thousands of years after it was invented. some people say it should be NaN, some Infinity, some think you should crash instantly, some just say convert it to 0. then we have geometric interpretations like mapping infinity to a finite point in another dimension, through projection.

as soon as mathematics invented 0, people had to deal with it. as soon as mathematics invented the null set, people had to deal with it. as soon as people invented addresses for things, there were invalid addresses. even the post office has to deal with NULL, in the dead letter office.

inside a computer it’s just a bunch of circuits holding a voltage, after all. there isn’t some system where every single input and output can be thought through entirely, otherwise nothing would ever get done.

Interesting article, even though parts seem subjective and exaggerated, there are a few good points.

I’d advise to have a look at Kotlin and C#, both handle that pretty well. Kotlin probably better, even though for all the care it brought on the compiler side, it is still based on the Java bytecode. Thus it suffers from the type erasure issue and could have weak points at runtime that could bypass the nullable layer.

Java.Optional came too late, and it’s too dependent on adoption, so unfortunately unless those features are embedded in the language from day 1, there is little that can be done.

And no matter what care you bring, there will be cases in which NULL is necessary and must be handled manually (see typical cases of UI callback and the necessity of the !! operator in Kotlin, for instance). So ultimately the user must pay attention – it’s their job after all, and NULL will remain a necessary evil.

The key-value store example is ridiculous.
The problem is not nil. The problem is that your model does not fit the return type of the store api.
You want a type that represents not-cached, no-number, number and misuse the store return type that represents not-in-store, valid-value.
The Some solution just adds another state to the return type to fit your three-state model.
What if you need a 4 state return type for your model? Do you use some(some(‘000’))?
This problem has nothing to do with nil.

This is a great article, thanx for publishing it. But I am not sure we should throw “NULL” under the bus just yet. It seems pretty canonical, and as I grab my copy of the “PDP-11 Peripherals Handbook”, on page B4 at the back is the 7-bit octal representation of the ASCII code, from 000 (NUL) and 001 (SOH) up to octal 177 (which of course is DEL).

Check my “Gilman and Rose”, “APL – an Interactive Approach”, and page 304 explains how to use NULL to set up data-tables with embedded Nulls, but which print and it “takes NO time, just as if it isn’t there at all!” (used instead of the “idle” character.) Grab my Windows APL Plus-III for Windows (from Manugistics), and the []AV (Quad-AV), for the APL “Atomic Vector”, (256 chars long now), starts with NUL, which is also []TCNUL (a terminal-control character). Trying to get rid of the NULL character seems like trying to get rid of the “U” in English. (Like the original Latin. Don’t need it, just use “V” instead, right?)

See, you left out my fav. language, which is APL. A sensible language like APL, which allows one to operate at a higher level of abstraction, avoids most of the issues you describe that can create problems by allowing NULL. A string in APL is a string, and can have any characters (including NULLs). An operator, called “rho” allows one to determine if the string is zero-length. APL data variables can be numeric or characters, and can be extended to any level of dimensionality that one wishes to use.

Really, you should no more be using “pointers” and mucking around with machine memory addresses, than you should be concerned about the voltage levels your cpu’s are using. The great mistake in computing was “C”, a weird, low-level retrograde step that is still causing grief.

APL examples require a special character set. I put up a simple example, of using a data-table that contains NULL characters, on my little website. It’s on the first page, and shows how the NULL character, since it does not create any output when displayed, can be useful in the construction of simple English sentences which have correct syntax. http://www.gemesyscanada.com

The APL used is Windows APL from APL-2000 (formerly Manugistics APL, and before that, it was STSC APL.)

Really, used safely, NULL can be kind of cool. It’s there – but it’s not there. It’s an abstraction, yet it is real. Sorta like love and justice and freedom. 🙂