I’m looking into your driveway, which is empty. Can that driveway hold a Honda Civic?

Sure, no problem. In fact, it frequently does.

Now we cross the street and we look at my driveway, which is also empty. Can it hold a Honda Civic?

Yes, I suppose so.

In fact I have a rule that only Honda Civics may be parked in my driveway.

That’s a bit weird, but that’s your business.

Can the present contents of my driveway be placed in your driveway?

Um… I don’t know. That’s a strange question. On the one hand I want to say that the question presupposes a false premise, namely that “the present contents of my empty driveway” refers to an object that can be moved from one place to another. On the other hand, the contents of our two empty driveways seem to be by definition “the same”, namely, “zero Honda Civics”. If you insist on an answer then I suppose I will have to say yes, the present contents of your driveway fit into my driveway.

OK, so, does your driveway contain a Honda Civic?

Obviously not. It’s empty.

Well, aren’t you being inconsistent then? Let’s review the facts.

You agree that my driveway only contains Honda Civics.

You agree that the present contents of my driveway are identical to the contents of your driveway.

And yet you conclude that your driveway does not contain a Honda Civic!

Either you are being inconsistent or there is something wrong with this logic.

I’d agree with that! There’s something wrong with your logic.

And now we see why the is operator is actually consistent. The fact that a null reference may be assigned to a string variable does not make the null reference a string, any more than the fact that your driveway can be empty means that an empty driveway contains a Honda Civic. The is operator does not answer the question “can I assign this reference to a variable of the given type?” It answers the question “is this reference a legitimate reference to an object of the given type?”, and null is not a legitimate reference.
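To make the distinction concrete, here is a minimal sketch. The class and method names (IsOperatorDemo, IsString) are made up for illustration; only the behaviour of the is operator itself comes from the language.

```csharp
using System;

static class IsOperatorDemo
{
    // True only when the reference points at an actual string object.
    public static bool IsString(object value) => value is string;

    static void Main()
    {
        string s = null;    // compiles: null is assignment compatible with string
        object o = "test";  // declared as object, but refers to a string object

        Console.WriteLine(IsString(s));  // False: there is no object to inspect
        Console.WriteLine(IsString(o));  // True: the referenced object is a string
    }
}
```

The assignment on the first line of Main succeeds, and the is test on the same variable still reports false: the two questions are simply different questions.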

Someone once told me that the fastest way to spot the weak point in an argument is to look for the “surely”.

Actually, no. A common conception of types is that a type is a set, possibly infinite, of values, and that assignment compatibility is merely checking to see if a given value is a member of the necessary set. But that’s not the case in C#. The null reference actually is not assigned any type at all; it is explicitly of no type, but it is assignment compatible with a variable of any reference type.[1. In C# 1.0 and 2.0 the specifications said that the null reference was the sole member of a special “null type”, but this concept turned out to not be fruitful. You can’t declare a variable of the null type, you can’t use the null type as a generic type argument, basically it’s a type that you can’t actually use as a type. Mads deleted that concept from the C# 3.0 spec because it simply was not useful and caused more confusion than it prevented.] The assignment compatibility relationship and the type membership relationship are similar in a lot of ways but they are not identical.

Option types also solve the issue of “a string can be set to null, but null is not a string”: once you remove nullable references, you can’t set a string to null. You can set a string? to null, or an option to null, but not a string.

You do not need a null value to have a Turing-complete language. C# (or any other language) could have been created without it and simply have used maybe monads (aka nullable) for those cases where you might not know the value or not have a value yet. F# hardly uses null; you can, but it’s very uncommon to do so.

Just imagine how much processing time is wasted on the ‘something == null’ tests we have to do everywhere. All because you know for certain that someone, somewhere, will run your function and pass it null.

Let’s use Maybe<string> then, and since “Maybe a” is a monad, with a conventional syntactic sugar we don’t even have to perform case-checks!

Actually, it would just be the sort of short-circuiting we already have for boolean values: “string f(string x) { if (x != null) return x.whatever(); else return null; }” would be written as “string f(string x) { return x.whatever(); }”, and the point operator would have the following semantics: if a is null, the value of a.m() is null; otherwise, it’s the result of calling method m() of a. If m() has a return type of void, a.m() is a no-op.
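For what it’s worth, C# 6 later added an opt-in operator with roughly these semantics: the null-conditional operator ?. rather than a change to the plain dot. A small sketch, with hypothetical method names (Shout, ClearIfPresent):

```csharp
using System;
using System.Collections.Generic;

static class NullConditionalDemo
{
    // Returns null when x is null, otherwise the upper-cased string.
    public static string Shout(string x) => x?.ToUpperInvariant();

    // For a void method, ?. skips the call entirely when the receiver is null.
    public static void ClearIfPresent(List<int> list) => list?.Clear();

    static void Main()
    {
        Console.WriteLine(Shout("abc") ?? "(null)");  // ABC
        Console.WriteLine(Shout(null) ?? "(null)");   // (null)
        ClearIfPresent(null);                          // no exception, nothing happens
    }
}
```

Because the short-circuit has to be requested with ?. at each call site, the ordinary dot keeps throwing on null.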

And what if passing in a null value should be an error? In this case the compiler would be blissfully hiding the problem (to potentially cause unexpected issues in unrelated code later). And programmers NOT doing the checks they should are the common cause of null errors and security holes (e.g. buffer overrun checks) as it is.

Now if the compiler required something like “Maybe f(Maybe x) { return x.whatever(); }”, it might be more acceptable as it explicitly indicates to the compiler/runtime that such implicit “translations” might be allowed. But even that case doesn’t account for business logic where the input could be optional, but that the optional return has nothing to do with the optional-ness of input (e.g. if the input is missing a constant value must be returned, and a value is not returned only if the input and/or something would generate an invalid return [perhaps due to a math computation]).

An implicit no-op may be worse if the case is “void f(mytype x) { x.performManditorySecurityCheck(); }”. Oops! The required security check was no-op’d.. meaning all code called after this point is potentially at risk (I would have preferred the run-time error if it stops a hack attempt).
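A sketch of the difference, assuming a made-up Request type standing in for “mytype” (the names Request, Handle and HandleSilently are hypothetical). The silent version uses C#’s null-conditional operator ?. to imitate the proposed implicit no-op:

```csharp
using System;

// Hypothetical stand-in for "mytype" in the comment above.
sealed class Request
{
    public bool Checked { get; private set; }
    public void PerformMandatorySecurityCheck() => Checked = true;
}

static class GuardDemo
{
    // Fails loudly on null instead of skipping the check.
    public static void Handle(Request x)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));
        x.PerformMandatorySecurityCheck();
    }

    // Imitates the implicit no-op: compiles, but does nothing for null.
    public static void HandleSilently(Request x) => x?.PerformMandatorySecurityCheck();

    static void Main()
    {
        HandleSilently(null);  // no error, and no security check either
        try { Handle(null); }
        catch (ArgumentNullException e) { Console.WriteLine("rejected: " + e.ParamName); }  // rejected: x
    }
}
```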

Let’s be clear though, the problem is the lack of non-nullable reference types. The existence of nullable reference types isn’t any more a problem than the existence of nullable value types is. Nullability is a useful tool, it just sucks that we are forced to use it in places where we don’t want to.

When an array of a reference type is created, all the elements come into existence simultaneously. While it would be possible for a compiler or framework to require that all array elements be written before any could be read, and it might be useful to have immutable array types which enforce such a restriction (e.g. by having a constructor which takes a `Func` and calls the function once for each slot in the array, before an array reference is exposed to any code that might try to read the array), arrays are often used in contexts where it is in general not possible to statically ensure that every value which is read will have had something meaningful stored in it first.
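The constructor-with-a-function idea might look something like the sketch below. FilledArray is a hypothetical type, and I am assuming the function takes the slot index (Func<int, T>), which the comment does not spell out.

```csharp
using System;

// Hypothetical read-only array whose slots are all written before anyone can read them.
sealed class FilledArray<T>
{
    private readonly T[] _items;

    public FilledArray(int length, Func<int, T> factory)
    {
        if (factory == null) throw new ArgumentNullException(nameof(factory));
        _items = new T[length];
        // Every slot is filled here, before the reference escapes the constructor.
        for (int i = 0; i < length; i++)
            _items[i] = factory(i);
    }

    public int Length => _items.Length;
    public T this[int index] => _items[index];  // no setter: contents never change
}

static class FilledArrayDemo
{
    static void Main()
    {
        var names = new FilledArray<string>(3, i => "item" + i);
        Console.WriteLine(names[2]);  // item2
    }
}
```

Note that nothing here stops the factory itself from returning null, so this only guarantees that each slot was written, not that it holds something meaningful.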

Fundamentally, it is impossible to ensure that every array element will always have a meaningful value at the time it is read, since it’s possible that the information necessary to produce a meaningful value may not yet have become available. Having array elements contain something that is conspicuously not a reference to a valid object instance is more useful than requiring that elements be initialized to point to valid instances even when constructing a meaningful instance would not be possible.

I just finished designing my own implementation of the Nullable class, as a thought exercise last week.

It really drove into my head the difference between “this is a perfectly valid instance of a ValueType” (ie a Civic) and “this is just plain nothing” (ie null).

Basically, anyone who is telling themselves that
string s = null;
results in a variable “s” that is a perfectly legitimate, usable string is telling themselves that
Civic c = null;
results in a perfectly legitimate, usable Civic.

Just because the language will let you does not make this true, and you’re not going to experience much fun sitting in your invisible car in your driveway here.

I guess I’m not surprised that they’re confused that
bool b = s is string; //but b is false!
because in this example they expect null to be a string. But when you really think about it, that isn’t true at all!

Null is the distinct *lack* of a string, lack of a Civic, etc. Hence my bringing up the Nullable generic.

And when you think about it THAT way, b == false makes PERFECT sense.

Though if one is still confused, I suppose you can always throw in the “as” keyword, break people’s minds some more 😉

I think the confusion is rooted more in how you read the code as an English sentence.
As it is read, you are not asking “null is string”. You are asking “s is string”. Since s was declared as a string, it’s only logical when reading it that you would get true back.

This case is just a small reminder that computers don’t always think like we do.

If s is declared as a string, I already know it’s a string, so why would I bother asking that question? The only useful way to interpret “is” is to think that it asks about the content of the variable, not its declared type. Otherwise, you could have:
object o = "test";
bool b = o is string;
Then b would be false, and that’s of no use at all.

IMHO, .NET should have defined, in addition to a string reference type, a string value type containing a single field of the string reference type. A variable of the value type whose internal field was null would behave as an empty string, rather than as a null reference. If the language keyword `string` aliased to the value type, then `string s = null;` would not compile, but `string[] s = new string[5];` would cause `s` to behave as an array holding five empty strings, rather than five null references. Making things work optimally would require implementing slightly-unusual boxing rules for the string type, but no worse than the rules for `Nullable`.
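A rough sketch of what such a wrapper could look like in today’s C#. StringValue is a hypothetical name, and this ignores the boxing rules and the `string` keyword aliasing mentioned above; it only shows the “null field behaves as empty” part.

```csharp
using System;

// Hypothetical value type wrapping a string reference.
readonly struct StringValue
{
    private readonly string _value;  // null in a default-initialized instance

    public StringValue(string value) { _value = value; }

    // A null inner field is presented as the empty string.
    public string Value => _value ?? "";
    public int Length => Value.Length;
    public override string ToString() => Value;
}

static class StringValueDemo
{
    static void Main()
    {
        StringValue[] s = new StringValue[5];  // five default structs, no null references
        Console.WriteLine(s[0].Length);        // 0
        Console.WriteLine(s[0].Value == "");   // True
    }
}
```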

So now we have the same horrible annoyance as Oracle DB, which treats '' (the empty string) and null as the same? How would someone differentiate between a known value that is blank and an unknown value (and not pass a boolean “status” along with every relevant string, or make a wrapper object that has both, ugg)?

If your usage prefers having empty strings by default, then having some special syntax for specifying the default values of an array might be better. A fictitious syntax might be ‘string[] s = new string[5](“”)’.

So, being consistent, how would null be done for a custom class that has no [single] string to be empty (or any field that could have a “special” state)?

I was under the impression that in type systems null is (implicitly) a subtype of all reference types. Similar to how Object is at the top of the type hierarchy, null is at the bottom. Everything is an object, nothing is null.

_Is_ ‘a Mammoth’ an Animal? Yes, since Mammoth is a subtype of Animal. _Is_ ‘a Null’ a String? Then also yes, since Null is a subtype of String.

A null value indicates the lack of a string. If null were to be a string, then you would be able to ask for its length. But (null as string).Length will fail, because there is no string.
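Here is that failure as a small sketch (the helper name LengthThrows is made up): the cast with “as” is accepted by the compiler, but there is still no object behind the reference.

```csharp
using System;

static class NullLengthDemo
{
    // Reports whether reading s.Length fails because there is no string there.
    public static bool LengthThrows(string s)
    {
        try
        {
            _ = s.Length;
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    static void Main()
    {
        string s = null as string;             // compiles: the result is just null
        Console.WriteLine(s == null);          // True
        Console.WriteLine(LengthThrows(s));    // True: no string, so no length
        Console.WriteLine(LengthThrows("ab")); // False
    }
}
```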

It’s important to make a distinction between variables and values. If you declare a variable of type string without assigning a value to it, its value will be null, which is not a string. Or in the driveway analogy: if you reserve some space on your driveway for a Ferrari, it does not automatically give you a Ferrari. That empty space is not a Ferrari, unfortunately.

_Is_ ‘a Mammoth’ an Animal? Yes, it is (or at least was).
But what’s the answer to “_Is_ nothing an animal?”
The misconception you have is that null is an instance of a string (or anything else for that matter). Null is the _lack_ of an instance of anything.
There’s just one null. You are proposing that null is an instance of string when used as a string and an instance of Form when used as a Form. For that to be the case we would either have to have multiple inheritance in C# (which we do not) or a null instance for each type. In the latter case we might as well have maybe monads then and no null.

It would be kind of like having a wall of bins that you can put boxes into (boxes are all the same, only their contents differ). Each bin is designated only for a certain type. When placing a box into a bin it is checked to be compatible (i.e. not a different type). But of course you can put an empty box in a bin (and it is allowed since it is not “the wrong type”). As a result you _could_ take an empty box from one bin and stick it in another.

Using the driveway analogy, a gate/guard that blocks the driveway only allows Honda Civics to go through, but the “is” question being asked only looks at the parking spot itself and doesn’t know/care about what is allowed to get there.

So… you tell us that Val(X), the set of all values of type X, and the set { null } don’t intersect for any reference type X, and that a variable of type X can hold a value from the (disjoint) union Val(X) ∪ { null }.

Man, that sounds awfully like Maybe X, even the case-of construct is here (in the form of if (x != null) { … } else { … }), except that I don’t get do-notation to short-circuit away all those annoying null-checks.
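For comparison, a bare-bones Maybe in C# might look like the sketch below. None of this is a built-in type; Maybe<T>, Bind, GetOrElse and the two helper functions are all invented for the example. Bind plays the role that do-notation would: the null-check lives in one place instead of at every call site.

```csharp
using System;

// Hypothetical minimal Maybe type: either holds a value or is None.
readonly struct Maybe<T>
{
    private readonly T _value;
    public bool HasValue { get; }

    private Maybe(T value) { _value = value; HasValue = true; }

    public static Maybe<T> Some(T value) => new Maybe<T>(value);
    public static Maybe<T> None => default(Maybe<T>);

    // Runs f only when a value is present; otherwise stays None.
    public Maybe<TResult> Bind<TResult>(Func<T, Maybe<TResult>> f) =>
        HasValue ? f(_value) : Maybe<TResult>.None;

    public T GetOrElse(T fallback) => HasValue ? _value : fallback;
}

static class MaybeDemo
{
    public static Maybe<int> ParseInt(string text) =>
        int.TryParse(text, out int n) ? Maybe<int>.Some(n) : Maybe<int>.None;

    public static Maybe<int> HundredDividedBy(int n) =>
        n == 0 ? Maybe<int>.None : Maybe<int>.Some(100 / n);

    static void Main()
    {
        Console.WriteLine(ParseInt("4").Bind(HundredDividedBy).GetOrElse(-1));  // 25
        Console.WriteLine(ParseInt("0").Bind(HundredDividedBy).GetOrElse(-1));  // -1
        Console.WriteLine(ParseInt("x").Bind(HundredDividedBy).GetOrElse(-1));  // -1
    }
}
```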

But then again, if one throws in subtyping, first- and second-order polymorphism, and mutability, things stop being easy and elegant rather quickly.

If you look at the argument from a SQL standpoint it’s because “NULL is meaningless”. In strict SQL you have to use the “IS” operator because “NULL <> NULL” (similar to infinity actually). In C# I am sure the design decision was to skip an operator dedicated to NULL and just use equals – but meaningless things don’t have a type.

In the example of your driveway, it’s not whether you have a car in the driveway – it’s more about whether you have a rip/hole in spacetime (nothing, void, nirvana, null) out front your house or some form of a driveway (be it tar, cobble or whatnot). Your driveway is a meaningless position until you put something in it (just like real computer memory is).

Finally, foregoing any logical justification, it’s because there is no v-table/i-table available for NULL values (because the v-table/i-table pointer is held in the first few bytes of the object header), so checking the v-table/i-table at runtime is impossible. Essentially: you can’t do this because of reality, not deliberate design. So that might be your answer: “that’s just how it turned out” (C++ has this “problem” as well).

Therefore ‘Car’ is a subset of null (although C# doesn’t let us say ‘foo is null’- which would be useful for looking for value types to be honest) – but null is not a subset of ‘Car’. One could say that to save memory null is represented by a zero-pointer as nothing more than an implementation detail (contrary to my former argument – but that was an analogy).

Oh, I have finally seen a perfect explanation of why SQL’s NULL is so troublesome: it’s because it’s meaningless! And when you throw a meaningless thing in a logic theory, this theory becomes inconsistent (and you don’t know what equals to what).

Speaking seriously, NULL has a pretty reasonable meaning in C#. It’s kinda like the bottom: it’s not a “valid” value, but it is a value, it’s distinct from any other value, there are no other “invalid” values, and of course, it’s equal to itself. Seriously, if “X != X”, you have an inconsistent theory on your hands, which is not very good. Of course, you can evade this problem by claiming that your programming language doesn’t, in fact, have an equality operator; it has something very close to it, but not quite.

Nullity in SQL comes straight out of statistical usage. A null value is missing data. It is, if you would, a special value outside the domain of the variable indicating that the value is either missing or invalid. If the domain of a survey’s response variable is 1, 2 or 3 and somebody fills out a form with 4, from the standpoint of interpreting the response, the value is null. Ditto if somebody fails to answer the question. In fact, some stat systems (Osiris, for instance) have 2 null values, in order to differentiate between truly missing data and invalid data.

You agree that the present contents of my driveway are identical to the contents of your driveway.

And yet you conclude that your driveway does not contain a Honda Civic!

Statement 1 is incorrect – you agree that your driveway can only contain Honda Civics, NOT that it only contains Honda Civics. The difference is that statement 1, as presented, precludes the driveway from being empty – your driveway can contain a Honda Civic or nothing at all. Once this is taken into account, the other two statements make complete sense because the contents (nothing at all) are identical, and my driveway does not contain a Civic.

In his example the “driveway” is just a “Honda pointer”, so having an empty driveway means that the reference has not been assigned yet (null). I think your hole is a little too literal about what null is. In this example there is no concept of a missing driveway (and there is no need for it). You could also have 2 driveways with the same car at the same time, which is not possible in the real world, but he is trying to use a simple case to clarify an idea.

Agreed. I was reading to see if anyone would remember that all this is related to pointers (references).
And you, Samael, hit the point!

C#, Java, and almost any language that uses new is a pointer-based language.
I have discussed this with coworkers; my view is that C# just hides the pointers we used in C++. So everything is a pointer (in fact, everything that derives from object).

In our discussion, with string s = null, s is a pointer, a reference, which for sure is not a string.

Try
string s = null; string m = null; s == m? Yes… both are null pointers; they point to void/null/nothing.
But the moment you do s = "something", s == m is false.

string s = null, this is only telling me that s can only point to object:string…
then s is string… if there is no object then no; if an object has been assigned then yes. Nothing wrong with the “is” when you really understand what ‘string s’ is.

string s = null; // Clearly null is a legal value of type string
that is false, cause null is a legal value of a pointer/reference
default(string) would have some different manner

Thinking in terms of pointers, your example of parking spaces makes a lot of sense…
the parking space (pointer/reference) can have a car (object string) but for sure the parking space is not a car… the IS must be read as CONTAINS.
“S is string?” is more like “Does S contain a string object?”

Samael – forget about the driveway. I think you need to go to Computer Science 102. In C# (referenced in the article) and most other languages, variables are placed in a symbol table. That symbol table has the variable name and an address. The variable’s memory address holds either the actual value of the variable or a pointer to another memory address.
The first kind of variable, where the address referenced in the symbol table holds the value itself, is called a value type (i.e. int, bool, floating point, structs); the second kind is called a reference type (strings, arrays, etc.), which covers all but the simplest types.

Although it is a compiler implementation decision – it may help to think of a null as a variable with no address assigned in the symbol table.

Wrong; your driveway can contain 0 or 1 Honda Civics. Since string is actually the composite type non-nullable-string|Null (note that Ceylon gets types right here by requiring string? for such a beast), null is a string.

A ‘type’ is a way to interpret values in memory. Implicitly this means a length and the imposed structure/meaning over that length. What do I make of 011101110001110? That’s the ‘type’ I impose on these bits. Since there is no memory allocation for null, there is no length, no bits, and no reason or meaning to imposing any type; ergo nothing null IS a string. I think this combines Eric’s driveway, the ‘meaninglessness’ argument, and the ‘go back to CompSci 101’ comment. It isn’t philosophy, it’s how digital computers work. Abstract it away or not, that’s what people like Eric decide for a given language. But, guilty myself, the ‘bad thing’ we do is use ‘null’ as an out-of-band flagging ‘value’ (usually asking “Should I take action on this parameter or not?”). Explicit nullable types (? types) really don’t do much more than make this ‘bad thing’ more respectable. Consider IsMissing for parameters in VB6…now that’s a little more honest than the ‘?’ types and null solution for flagging. Would it really be clearer or worth the overhead to develop a separate class with a flagging field when all I need is that single out-of-band flag? If so, then there’s nothing stopping me…or, for VB people, Nothing isn’t stopping me!

No, “how to interpret values in memory” isn’t called “type”. It’s called “data representation”. Those two concepts are tightly linked, but they’re not the same. I can use two completely different data representations for the values of the same type in the same program. Like, let’s look at struct Point { int x; int y; }: I can store it as three ints (the first for number 1, the second for x, the last for y), or as an int and two floats (the int for number 2, the first float for distance from the origin, and the last float for the azimuth). But in my programming language, I wouldn’t have any way to determine how the value of type Point would be stored: there is no sizeof, no pointers, nothing to give me a peek at the actual byte layout.

And since type is a purely syntactical thingie, ascribed to syntactical phrases, I have only one type here: struct Point. With two different representations on the byte level, but that isn’t my problem as a programmer, that’s the problem of the compiler.

I see what you are saying, and I agree if by ‘type’ you absolutely mean the syntactic, or symbolic, entity, such as ‘String’ per se. Maybe we just come from different backgrounds on the definitions, though. I see data representation as the way real-world data are stored. This matches your Point example, since I can choose different coordinate systems to represent what is a point in (mathematically speaking) the real world. But while a compiler could represent instances regarded as of the symbol Point in different coordinate systems at random, then, say, emit IL/assembly to marshal among the different instances, that would seem like an odd way of doing things. Maybe I’m too C++ on this, but consider any array in C++ where pointer math is done based on sizeof, so for an array to be of any (value) type, then that specific type must have a defined size, which would make multiple different union-like same-size representations of instances regarded as of the same symbol Point seem even odder.
Don’t get me wrong, I love your abstract-down viewpoint, but I’d stick to my guns to say that, given perhaps another word other than ‘type’, at some point a compiler or runtime (or CPU) has to have a single possible interpretation of a bit sequence, and that if the representative of that interpretation represents the concept ‘null,’ then there is no meaning in saying that the representative IS of that interpretation, since there is nothing there to interpret.

First, let’s take a look at std::string. If you declare std::string x("abcdefg"), 8 bytes will be allocated on the heap, and x will hold the pointer to them. But if you declare std::string y("ab"), then no heap allocation takes place, and y contains "ab" right in it. It’s a pretty common implementation.

Second, tables in database, and how VARCHAR/BLOBs are stored. I am pretty sure the trick I described above is used there.

And third, have you seen languages where strings and numbers cast each into other so easily and automatically that sometimes it makes you feel uneasy (PHP, Javascript)?

So using two representations of one type at the same time under the hood is not something unheard of. It’s just thought of along the lines of “ad-hoc optimizing”, mostly.

The following snippet works perfectly:
static void test(object x)
{
    string s = (string)x;
}
static void Main(string[] args)
{
    KnightMoves movesMain = null;
    // has to be assigned a value before C# allows it to be passed to test;
    // too dumb to realize it is the same value the system would give it to begin with.
    test(movesMain);
}

Here a null KnightMoves (a custom ref type I created) is cast to string without complaint. I can’t say:
string s = (string)movesMain;
in Main because it knows it is not compatible.
int[] x = new int[2];
test(x);
is allowed and blows up because the types are incompatible. If you didn’t put the = in the statement it also would work fine. Null is the same for everyone, same location, no information on type whatsoever.
You can’t put a Subaru in the neighbor’s empty driveway because it’s not allowed. (Well, you can, but that’s because THAT rule is unenforceable.)

I think a good way to put this is what Alfred North Whitehead would refer to as the “fallacy of misplaced concreteness”, or reification. Null/nothing isn’t also something. It’s impossible to make statements about what’s in the drawer (other than “is there nothing in the drawer”, i.e. a null check) if there is, in fact, nothing in the drawer, because nothing is not something.

So “there is a sock in the drawer” is not a statement (which also implies it’s neither true nor false) if the drawer is empty, did I understand you?

I was taught that “(forall a in X) P(a)” is true when X is an empty set: that’s how all unicorns are pink, and that’s also why the empty set is a subset of any other set. And if the example with unicorns is of no use, the example with the empty set is rather important and I doubt many mathematicians would like to give up on it.
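As it happens, LINQ follows the same convention: Enumerable.All returns true for an empty sequence. A tiny sketch (AllPink is a made-up helper):

```csharp
using System;
using System.Linq;

static class VacuousTruthDemo
{
    // "Every unicorn is pink" over whatever unicorns we are given.
    public static bool AllPink(string[] unicornColours) =>
        unicornColours.All(c => c == "pink");

    static void Main()
    {
        Console.WriteLine(AllPink(new string[0]));              // True: no counterexample exists
        Console.WriteLine(AllPink(new[] { "pink", "white" }));  // False
    }
}
```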

I think you misunderstood, or I’m misunderstanding you in turn. 😛 All I’m trying to say is that nothing is not a thing – it is, by definition, nothing, so you cannot compare it to something reasonably. “Is nothing a sock?” is not even a sensible comparison.

You are correct to point out that I badly misspoke though – it was just confusing, using the words “statements” and such. I’m not trying to bring in hardcore predicate logic here, just basic, everyday reasoning. 🙂

If nothing is in the drawer, the statement “there is a sock in the drawer” is false, of course!

You’re confusing the value type vs. the declaring type. The “is” operator tests the value, not the declared type. As has been noted on StackOverflow, if you want to test the declared type, use something like this:
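The snippet the comment refers to is not reproduced here. As a stand-in, and not necessarily what was posted on StackOverflow, one common approach is a generic helper that lets the compiler capture the declared type; DeclaredTypeOf is a hypothetical name.

```csharp
using System;

static class DeclaredTypeDemo
{
    // T is inferred at compile time from the declared type of the argument,
    // so the result does not depend on whether the value is null.
    public static Type DeclaredTypeOf<T>(T value) => typeof(T);

    static void Main()
    {
        string s = null;
        object o = "test";

        Console.WriteLine(DeclaredTypeOf(s));  // System.String
        Console.WriteLine(DeclaredTypeOf(o));  // System.Object
        Console.WriteLine(s is string);        // False: "is" looks at the value
        Console.WriteLine(o is string);        // True
    }
}
```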

And I thought this was more an implementation issue – rather than some strict definition of assignment compatibility or type membership – because the only way the managed runtime knows the type of an object is by traversing the reference pointer to the object instance and accessing the “invisible” TypeHandle field. (see http://msdn.microsoft.com/en-us/magazine/cc163791.aspx for details).