Scala pre-SIP for your consideration

22 replies

Sat, 2009-01-03, 20:34

Andrew Forrest

Joined: 2008-12-20,

Hi,

I’d been thinking through how an implementation of Scala might work with ‘null’ removed… having been attracted by non-nullability in Nice and other languages (the pure functional languages, and Spec# and F#).

My main idea was that null in Java should equate to None in Scala. The largest difficulties are in ensuring that object constructors cannot leave any null references lying about, and deciding what to do about newly-allocated arrays of references. I think I have a solution to these, and I've written it up as a pre-SIP, here:

http://dysphoria.net/scala/sip-nulls.xhtml

Be interested to hear what you think. Undoubtedly there are areas which could be clearer, and undoubtedly there are some corner cases which I haven’t considered.

If you’re interested, I blogged about a previous version of this idea here: http://dysphoria.net/2008/11/22/removing-nulls-from-scala-some-thoughts/ It's less thought-through (and in particular has a more awkward approach to arrays), but it is written more chattily, and possibly better explains my line of thought.

Best regards,
--Andrew Forrest

Andrew Forrest wrote:
>
> Hi,
> ...
> My main idea was that null in Java should equate to None in Scala. The
> largest difficulties are in ensuring that object constructors cannot
> leave any null references lying about, and deciding what to do about
> newly-allocated arrays of references. I think I have a solution to
> these, and I've written it up as a pre-SIP, here:
>
> http://dysphoria.net/scala/sip-nulls.xhtml
>
> Be interested to hear what you think. Undoubtedly there are areas
> which could be clearer, and undoubtedly there are some corner cases
> which I haven’t considered.
> ...
> --Andrew Forrest
>

I'm surprised nobody else responded so far. I think most people would agree
that "null is annoying" or even "null is evil". However, there is a price to
pay for introducing this feature, and in practice null isn't such a big
problem in Scala, as we already try to avoid it where possible. On the other
hand, removing null would make the language easier. And of course Nice shows
us that it can be done in a JVM language.

It seems you have done a lot of research, and I really *hope* it would work
that way, but I don't have the experience to tell whether this solution is
feasible. If it is, breaking compatibility is fine with me
(especially if we take the step-by-step approach outlined in the SIP).

Andrew,

Like you, I intensely dislike null and would like to see it disappear from languages as a whole. But I think that your proposal would cause a ton of problems for Scala: for existing Scala code, Scala/Java interoperability, and performance:

Existing Scala code supports null. While it is discouraged, the use of null exists today. About a year ago, there was some discussion about banning null from Scala; there were more than a dozen places where null elimination cascaded into a lot of work. There is now a year's more legacy Scala code, and scalac and the Scala libraries are no longer the largest single Scala code base. Breaking the ever-growing code base would be problematic.

While you addressed some of the Java interoperability issues, I don't think you addressed all of them. Most notably, the Array[T <: AnyRef] issue. This would have to turn into Option[Array[Option[T <: AnyRef]]]. Further, it would make publishing Scala traits into the Java world much more difficult in terms of trusting that the Java code would honor the non-nullable nature of parameters sent to Scala code.

The cost of boxing/unboxing code as it crosses the Scala/Java barrier is non-trivial, both from a computational and a code-bloat perspective. In Lift and in many of my apps, I make use of Java libraries. If I had to write Some(x) for every parameter I sent to Java-land, it would be pure hell. If there's an object-creation (Some) penalty per parameter sent to Java code, Scala gets much, much slower.

I appreciate and share your sentiment about null. I recognize that you did a ton of work putting together the SIP, but I think it's not practical and the gains I see for the SIP are small compared to the on-going costs.
Thanks,
David

On Sat, Jan 3, 2009 at 11:34 AM, Andrew Forrest <andrew [at] dysphoria [dot] net> wrote:


David,

Never say never :) Thanks for your comments. To address your points (I've broken some of them out into multiple bullets):

Yes, absolutely, it is a big (proposed) change. However, I don't think that alone is an argument against it. Other programming languages have gradually deprecated, then removed features in the past. Certainly, given the magnitude of the change, I would expect, if implemented, that it would be a gradual migration, and at some point during it the language would have to go up a major version number.

I'm not entirely sure what you mean by the array issue. Yes, a (nullable) array of (nullable) references on the Java side could be interpreted as an "Option[Array[Option[T <: AnyRef]]]" on the Scala side. The proposal explicitly allows for 'missing' values in arrays of non-nullable references, though, so probably (though I did not address the default interpretation of Java array types explicitly) the more natural interpretation would be "Option[Array[T <: AnyRef]]"... or even "?Array[T <: AnyRef]". If the Java code 'promises' not to return a null ref to the array, or not to expect one, you'd remove the outer Option, so you'd be back to "Array[T <: AnyRef]".

'Trusting' Java code: If your method does not accept nulls, and Java passes you a null parameter, Scala will throw a NullPointerException before any of your method code is executed. (Which is also what the standard Java library methods do if they are passed an unexpected null.) If you want to allow missing values from Java, you must declare your parameter as an Option[T] instead of T. Scala code could pass 'None'; Java code could pass 'null'.

Having to write 'Some(x)' everywhere you want to pass a reference type parameter to a Java method which accepts nulls: Part of the proposal was an implicit conversion from T -> Some[T]. (I admit that it was buried in the middle somewhere!) That should ease the pain somewhat. Obviously the other way around, when you are ACCEPTING an Option[T] from Java and want to extract the T, you DO have to write the code to do the extraction... but then Java may have provided null, so you're only exchanging a null check for a None check.
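A sketch of such a conversion in today's Scala (the object and method names here are mine, not the pre-SIP's, and the "Java-like" method is a stand-in):

```scala
object AutoSome {
  import scala.language.implicitConversions

  // The conversion Andrew describes: any non-null reference lifts to Some
  // automatically, so callers need not write Some(x) at every call site.
  implicit def liftToSome[T <: AnyRef](x: T): Some[T] = Some(x)
}

object JavaShim {
  // Stand-in for a Java method whose nullable String parameter has become
  // Option[String] under the proposal:
  def describe(s: Option[String]): Int = s.map(_.length).getOrElse(-1)
}

object Demo {
  def main(args: Array[String]): Unit = {
    import AutoSome._
    println(JavaShim.describe("hello"))  // the String lifts to Some("hello"): prints 5
    println(JavaShim.describe(None))     // an explicit "missing" value still works: prints -1
  }
}
```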

Boxing and unboxing....

I disagree that this would make Scala less efficient. In fact, I think it would make Scala more efficient. Assuming we're talking about reference types here (for simplicity):

The unboxed representation of Some(x), (e.g., as stored in a variable of static type 'Option', or when passed to or received from Java code) would have exactly the same JVM representation as 'x'. So in....

var x = "Hello world"
var some = Some(x)
var an: Any = some

'x' and 'some' would have identical JVM type and implementation (though different static types). All 'Some()' does is change the Scala static type. It doesn't produce any bytecode.

'an' would be boxed. That is, in the JVM implementation it would allocate a new heap object of type "scala.Some". Of course, this is no more of an overhead than you expect when casting an Int, Boolean or Long to Any.

When you pass 'Some(thing)' to Java code, it remains unboxed, and is interpreted by Java as just a reference to 'thing'. When you pass 'None' to Java, it remains unboxed and is interpreted by Java as 'null'.

I expect that most idiomatic uses of Option would remain in the unboxed representation, and hence improve the efficiency of Scala code.
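The unboxed representation Andrew describes can be sketched in today's Scala with a value class (a feature that postdates this thread; `Maybe` and its members are hypothetical names): the wrapper erases to the underlying reference, the empty case is carried as null, and no per-value allocation happens:

```scala
// A value class erases to its single field, so Maybe("x") compiles to just the
// String reference and Maybe.empty is represented by null at runtime. (It still
// boxes in generic contexts, mirroring the Int/Long-to-Any overhead above.)
final class Maybe[+A >: Null <: AnyRef](val ref: A) extends AnyVal {
  def isEmpty: Boolean = ref == null
  def get: A =
    if (ref == null) throw new NoSuchElementException("Maybe is empty") else ref
  def getOrElse[B >: A](default: => B): B = if (ref == null) default else ref
}

object Maybe {
  def apply[A >: Null <: AnyRef](a: A): Maybe[A] = new Maybe(a)
  def empty[A >: Null <: AnyRef]: Maybe[A] = new Maybe[A](null)
}
```

Here `Maybe("Hello world").getOrElse("missing")` evaluates to `"Hello world"` and `Maybe.empty[String].getOrElse("missing")` to `"missing"`, with no Some allocation in either direction.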

I was very aware that this proposal, if implemented, would have a large up-front cost (in terms of migration). However, if specified properly it should have little or no 'ongoing cost'. In fact, it aims to provide a tangible benefit for Scala programmers whose code has to talk to Java!

Best regards,

--Andrew

On 4 Jan 2009, at 10:32, David Pollak wrote:


Thanks for the thoughtful and carefully written proposal! I agree that
it would be good to do something about eliminating nulls. I can also
see David P.'s point though: Changes that break a lot of code would
have to go into a different branch of the language (sort of like
Python 3000). Such forking is always expensive, and should not be done
lightly.

I have been thinking for a while now about a more gradual approach.
The idea would be to have a trait NotNull that, when mixed in, would
guarantee non-nullness. So the rule would be that `Null' is a subtype
of any class or trait inheriting from AnyRef, unless that trait also
inherits from NotNull.

In any case we have the problem of how to ensure that fields and array
elements of NotNull type are initialized. It would be nice if we could
do this statically. You have elaborated one approach; for another
approach to ensure initialization statically see:

However I fear that this is still too experimental and that it would
be too limiting in practice.
So a simpler alternative would be to check for initialization holes
dynamically. It comes in handy that we have just excluded `null' from
the set of legal values, so we can treat it as `uninitialized'.
Dereferences of fields and array elements of NotNull type would have
to be tested dynamically for non-nullness. If a null in such a field
is detected, it means that the field was not initialized, so an
UninitializedFieldException (or something like that) would be raised.
A static analysis like the one you propose might still be useful, as
an optimization (dynamic checks can be eliminated) and to give
(optional) feedback to the programmer. But I would be doubtful about
burning this into Scala's type system.
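A hand-written sketch of this dynamic scheme (all names here are hypothetical; under the proposal the compiler would generate the guards):

```scala
// null doubles as "not yet initialised", and every read of a NotNull field is
// guarded. A static analysis could later elide guards it proves redundant.
class UninitializedFieldException(msg: String) extends RuntimeException(msg)

class Account {
  private var _owner: String = _   // starts as null, i.e. "uninitialised"

  def owner: String = {
    val v = _owner
    if (v == null)   // compiler-inserted guard in the proposed scheme
      throw new UninitializedFieldException("Account.owner read before initialisation")
    v
  }

  def owner_=(s: String): Unit = {
    require(s != null, "cannot store null in a NotNull field")
    _owner = s
  }
}
```

Reading `owner` before the first assignment raises the exception; after `a.owner = "Ada"`, reads succeed normally.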

I am still unsure to what degree we might merge this with the Option
type (maybe as you propose, maybe not at all), and what other type
inference mechanisms we might need.

I am also unsure what to do with the remaining types that do not
inherit from NotNull (I assume that would typically be Java types). Do
we still allow field and method selection on expressions of such
types? If we do, we disturb things the least, but we forego many of
the benefits of non-null checking. Another possibility would be to
issue a warning, unless the user explicitly rules out that `x' might
be null, for instance by calling the `get' method:

x.foo      // would issue a warning that x is possibly null
           // (maybe that warning can be turned off)
x.get.foo  // would issue no warning. Programmer has stated that x
           // is not null.

These are at present only vague ideas, so please don't take them as a
design yet. They just show what the state of our deliberations is.

I have been thinking for a while now about a more gradual approach. The idea would be to have a trait NotNull that, when mixed in, would guarantee non-nullness. So the rule would be that `Null' is a subtype of any class or trait inheriting from AnyRef, unless that trait also inherits from NotNull.

I’d always thought of non-nullity as a property of the reference rather than the type itself. As a programmer I’d like to be able to declare a non-nullable java.lang.String, for example.

How would the programmer decide what types should usefully be non-nullable? (Or would ALL Scala classes/traits be non-nullable unless they inherit from Java classes?)

In any case we have the problem how to ensure that fields and array elements of NotNull type are initialized. It would be nice if we could do this statically. You have elaborated one approach; for another approach to ensure initialization statically see:

I hadn't read that. Interesting. It has some similarities to the one I proposed: my (method) annotation @beforeConstructor[A] corresponds to their type annotation T^RAW(A). However, it deals more comprehensively with passing 'this' around from the constructor by putting annotations on types rather than instance methods. Does get quite hairy in the details, though (as does the one I proposed...)

So a simpler alternative would be to check for initialization holes dynamically. It comes in handy that we have just excluded `null' from the set of legal values, so we can treat it as `uninitialized'. Dereferences of fields and array elements of NotNull type would have to be tested dynamically for non-nullness. If a null in such a field is detected, it means that the field was not initialized, so an UninitializedFieldException (or something like that) would be raised.

I'm beginning to prefer this idea. There are a lot of problems in proving that object fields are definitely assigned (not least the Java Memory Model, which has special accommodations for ensuring that 'final' fields are initialised after the constructor exits, but which doesn't let you do the same thing for non-final fields, short of—as far as I can see—erecting your own memory fences before reads of fields).

Having accessor methods check and throw an exception at runtime (instead of proving it at compile time) would be conceptually simpler and would put fewer restrictions on constructors (which is, I think, the biggest backwards-compatibility headache with the static approach). It essentially pushes the responsibility for initialising fields (and dealing with concurrency issues) back onto the programmer.

As you say, static analysis might be able to prove that some such checks can be omitted. The language might only insist that all non-null fields are 'definitely assigned' (in the Java sense) within the constructor, but without trying to stop up all the holes through which unassigned values can escape.

I am still unsure to what degree we might merge this with the Option type (maybe as you propose, maybe not at all), and what other type inference mechanisms we might need.

See, that's the aspect of it I'm most keen on—merging Option and 'nullable reference'! It seems to me that that's the biggest 'gap' between the Scala and the Java type systems.

I am also unsure what to do with the remaining types that do not inherit from NotNull [...] Another possibility would be to issue a warning, unless the user explicitly rules out that `x' might be null, for instance by calling the `get' method:

A bit like an Option then :)

It's great to hear that you're looking at this area. I look forward to hearing any other comments on it.

Best regards,
--Andrew Forrest

> I'd always thought of non-nullity as a property of the reference rather than
> the type itself. As a programmer I'd like to be able to declare a
> non-nullable java.lang.String, for example.

No problem: String with NotNull.

> How would the programmer decide what types should usefully be non-nullable?
> (Or would ALL Scala classes/traits be non-nullable unless they inherit from
> Java classes?)
>
No, that's up to programmers. If they are smart and have few
interoperability restrictions, they would make their root types inherit
from NotNull and be done with it. The expectation is that you would only
do this with the root classes of your application and with key Java types.
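In today's Scala this reads as follows (a sketch with a locally defined stand-in trait; real static enforcement would need the compiler support discussed here, so a runtime check stands in for the proof the compiler would supply):

```scala
trait NotNull  // local stand-in for the marker trait Martin describes

object NonNullBoundary {
  type NonNullString = String with NotNull

  // Under the proposal the compiler would prove non-nullness statically; in
  // current Scala one can only simulate it with a checked cast at the boundary.
  def assumeNotNull(s: String): NonNullString = {
    require(s != null, "null escaped into non-null code")
    s.asInstanceOf[NonNullString]  // erases to String, so the cast succeeds
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val s: NonNullBoundary.NonNullString = NonNullBoundary.assumeNotNull("hello")
    println(s.length)  // statically non-null String: prints 5
  }
}
```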


> I'd always thought of non-nullity as a property of the reference rather than
> the type itself. As a programmer I'd like to be able to declare a
> non-nullable java.lang.String, for example.

No problem: String with NotNull.

Wouldn't be viable without better syntax.

The syntax should ideally be the same amount of work as for nullable types.

Perhaps better to invert the idea.

null is the subtype of all types inheriting from the Null-trait

so: String with Null can be assigned to a String or null

From my perspective, the earlier proposed syntax of "?" and "!" would be sufficient:

String? = String with Null
String! = String (with NotNull)

But typing "String with NotNull" all the time would be boilerplatey.

Cheers,
Viktor


>
> The syntax should ideally be the same work as nullable types.
>
> Perhaps better to invert the idea.
>
> null is the subtype of all types inheriting from the Null-trait
>
> so: String with Null can be assigned to a String or null
>
No, this would not work. The `with' connective produces subtypes, not
supertypes. You can't fiddle with that without breaking the type
system. You'd need a new connective

String or Null

But that's a major addition to the type system.
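A small demonstration of the point (stand-in names): `with` builds an intersection, which only ever narrows a type, so it cannot make a type accept extra values such as null:

```scala
trait NullMark                          // stand-in for the proposed Null trait
class Base
class Marked extends Base with NullMark

object Demo {
  def main(args: Array[String]): Unit = {
    // An intersection type accepts only values with BOTH types: it is a subtype.
    val ok: Base with NullMark = new Marked
    // val bad: Base with NullMark = new Base  // does not compile: lacks NullMark
    // So `String with Null` could never also admit null; admitting extra values
    // needs a supertype connective, i.e. a union such as `String or Null`.
    println(ok.isInstanceOf[NullMark])  // prints true
  }
}
```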

> From my perspective, the earlier proposed syntax of "?" and "!" would be
> sufficient:
>
> String? = String with Null
> String! = String (with NotNull)
>
One can think of doing that, but I'd rather get the fundamentals right first.

Of course you could, but that would lead to all code samples looking different typewise, which would hurt adoption.


Of course the mechanics are very important, but without proper syntax they won't get used. :/

Don't get me wrong, I'm _very_ much interested in non-nullness, but for me it's important that the solution is a good one.

Martin,

I was thinking about the NotNull trait... I guess the main advantage of doing non-nullability like that would be that you dodge around backwards-compatibility problems; the only programmers affected by it would be the ones who chose to use it.

However, it seems to me that making 'non-nullability' a property of the thing pointed to is quite problematic. What I mean is that NotNull is an arbitrary extra property, unrelated to the other information held in the type. "Not being null" is not an inherent property of the problem domain. For example, as a programmer, I can decide if my class should be Ordered, or even Serialisable, but it seems an odd programming decision to say "pointers to this class (versus pointers to Strings or Ints or HashMaps) can never be null".

On the other side, as a user of classes, there is no logical way to guess if a type implements it (as far as I can see). And presumably the Scala standard library cannot implement it across the board, for fear of breaking almost all existing code.

So the user of the types would end up having to put a "with NotNull" all over the place, just to ensure that references are non-nullable (and to document the code). Even with syntactic short-cuts (like writing "T!" for "T with NotNull"), it seems like you're putting the pain onto the user... as well as adding boilerplate for implementers of types, who feel they should inherit from NotNull for the sake of good form. Or at least making them face a (potentially difficult) decision about who is allowed to use their class: only 'new' Scala code, or 'old' and 'new' Scala code.

(The other problem of course is that "String" has most of the same methods as "String with NotNull", so "var s: String = null; s.length" perfectly satisfies the type system but is guaranteed to throw a runtime exception.)
But it occurs to me, what about using a NoNullsAllowed trait--or some other kind of annotation--to indicate that uses of null WITHIN the class/trait are disallowed?

Better still---since Scala files commonly contain many small classes---you could declare it on the compilation unit itself, so it becomes a compiler pragma affecting the whole file. ('requires "nonnull"'? That 'requires' keyword doesn't seem to be doing anything :)
Essentially, Scala files with the 'nonnull' pragma would have a different interpretation of the type system than files without the pragma. For older files, 'null' would be an instance of any T <: AnyRef, as before. For newer files, 'null' would be disallowed. One could compile a Scala program consisting of newer, pristine, non-nullable parts and older, legacy, null-using parts.

There's the obvious complication of allowing the two to interoperate. So something like:

* When viewed from 'old', nullable Scala, 'newer', nonnull Scala would appear to have a 'with NotNull' annotation on most of the types declared on its interfaces;
* When viewed from 'new' Scala, 'older', nullable Scala would appear to have 'Option' (or 'Nullable') wrappers round the types on its interfaces.

They would be the same language (and share practically identical type systems), but the default interpretation of a particular type definition---and the rules of what to allow the program to do with a potentially-null reference---would differ.
------8<----------8<----------8<----------8<----------8<----
sealed abstract class Nullable[+T <: AnyRef] {
  def isNull: Boolean
  def get: T
}

object Null extends Nullable[Nothing] {
  def isNull = true
  def get = throw new NullPointerException()
}

class NotNull[T <: AnyRef](reference: T) extends Nullable[T] {
  def isNull = false
  def get = reference
}
------8<----------8<----------8<----------8<----------8<----

That way, new code for old systems could be written without nulls (as far as possible), and fresh-start Scala projects could be written without The Curse of Null at all---and wouldn't have to suffer the compatibility penalties for it.

It could also avoid a Great Language Fork. Null-using code could be supported indefinitely, so the body of existing Scala code could remain largely untouched.

I don't know how hard it would be from a compiler standpoint, but I like the idea of different code "zones" (perhaps on a package-by-package basis) such that there's a non-null zone and that zone adheres to the concepts in your SIP, a null-accepted zone which is the existing Scala mode, and a DMZ which the programmer guarantees is non-null, but the compiler does not enforce (this allows for easier bridging between worlds.)


On Tue, Jan 6, 2009 at 2:12 PM, Andrew Forrest wrote:
> There's the obvious complication of allowing the two to interoperate. So
> something like:
> * When viewed from 'old', nullable Scala, 'newer', nonnull Scala would
> appear to have a 'with NonNull' annotation on most of the types declared on
> its interfaces;
> * When viewed from 'new' Scala, 'older', nullable Scala would appear to have
> 'Option' (or 'Nullable') wrappers round the types on its interfaces.

I quite like the sound of this, but how does it pan out for nullable Java?

I don't know how hard it would be from a compiler standpoint, but I like the idea of different code "zones" (perhaps on a package-by-package basis) such that there's a non-null zone and that zone adheres to the concepts in your SIP, a null-accepted zone which is the existing Scala mode, and a DMZ which the programmer guarantees is non-null, but the compiler does not enforce (this allows for easier bridging between worlds.)

Yeah, I think though that a DMZ (fun as the idea is :) isn't all that necessary... I'd say that on the null-allowed side, it's up to the programmer to Do The Right Thing. If it tries to pass nulls to a non-null bit of code, that will just fail--NullPointerException--at runtime. (That is, the non-null side of things has compiler-generated null checks on all received values---and internally, obviously, it's statically-checked.)
The tricky bits are how non-null Scala should intentionally pass a 'lack of a value' to, say, a Java method which actually allows 'null' and ascribes meaning to it. [My preference would be to use Scala 'Option' for that, but some alternate type-system shenanigans might work too.]
Cheers,
--Andrew
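A sketch of those compiler-generated boundary checks (the guard is written out by hand here, and `greet` is purely illustrative, not from the SIP):

```scala
// Sketch: a non-null method as seen by null-allowed callers. In the
// proposal the compiler would insert the guard automatically on every
// reference received from null-allowed code; here it is hand-written.
def greet(name: String): String = {
  if (name == null)                        // compiler-inserted guard
    throw new NullPointerException("name")
  "Hello, " + name                         // statically null-free from here on
}

greet("world")   // fine
// greet(null)   // fails fast at the boundary with NullPointerException
```

The point is that the null-allowed side pays nothing at compile time; a stray null simply fails at the border rather than deep inside the non-null code.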

Martin,

Was thinking about the NotNull trait... I guess the main advantage of doing non-nullability like that would be that you dodge around backwards-compatibility problems; the only programmers affected by it would be ones who chose to use it.

However, it seems to me that making 'non-nullability' a property of the thing pointed to is quite problematic.

What I mean is that NotNull is an arbitrary other property, unrelated to the other information held in the type. "Not being null" is not an inherent property of the problem domain. For example, as a programmer, I can decide if my class should be Ordered, or even Serialisable, but it seems an odd programming decision to say "pointers to this class (versus pointers to Strings or Ints or HashMaps) can never be null".
On the other side, as a user of classes, there is no logical way to guess if a type implements it (as far as I can see). And presumably the Scala standard library cannot implement it across the board, for fear of breaking almost all existing code.
So the user of the types would end up having to put a "with NonNull" all over the place, just to ensure that references are non-nullable (and to document the code). Even with syntactic short-cuts (like writing "T!" for "T with NonNull"), it seems like you're putting the pain onto the user... as well as adding boilerplate for implementers of types, who feel they should inherit from NonNull for sake of good form. Or at least making them face a (potentially difficult) decision about who is allowed to use their class: only 'new' Scala code, or 'old' and 'new' Scala code.
(The other problem of course is that "String" has most of the same methods as "String with NonNull", so "var s : String = null; s.length" perfectly satisfies the type system but is guaranteed to throw a runtime exception.)
But it occurs to me, what about using a NoNullsAllowed trait--or some other kind of annotation--to indicate that uses of null WITHIN the class/trait are disallowed?

Better still---since Scala files commonly contain many small classes---you could declare it on the compilation unit itself, so it becomes a compiler pragma affecting the whole file. ('requires "nonnull"'? That 'requires' keyword doesn't seem to be doing anything :)
Essentially, Scala files with the 'nonnull' pragma would have a different interpretation of the type system than files without the pragma. For older files, 'null' would be an instance of any T <: AnyRef, as before. For newer files, 'null' would be disallowed. One could compile a Scala program consisting of newer, pristine, non-nullable parts and older, legacy, null-using parts.
There's the obvious complication of allowing the two to interoperate. So something like:
* When viewed from 'old', nullable Scala, 'newer', nonnull Scala would appear to have a 'with NonNull' annotation on most of the types declared on its interfaces;
* When viewed from 'new' Scala, 'older', nullable Scala would appear to have 'Option' (or 'Nullable') wrappers round the types on its interfaces.
They would be the same language (and share practically identical type systems), but the default interpretation of a particular type definition---and the rules of what to allow the program to do with a potentially-null reference---would differ.
------8<----------8<----------8<----------8<----------8<----
sealed abstract class Nullable[+T <: AnyRef] {
  def isNull: Boolean
  def get: T
}

object Null extends Nullable[Nothing] {
  def isNull = true
  def get = throw new NullPointerException()
}

class NotNull[T <: AnyRef](reference: T) extends Nullable[T] {
  def isNull = false
  def get = reference
}
------8<----------8<----------8<----------8<----------8<----
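To make the sketch concrete, here is a small usage example (definitions repeated, with the abstract members and type parameters filled in so it compiles; `describe` is purely illustrative):

```scala
sealed abstract class Nullable[+T <: AnyRef] {
  def isNull: Boolean
  def get: T
}
object Null extends Nullable[Nothing] {
  def isNull = true
  def get = throw new NullPointerException()
}
class NotNull[T <: AnyRef](reference: T) extends Nullable[T] {
  def isNull = false
  def get = reference
}

// A caller must go through the wrapper to reach the reference,
// so the possibly-null case is visible in the type:
def describe(n: Nullable[String]): String =
  if (n.isNull) "(no value)" else n.get

describe(new NotNull("hello"))   // "hello"
describe(Null)                   // "(no value)"
```

Note that covariance (`+T`) is what lets the single `Null` object stand in for a missing value of any reference type, much as `None` does for `Option`.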

That way, new code for old systems could be written without nulls (as far as possible), and fresh-start Scala projects could be written without The Curse of Null at all---and wouldn't have to suffer the compatibility penalties for it.

It could also avoid a Great Language Fork. Null-using code could be supported indefinitely, so the body of existing Scala code could remain largely untouched.

On Tue, Jan 6, 2009 at 2:12 PM, Andrew Forrest <andrew [at] dysphoria [dot] net> wrote:
> There's the obvious complication of allowing the two to interoperate. So
> something like:
> * When viewed from 'old', nullable Scala, 'newer', nonnull Scala would
> appear to have a 'with NonNull' annotation on most of the types declared on
> its interfaces;
> * When viewed from 'new' Scala, 'older', nullable Scala would appear to have
> 'Option' (or 'Nullable') wrappers round the types on its interfaces.

I quite like the sound of this, but how does it pan out for nullable Java?

Ah, yep, see my previous message to David :) I sketched out a mechanism (in http://dysphoria.net/scala/sip-nulls.xhtml) which I quite like, whereby Scala represents Option and None, internally, as nullable references and null. So non-null Scala could call a Java method with a second optional (nullable) parameter as:

javaObject.javaMethod(parameter1, None)

There would be an implicit conversion AnyRef => Some[AnyRef] allowing you to write the following if you wanted to supply the second parameter (without having to type out "Some(parameter2)"):

javaObject.javaMethod(parameter1, parameter2)

Seems quite 'neat' (and idiomatic), but don't know how hard it would be to implement.
--Andrew
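A sketch of that implicit lift (names like `InteropSketch` and `javaMethod` are illustrative stand-ins, not from the SIP):

```scala
object InteropSketch {
  import scala.language.implicitConversions

  // Hypothetical implicit lift: a bare (assumed non-null) reference is
  // wrapped in Some automatically; None is passed through as-is.
  implicit def anyRefToSome[T <: AnyRef](ref: T): Option[T] = Some(ref)

  // Stand-in for a Java method whose second parameter may be absent:
  def javaMethod(p1: String, p2: Option[String]): String =
    p2 match {
      case Some(s) => p1 + "+" + s
      case None    => p1
    }
}

import InteropSketch._
javaMethod("a", None)   // explicitly supply 'no value'
javaMethod("a", "b")    // "b" is implicitly lifted to Some("b")
```

Whether the compiler could then erase the `Some` wrapper down to a bare (nullable) reference at runtime, as the SIP suggests, is a separate implementation question; the call-site ergonomics are the same either way.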

On Tue, Jan 6, 2009 at 3:12 PM, Andrew Forrest wrote:
> Martin,
> Was thinking about the NotNull trait... I guess the main advantage of doing
> non-nullability like that would be that you dodge around
> backwards-compatibility problems; the only programmers affected by it would
> be ones who chose to use it.
> However, it seems to me that making 'non-nullability' a property of the
> thing pointed to is quite problematic.

I disagree. A type describes a set of values. Subtypes have fewer
values than supertypes. `null' is a value. Hence, you can exclude
`null' by going to a subtype. The power of Scala's type system comes
from keeping to such basic truths without introducing special cases
unless absolutely necessary (usually they are necessary for the sake
of Java compatibility). It has been my experience that *every* special
case you make comes back to haunt you later, in ways you never would
expect.

The whole idea of separating a type from the way one `points' to it
feels like C-style thinking. It surely is a special case in a Java or
Scala environment.

> What I mean is that NotNull is an arbitrary other property, unrelated to the
> other information held in the type. "Not being null" is not an inherent
> property of the problem domain. For example, as a programmer, I can decide
> if my class should be Ordered, or even Serialisable, but it seems an odd
> programming decision to say "pointers to this class (versus pointers to
> Strings or Ints or HashMaps) can never be null".

Why? Say you are designing a class `Text'. It's a perfectly valid
question whether you want to accept `null' as a Text. You might, or
(more likely) you might not.

> On the other side, as a user of classes, there is no logical way to guess if
> a type implements it (as far as I can see).

Look at the supertypes?

> And presumably the Scala
> standard library cannot implement it across the board, for fear of breaking
> almost all existing code.
> So the user of the types would end up having to put a "with NonNull" all
> over the place, just to ensure that references are non-nullable (and to
> document the code). Even with syntactic short-cuts (like writing "T!" for "T
> with NonNull"), it seems like you're putting the pain onto the user... as
> well as adding boilerplate for implementers of types, who feel they should
> inherit from NonNull for sake of good form. Or at least making them face a
> (potentially difficult) decision about who is allowed to use their class:
> only 'new' Scala code, or 'old' and 'new' Scala code.

Right. But it's still better than simply invalidating all code that's
out there now, no?

> But it occurs to me, what about using a NoNullsAllowed trait--or some other
> kind of annotation--to indicate that uses of null WITHIN the class/trait are
> disallowed?
>

That would mean introducing modes in the type system. I strongly
resist such an idea. Besides, it is the nature of types that it is not
always clear where a computation is made. It's not at all clear to pin
down a subtype widening to a given compilation unit.

> It could also avoid a Great Language Fork. Null-using code could be
> supported indefinitely, so the body of existing Scala code could remain
> largely untouched.
>
But at the price of a permanent language fork within Scala. Not clear
that's a win to me.

But would Animal without Dog exclude nulls? After all, Null is a subtype of Dog, so if Chihuahua gets excluded then surely Null gets excluded... Except that's not just impractical (Animal without Dog with Null seems painful to type) but impossible given the state of Scala's type system, because if Null is excluded then Nothing should be excluded too, but Scala has no way to guarantee that no Exception will be thrown.

I guess the rule would be: C is a subtype of (A not B) iff there's a path from C to A that doesn't go through B. So Null is a subtype of (Animal not Dog) because there's a path from Null to Animal that goes through Cat (or through Horse, or heck, through Animal itself). Chihuahua's only path to Animal, however, is through Dog, so that gets excluded. (Mmm, this makes traits interesting...) Animal not Null, however, still excludes Null.

Anyway, I'm rambling. This seems interesting and more general than NotNull. But probably also significantly harder to implement.
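For concreteness, here is the hierarchy being discussed, with the proposed path rule spelled out in comments (the '(A not B)' type itself is hypothetical---today's Scala cannot express it):

```scala
// The class hierarchy from the discussion above:
class Animal
class Dog extends Animal
class Chihuahua extends Dog
class Cat extends Animal

// Proposed rule: C <: (A not B) iff some subtyping path from C to A
// avoids B. Applied to (Animal not Dog):
//   Cat       -- included: the path Cat -> Animal avoids Dog
//   Chihuahua -- excluded: its only path runs Chihuahua -> Dog -> Animal
//   Null      -- included: Null sits under every class, e.g. under Cat,
//                so a Dog-free path to Animal exists
```

The `Null` case is what makes the earlier worry concrete: subtraction types alone would not banish null unless one also wrote `Animal not Null`.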


Jorge Ortiz wrote:
> Forgot to mention: it might also lead Steve Yegge to complain that
> Scala has type type type types.
One may safely ignore most of Yegge's nonsense, but sadly, this advice is
not always heeded, to the detriment of his victims. It would be a shame
if the Scala language pandered to under-qualified commentators
(Yegge, Beust, van Rossum).

> supported indefinitely, so the body of existing Scala code could remain
> largely untouched.

But at the price of a permanent language fork within Scala. Not clear that's a win to me.

Any change to the language (or libraries) is a fork of sorts; there's the 'newer (hopefully) better' way of doing things and the 'older, we wish you'd stop doing it like this' way. I understand your concern about having two not-quite-the-same type systems in the one language, though.

What if the only difference between them was that in the 'new' type system, 'null' was not a bottom for AnyRef, but a simple value of type 'Nullable' (a value type---essentially similar in spirit to an Option---which can contain a reference to an object, or else be null)? Most Java methods take or return Nullable[SomeTypeOrOther], of course. The old type system would completely ignore 'Nullable' when testing whether types conform: to it, a SomeType[Nullable[SomeOtherRefType]] seems just to be a SomeType[SomeOtherRefType]. In the old type system, of course, 'null' is a bottom type (and the programmer can assign it to any T <: AnyRef, as currently). The pragma we talked about would then ONLY switch on or off 'visibility' of Nullable and the 'bottomness' of null.

Old code would be able to assign 'null' to any reference type; newer code would only be able to assign 'null' to a Nullable wrapper round the reference type. Old code would be able to accept 'null' from any reference type; newer code would only be able to accept 'null' where the interface wraps the type in a Nullable. Old code, significantly, wouldn't be able to see the Nullable wrappers, and therefore would continue to work completely unchanged---it's up to the programmer's discipline to enforce the nullable/non-nullable distinction. Newer code would be able to see the Nullable type, and would enforce the distinction between things-to-which-one-is-able-to-assign-null and things-to-which-one-is-not-allowed-to-assign-null.

Sadly, my brain is tired (and I see there are a couple of other emails on this thread too). I've thought through this variant a bit, but not completely, so it might be complete mince. Will work through a couple of examples when I have more time, if it would be useful.
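One way to picture the 'new' view from the message above (a sketch only; all names are hypothetical, and an ordinary wrapper class stands in for the proposed value type):

```scala
// Hypothetical: in 'new' code, null can only inhabit Nullable[T],
// never a bare T. Ref and NullRef are stand-in constructors.
sealed abstract class Nullable[+T <: AnyRef] {
  def toOption: Option[T]
}
final case class Ref[T <: AnyRef](value: T) extends Nullable[T] {
  def toOption = Some(value)
}
case object NullRef extends Nullable[Nothing] {
  def toOption = None
}

// A Java-ish signature as 'new' code would see it: the possibly-null
// result is visible in the type.
def lookup(key: String): Nullable[String] =
  if (key == "known") Ref("value") else NullRef

// 'New' code must unwrap explicitly; forgetting the check is a type error.
val found = lookup("known").toOption.getOrElse("default")
```

Old code, per the proposal, would simply see `lookup` as returning a plain (nullable) String, with the wrapper invisible to its conformance checks.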

> That would mean introducing modes in the type system. I strongly
> resist such an idea. Besides, it is the nature of types that it is not
> always clear where a computation is made. It's not at all clear to pin
> down a subtype widening to a given compilation unit.

Just reread that bit. So am guessing that "keep two modes, but just simplify how they work" is still not going to wash :)

On Tue, Jan 06, 2009 at 01:14:22PM -0800, David Hall said
> 3. Remove the runtime representation of Option[T] and None as classes
> -- Some(t) maps to T, None maps to null. Some(null) throws an
> IllegalArgumentException, I'm not sure what to do with Some(Some(t)) and
> its ilk.
> 4. Add an implicit conversion from T<:AnyRef to Option[T], with t!=
> null => Some(t), and None otherwise.
> 5. For calling a java api (if it can be detected?), allow the use of
> Option[T] anywhere a T is expected.
> 6. Let Some.unapply(t) return t, and None.unapply(null) return None.
> These two together should form an exhaustive pattern match.

I can't see how any attempt to change the representation of Option[T] to
a bare nullable reference would work. Imagine some code like the
following:
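Code of roughly this shape (with a hypothetical `doSomething` that ascribes meaning to null) makes the point:

```scala
// Hypothetical Java-style method that ascribes meaning to null:
def doSomething(arg: String): String =
  if (arg == null) "reset to defaults" else "use " + arg

// Two states that must remain distinguishable:
val chosen: Option[String] = Some(null)  // an argument WAS chosen: null
val noChoice: Option[String] = None      // no argument was chosen at all

// Under the encoding Some(t) => t, None => null, both values above
// collapse to a bare null, and this match can no longer be written:
val outcome = chosen match {
  case Some(arg) => doSomething(arg)     // deliberately calls with null
  case None      => "nothing to do"
}
```
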

Maybe doSomething(..) does something useful when passed a null value.
Any proposal that I've seen to optimize Option[T] in this way does not
allow this concept to be written. The bottom line is that there are
actually cases in which Some(null) is needed, which is very similar to
the problem you mention of nested options (i.e. Some(Some(x))).

On the other hand I think it would theoretically be possible to encode
Option[T <: NotNull] as a bare nullable reference as long as T is not
Option[_] (because then you hit the nested issue).