null from the language

I call it my billion-dollar mistake. It was the invention of the null reference in 1965. (Tony Hoare, 2009)
In Java (and similar languages), any reference type may hold the value null. null means ‘no object referred to’, and any attempt to dereference null causes an error (a NullPointerException, or NullReferenceException in .NET).

null is used for two purposes:
Scala provides a better, type-safe mechanism for the latter purpose, called Option (with subtypes Some, for present values, and None for missing ones). Idiomatic Scala code uses Option instead of testing for null values. Unfortunately, in order to maintain interoperability with Java, Scala also supports null. Consequently Scala is also prone to NullPointerExceptions.
This proposal addresses removing null from Scala entirely, while still maintaining the ability to use Java code and be called from Java code, with minimal boilerplate, while improving Scala’s runtime efficiency. It aims to require as few changes to the Scala language as possible.
For ease of exposition, the following conventional terms are used:

- AnyRef, and in Java a subclass of java.lang.Object.
- AnyVal and in Java any primitive type.
- Option, None or Some. (Note that we discuss below making Option a subclass of AnyVal.)
- null.

Note that the discussion here is centred on the JVM implementation of Scala. However, except where noted, it applies equally to the .NET CLR implementation. For ‘Java’, ‘JVM’ and ‘NullPointerException’, feel free to substitute ‘C#’, ‘CLR’ and ‘NullReferenceException’.
The Scala type Option (and the object None) has much in common with nullable reference types (and the value null), to the extent that in the absence of an Option type, nullable references can in many cases be used instead (and in Java code are used instead).
At the risk of stating the obvious:
| Use case | Option (Scala) | nullable (Java) |
|---|---|---|
| Declare an optional string value | var aName: Option[String] | String aName; |
| Assign a concrete value to a variable | aName = Some("Some value") | aName = "Some value"; |
| Assign the absence of a value to a variable | aName = None | aName = null; |
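As a minimal sketch of the Option column in present-day Scala (the object name OptionTableExample and the greeting method are illustrative assumptions, not part of the proposal), the three use cases can be written and consumed without any null test:

```scala
object OptionTableExample {
  // Declare an optional string value; it starts out absent.
  var aName: Option[String] = None

  // Consume the value without a null test: absence is handled by getOrElse.
  def greeting: String = "Hello, " + aName.getOrElse("stranger")
}
```

Assigning `Some("Some value")` or `None` to `aName` changes what `greeting` returns; no NullPointerException is possible on this path.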
This equivalence is not complete. Options are richer in that:

- Some(Some(value)) is valid. Moreover Some(None) is distinct from None.
- None is an ordinary object with methods. It does not throw an exception when dereferenced (but null does).

The keyword null and type Null are entirely removed from the Scala language. (Consequently, reference type variables and array elements no longer have a default value.)
Option is redefined as a value type (a subclass of AnyVal), with the default value None.
Further sections below address issues such as: initialisation of reference variables; array semantics; Java compatibility and migration of existing Scala code.
Option as a value type

We (re)define Option as a value type. This is so that:

- Option is not constrained by Java Object semantics, and does not need to maintain its object identity.
- Options must have a value.
- Option may have a default value (None, unsurprisingly).

Option is defined to have a boxed and an unboxed form. The unboxed form achieves compatibility with Java by representing an Option[RefType] as a nullable reference to RefType; the boxed form allows the additional rich semantics of the Scala Option type, and allows Options to be treated as normal objects including being cast to Any.
When an Option[RefType] is unboxed, None is represented as JVM null and Some(refVal) is represented as a plain JVM reference to refVal.
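The mapping between the boxed and unboxed forms can be sketched in present-day Scala. The object UnboxedOptionSketch and its two methods are hypothetical names used only for illustration; under the proposal the compiler would perform this conversion itself, and Scala source would never see the null.

```scala
object UnboxedOptionSketch {
  // Unbox: None becomes JVM null, Some(ref) becomes the plain reference.
  def unbox[A <: AnyRef](opt: Option[A]): A = opt match {
    case Some(ref) => ref
    case None      => null.asInstanceOf[A]
  }

  // Box: JVM null becomes None, any other reference becomes Some(ref).
  // Option.apply already implements exactly this rule.
  def box[A <: AnyRef](ref: A): Option[A] = Option(ref)
}
```

Because the two directions are inverses of each other for reference types, no information is lost when an Option[RefType] crosses the boundary in its unboxed form.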
Additionally, in a CLR implementation, an Option[ValueTypeExceptOption] is unboxed as a System.Nullable<ValueTypeExceptOption>.
The unboxed representation of Option[ValueType] must have a JVM default value which is interpreted as None. This is important for array semantics (below). Otherwise, the unboxed representation of an Option[ValueType] is not specified here. Potentially it could consist of two slots, (ValueType, Boolean). In the case of nested Options, Option[Option[…[BaseValueType]…]], the representation could consist of (BaseValueType, Int). Alternatively, it could be represented as a nullable reference to the boxed ValueType.
When boxed, an Option is represented as an instance of the JVM class Some[T] or as the JVM object None.
The implementation is generally free to choose boxed or unboxed representations, depending upon ease of implementation or efficiency. However, an Option is always boxed when it is cast to a type higher than Option. Conversely, an Option is always unboxed when a) it contains a reference type (in a CLR implementation, any type other than Option) and b):

- Array[Option[String]] is represented as the JVM type [Ljava.lang.String;

Because variables are not now allowed to hold the value null, and because, as a consequence, reference type variables do not now have a default value, reference type fields must be initialised to a definite value by the program.

This requires that:

- val and var);

Within a constructor (object/class body), every non-abstract field must be assigned an (initial) value. This rule is already enforced for ‘val’ variables, but we here extend it to ‘var’ variables too.
Variables need not explicitly be assigned a value; if the type has a default value, and if the program does not explicitly supply a value, it is implicitly assigned the default value for the type. Any such implicit assignments are considered to occur before the call to the superclass constructor.
All value types have a default value; no reference type does. [See Appendix B for a proposal to allow types other than value types to have default initial values.]
Any expression called by the constructor (that is, variable values, or expressions which form part of any statement within the object/class body, and not within a ‘def’), may only refer to variables previously assigned a value.
Any constructor expression may only call methods which refer to variables already declared before the call, and which call other methods which refer to already-declared variables, (and so on recursively). This is complicated by the presence of subclasses which may override the methods. We outline a mechanism of enforcing this below.
The initialisation sequence of an object of class C with super-class B and super-super-class A is as follows:

1. C early definitions
2. B early definitions
3. A early definitions
4. A constructor
5. B constructor
6. C constructor
Early definitions are already restricted such that they may not refer to uninitialised fields, nor call any methods (nor leak references to this).

We hereby change the language rules for constructors to also prevent them from reading uninitialised fields (and from leaking references to this), while maintaining some programmer flexibility, chiefly the ability to call (overridable) helper methods.
Note that in the diagram above, A’s constructor can safely read (at least) the variables initialised by all of the early definitions. B’s constructor can safely read (at least) the fields initialised by all of the early definitions, and all (concrete) fields inherited from A. C’s constructor can safely read (at least) the fields initialised by all of the early definitions, and all (concrete) fields inherited from A and B.
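To illustrate the hazard these rules are meant to rule out, here is a small sketch in present-day Scala (the class names Base and Derived are hypothetical). The superclass constructor calls an overridable method, and the override reads a subclass field before the subclass constructor has assigned it, so the JVM null becomes observable:

```scala
abstract class Base {
  // Calls an overridable method while Base's constructor is still running.
  val described: String = describe()
  def describe(): String
}

class Derived extends Base {
  // Assigned only when Derived's constructor runs, i.e. after Base's.
  val name: String = "derived"

  // Reads `name`; when invoked from Base's constructor the field is still unassigned.
  def describe(): String = "name=" + name
}
```

Under today's rules `new Derived().described` is the string "name=null". Under the rules proposed here, the call to describe() from Base's constructor would only be legal if describe() were suitably annotated and provably read no unassigned fields.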
We declare two method annotations @inConstructor and @beforeConstructor(T: Class). The first indicates that a method is called from within the current class’s constructor, possibly after some fields have been assigned, but before others have been. The second indicates that the method may validly be called before the constructor for class T has started (and therefore refers to none of T’s fields, except those assigned in early definitions and by super-class constructors).

Expressions within the constructor for class T, and any methods annotated @inConstructor (for T) must obey the following rules:

- @beforeConstructor(classOf[U]) where U >: T, or
- @inConstructor (for class U) where U is a strict superclass of T, or where U≣T and it can be statically proved that it only reads fields assigned before the first point at which the call is made, or
- @inConstructor annotation.
- @beforeConstructor(classOf[V]) where V >: T. (That is, an override of the method is more restricted, as it should not need to know the order in which its super-class initialises its fields.)
- this. (See next section.)

Any expressions within the call to the superclass constructor, and any methods annotated @beforeConstructor(classOf[T]) must obey the following rules:

- @beforeConstructor(classOf[U]) where U >: T, or
- @inConstructor (for class U) where U is a strict superclass of T, or
- @beforeConstructor(classOf[T]) annotation.
- @beforeConstructor(classOf[V]) where V >: T.
- this. (See next section.)

As a special case, if any abstract field in class T is read from a constructor expression in T, an expression within the call to T’s super-class constructor or from any method annotated @inConstructor for T or @beforeConstructor(classOf[T]), the first concrete implementation of the field must be assigned a value in an early definition. This could be achieved by annotating such abstract fields with a new annotation @assignEarly.
As a consequence of how the Java memory model treats constructors, constructors should avoid passing the object-under-construction to a scope visible to other threads. Otherwise they risk that other threads may see the object in a state other than the constructed state (for example, final fields may appear to change their value), thus violating class invariants. As we here specify an additional constraint, that reference variables never be JVM null, there is an additional risk with under-construction objects: that it becomes possible to observe the JVM null before it is replaced by the variable’s correct initial value.
Therefore we propose that passing the under-construction object outwith the scope of the class be disallowed. (Note that this change, more than any other proposed here, is likely to be the most onerous, and require most invasive changes to existing Scala programs.)
The constructor, and any method annotated with @inConstructor or @beforeConstructor, is disallowed from passing a reference to this to any method (or object constructor). This rule additionally disallows passing inner object or class instances which dereference their implicit reference to the class-under-construction.

Note that we allow references to closures and anonymous functions within the constructor (and special methods), so long as the closure body either a) follows the rules as regards referring to assigned variables, and not passing out this, or b) is not itself invoked from within the constructor.

Note that lazily-evaluated variables may be used to defer a ‘dangerous’ use of this until after the constructor has completed.
If an object needs to register a newly-constructed object, or store it in a global variable, this must be done after construction is complete. One possible pattern is:
    class MyClass protected (/* args */) {
      // Perform construction here. Cannot pass out ‘this’ or store it
      // in global state. (Constructor is declared as ‘protected’ to prevent
      // clients from calling it directly.)

      protected def initialize: this.type = {
        // Perform any post-constructor initialisation here.
        // CAN pass references to ‘this’, or register UI listeners.
        this
      }
    }

    object MyClass {
      // Clients should call this method to construct MyClass instances
      // (instead of using ‘new’ directly):
      def apply(/* args */) = new MyClass(/* args */).initialize
    }
…where new objects would be constructed by clients using the companion object, MyClass(/*args*/), instead of using new and calling the constructor directly.

Importantly, the constructor and the initialize method may be overridden independently by subclasses. When initialize is called the programmer knows that all of the constructors have run to completion.
It would be possible for the language to sweeten this construct with some syntactic sugar, but in the spirit of changing the language as little as possible, no such syntax is proposed here.
The AnyRef.finalize method allows an object holding (usually native) resources to release these resources when the object is garbage collected.

As explained elsewhere, if a constructor throws an exception, a finalize method called on that object could see a partially constructed object. If the object has not been fully constructed, some of its reference type fields could be unassigned (null in the underlying representation). However, the finalize method may legitimately need to access some of these fields.
This implies that:

- finalize method does not directly or indirectly throw JVM NullPointerExceptions, while somehow letting it inspect the entire state of the object so that it can perform its cleanup task; and
- finalize method may not allow the, possibly incompletely constructed, object to escape, in case it violates its class invariants. (In any case, it is regarded as bad form for the finalize method to resurrect an object once the garbage collector has marked it for disposal.)

The finalize method in AnyRef is annotated with @calledByFinalize. Methods with this annotation:

- @calledByFinalize annotation.
- this to other methods or constructors (because this may not be a properly constructed object). Consequently Scala objects are prevented from ‘resurrecting’ themselves after garbage collection.
- @calledByFinalize.
- scala.UninitializedFieldError (not NullPointerException), and it throws the exception when the variable is fetched (not when it is dereferenced).

We define a new method in Predef, getIfInitialized[T <: AnyRef](v: => T): Option[T]. This can be used to ‘wrap’ an access to a possibly-uninitialised reference field, v, returning None if it is uninitialised, and Some(v) if it is initialised. It could be (inefficiently) implemented as follows (but would ideally be an intrinsic function):
    def getIfInitialized[T <: AnyRef](v: => T): Option[T] =
      try { Some(v) } catch { case _: UninitializedFieldError => None }
We note here that it is always possible to subvert these safeguards via Java subclasses or superclasses of Scala objects. For example, a Java subclass of a Scala class with a method annotated @inConstructor could override that method to call another, arbitrary method on the class, breaking the contract, and potentially causing a NullPointerException to be thrown from within Scala code.
Given that the JVM class-verifier does not enforce Scala rules, this is impossible to prevent. In addition, Java code can already subvert the type-soundness of other parts of the Scala language.
The section above outlines rules to ensure that every instance variable has a defined (non-null) value whenever it is dereferenced. However, these rules cannot ensure that array elements have defined values when dereferenced.
Some non-nullable language proposals have specified that arrays be limited to nullable (in our case, Option) types. However, this poses a problem for value types; we should be able to create an array of value types (not necessarily Array[Option[ValueType]]), and we should like to avoid different restrictions for value and reference type arrays.
Also, there is much existing code which uses arrays, including most implementations of collection classes. We should avoid changing array semantics excessively.
Hence we modify array semantics slightly for reference-type arrays, allowing arrays of reference types to have ‘missing’ values. That is, an array is redefined, from a function defined over all of the elements [0, length), to a function which is allowed to be undefined at some indices. Indeed, an array of reference type is initially undefined at all indices. (A missing array element is naturally modelled in the underlying JVM implementation as a null value.)
For value-type arrays, newly allocated arrays initialise each element to the default value for the type, as currently. (For the avoidance of doubt, value-type arrays never have missing values. A value-type array is a function defined at all of its indices.)
This rule for initialisation of value-type arrays extends to arrays of Options. As mentioned above, arrays of Option[RefType] store their elements unboxed. A new instance of such an array has all of its elements initialised to None. (This is implied by the JVM representation.)
The following Array[A] methods are introduced or redefined:

apply(index:Int): A
isDefinedAt(index) would return true. If isDefinedAt(index) would return false, throw a NoSuchElementException.

update(index:Int, newValue:A): Unit
null.) Note that this implicitly marks the array cell at index as containing a value.

isDefinedAt(index:Int):Boolean
length, and if the representation does not have a JVM null at array slot index.

clear(index:Int): Unit
Option arrays, sets it to None.) For reference-type arrays, makes that array slot empty (such that isDefinedAt would return false). This method is necessary to allow implementers of collections classes to remove references to contained objects.

++[B >: A](that: Iterable[B])
this Array plus the elements of that Iterable. This is the same result as would be returned by Array.concat(this, that). Note that undefined elements will have been removed from the resulting sequence, so elements in the returned array may not appear at the same index as they did in the original this Array. This behaviour is subtly different from that in old, nullable Scala.

elements: Iterator[A]
isDefinedAt would return true. That is, for(e <- array.elements) yield e produces the same result as for(i <- array.indices) yield array(i)

indices: Iterator[Int]
isDefinedAt would return true.

map[B](f: (A) ⇒ B): Array[B]
Importantly, the result Array will have the same length as this array.

    def map[B](f: (A) ⇒ B) = {
      val result = new Array[B](this.length)
      for(i <- this.indices) result(i) = f(this(i))
      result
    }

zip[B](that: Array[B]): Array[(A, B)]
Importantly, the length of the result array is always the minimum of the lengths of the two source arrays, and the result contains missing elements where either of its two source arrays contained missing elements.

    def zip[B](that: Array[B]) = {
      val result = new Array[(A, B)](Math.min(this.length, that.length))
      for(i <- this.indices if that.isDefinedAt(i)) result(i) = (this(i), that(i))
      result
    }
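The proposed reference-array semantics can be approximated in present-day Scala with a wrapper class. This is only a sketch under stated assumptions: PartialArray is a hypothetical name, it covers just apply, update, isDefinedAt, clear, indices and elements, and it uses JVM null internally to mark a missing element, as the text above suggests for the underlying representation.

```scala
// Hypothetical model of a reference-type array that may be undefined at some indices.
final class PartialArray[A <: AnyRef](val length: Int) {
  // JVM null marks a missing element; it never escapes this class.
  private val slots = new Array[AnyRef](length)

  // True only for in-range indices whose slot currently holds a value.
  def isDefinedAt(index: Int): Boolean =
    index >= 0 && index < length && (slots(index) ne null)

  // Returns the element, or throws NoSuchElementException where isDefinedAt is false.
  def apply(index: Int): A = {
    if (!isDefinedAt(index)) throw new NoSuchElementException("no element at index " + index)
    slots(index).asInstanceOf[A]
  }

  // Storing a value implicitly marks the slot as defined.
  def update(index: Int, newValue: A): Unit = {
    require(newValue ne null, "null may not be stored")
    slots(index) = newValue
  }

  // Makes the slot empty again so the contained object can be garbage collected.
  def clear(index: Int): Unit = { slots(index) = null }

  // Only the indices (and elements) at which the array is defined.
  def indices: Iterator[Int] = Iterator.range(0, length).filter(isDefinedAt)
  def elements: Iterator[A] = indices.map(apply)
}
```

A freshly created PartialArray is undefined at every index; `arr(1) = "x"` defines index 1, and `arr.clear(1)` undefines it again.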
Having removed null from the Scala language, a Scala method (or function) with the signature (R1) ⇒ R2 where R1 and R2 are reference types

Moreover, when Scala code calls a Java method, R2 method(R1 p), where R1 and R2 are reference types, it will never pass a null argument, and cannot accept a null result.
As described above, nullable Java references to RefType may be treated as if they were of type Option[RefType], mapping Some(refVal) ↔ a non-null Java reference and None ↔ null.
However, Java methods may be specified as not accepting or not returning null (informally, in documentation, or more formally, using annotations such as @Nonnull). In this case we would like to be able to pass and receive non-nullable references to and from Java code without wrapping them in Options.
There are 4 cases:
In the absence of any @Nonnull annotations, the Scala compiler shall assume that a Java method parameter of a reference type R is nullable, and is therefore represented in Scala as an Option[R].
For example, Java method
File[] listFiles(FileFilter filter)
will appear to Scala code to have the signature:
listFiles(filter: Option[FileFilter]): Option[Array[File]]
An implicit conversion shall be provided in Predef from AnyRef to Option[AnyRef] to ease writing Scala code which calls Java code with nullable parameters.
implicit def anyRef2Option(ref: AnyRef) = Some(ref)
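In present-day Scala the same shape has to be written by hand. The following sketch (NullableInterop and its listFiles wrapper are hypothetical names, not part of the proposal) gives java.io.File.listFiles(FileFilter) the Option-based signature shown above: an absent filter is passed to Java as null, and a null result comes back as None.

```scala
import java.io.{File, FileFilter}

object NullableInterop {
  // Hand-written equivalent of the signature the proposal would synthesise:
  //   listFiles(filter: Option[FileFilter]): Option[Array[File]]
  def listFiles(dir: File, filter: Option[FileFilter]): Option[Array[File]] = {
    val javaFilter: FileFilter = filter.orNull // None is passed to Java as null
    Option(dir.listFiles(javaFilter))          // a null result becomes None
  }
}
```

File.listFiles returns null when the path does not denote a directory, so callers of this wrapper see None in that case instead of risking a NullPointerException.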
When Java code has a @Nonnull annotation on a method parameter, the parameter will appear to Scala code to be of an ordinary, reference type. The Scala compiler guarantees never to pass a null reference to such parameters. For example, the Java method:
void fillPolygon(@Nonnull Polygon p)
will appear to Scala to have the signature:
fillPolygon(p: Polygon): Unit
Conversely, Scala methods which return results of a type Option[RefType] appear to Java code to return ordinary nullable references of RefType. Java code will receive null when Scala returns None and will receive someValue when Scala returns Some(someValue).

Scala methods which return results of a type RefType also appear to Java code to return ordinary nullable references of RefType. However the Scala method is guaranteed never to return null to the Java code.
Note that if Java accepts a nullable Object reference (which Scala interprets as Option[Any]), and Scala provides an Option[ValType] value (where ValType may be a further Option), Scala type rules (specifically the fact that Option is covariant in its type parameter) imply that the Scala value be cast from an Option[ValType] to Option[Any] (which is represented in the JVM as a nullable reference to Any). Essentially in this case the contents of the Option must be boxed before passing them to Java code.

This corresponds to the .NET rules for boxing Nullable<ValueType> values; the inner ValueType value is boxed when casting to a (nullable) Object type.
When a Scala method is defined as taking a parameter of type Option[RefType], it will appear to Java to take a normal nullable reference to RefType*. Java code may safely pass null, which will appear to the Scala code as None.
When a Scala method is defined as taking a parameter of type RefType, it will also appear, to Java code, to take a normal (nullable) reference to RefType. However, the Scala code cannot accept null values. The Scala compiler could annotate such parameters as @Nonnull, but it must emit null-checks for each such parameter at the top of the method. This null check throws a NullPointerException if the argument is JVM null.
Note that this null check could be omitted if the compiler can prove that the method is only ever called from Scala code (based on the visibility of the method and its class).
Note also that the requirement is that the potential NullPointerException is thrown before executing any Scala code in the method. The compiler can possibly optimise this away if all of the reference type parameters are dereferenced before the method does anything else.
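The emitted check would behave roughly like the hand-written guard below. This is a sketch in present-day Scala: Greeter and greet are hypothetical names, and java.util.Objects.requireNonNull stands in for whatever check the compiler would actually generate.

```scala
import java.util.Objects

object Greeter {
  // A method taking a plain (non-Option) reference parameter.
  def greet(name: String): String = {
    // Guard at the top of the method: throws NullPointerException
    // before any other code in the body runs.
    Objects.requireNonNull(name, "name")
    "Hello, " + name
  }
}
```

Java callers that pass null fail immediately at the method boundary, rather than at some later point where the null is finally dereferenced.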
*(An important implication of this is that a Scala object may not have two methods overloaded on the same name, whose parameters differ only in whether or not they take Option[RefType] or RefType, since such types are represented by identical JVM types. We address a possible way round this in Appendix C.)
In the absence of any @Nonnull annotation on the Java method, it can be assumed to return a nullable reference. Such a method appears to Scala code to return a value of type Option[RefType], which the Scala code likely pattern-matches to test for a value. For example, Java method
Object Dictionary.get(@Nonnull Object key)
appears to Scala code to have the signature
get(key: AnyRef): Option[AnyRef]
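A present-day approximation of this signature, with the pattern match the text anticipates, might look as follows. DictionaryInterop, get and describe are hypothetical helper names; only java.util.Dictionary and Option are real library types.

```scala
import java.util.Dictionary

object DictionaryInterop {
  // Wraps the nullable Java result: null becomes None, anything else Some(value).
  def get(dict: Dictionary[AnyRef, AnyRef], key: AnyRef): Option[AnyRef] =
    Option(dict.get(key))

  // Typical caller: pattern-match on the Option instead of testing for null.
  def describe(dict: Dictionary[AnyRef, AnyRef], key: AnyRef): String =
    get(dict, key) match {
      case Some(value) => "found " + value
      case None        => "absent"
    }
}
```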
If the Java method is declared (via @Nonnull annotation) not to return null, the Scala compiler represents it as returning a plain reference. The Scala compiler must emit code to check the return result, either explicitly or implicitly (by ensuring that the result is immediately dereferenced), and throw a NullPointerException immediately if it receives null.
Java (unlike Scala) can expose public fields without accessor methods. Java fields of type RefType appear to Scala to be of (unboxed) type Option[RefType].

If they are marked as non-nullable, they are represented in Scala as being of type RefType. The Scala compiler must emit a null-check on attempts to read the field.
The Scala compiler shall mark all of its method parameters and return results for reference types as not-nullable, with annotations defined by JSR-305 (as soon as JSR-305 becomes finalised).
Other changes to the Scala library (except for Arrays) are expected to be minimal. In most cases, Scala modules are already written using Option instead of nullable types.
As outlined in the specification, above, Java methods may be annotated to declare that they do not accept or return null references.

Unfortunately there is no common standard for such annotations, and much existing Java code is not annotated in this way. However, annotations would greatly ease non-null Scala integration with Java code (and avoid an explosion of Options when interoperating with Java).
We outline a general mechanism here for applying such annotations to existing, compiled, 3rd-party Java code (including the standard Java libraries).
There are several candidate annotation schemes for marking Java fields, method parameters and method results as non-null. The most prominent appear to be:
The general philosophy taken by this SIP is that as many kinds of non-null annotation as possible are accepted by the Scala compiler. The Scala compiler shall emit JSR-305 annotations—at least once that standard has been finalised. It could emit other annotations too, perhaps by a compiler flag.
The JetBrains Java IDE, IntelliJ IDEA, can detect nullability issues through static analysis of Java code, decorated with nullability annotations. The annotations are released under an Apache licence, and have been submitted to Sun for possible inclusion in the standard Java SDK. The annotations are in the Java package org.jetbrains.annotations.

- @Nullable: the annotated element may be null. Assumed in the absence of any other annotation.
- @NotNull: the annotated element must not be null.

FindBugs is “a program which uses static analysis to look for bugs in Java code. It is free software, distributed under the terms of the Lesser GNU Public License.” It uses annotations to indicate nullability of fields, method parameters and method results. All annotations are in the Java package edu.umd.cs.findbugs.annotations.

- @Nullable: the annotated element may be null. Assumed in the absence of any other annotation. “In practice this annotation is useful only for overriding an overarching NonNull annotation.”
- @NonNull: the annotated element must not be null.
- @DefaultAnnotation: @NonNull, […]. In particular, you can use @DefaultAnnotation(NonNull.class) on a class or package, and then use @Nullable only on those parameters, methods or fields that you want to allow to be null.
- @DefaultAnnotationForFields: @DefaultAnnotation except it only applies to fields.
- @DefaultAnnotationForMethods: @DefaultAnnotation except it only applies to method [result]s.
- @DefaultAnnotationForParameters: @DefaultAnnotation except it only applies to method parameters.

“This JSR will work to develop standard annotations (such as @NonNull) that can be applied to Java programs to assist tools that detect software defects.”
The following annotations are applicable. They were interpreted from the current (as of 2 January 2009) SVN checkout of the JSR-305 source code. All annotations shown below are in the Java package javax.annotation.

- @Nullable: the annotated element may be null. Assumed in the absence of any other annotation.
- @Nonnull: the annotated element must not be null.
- @ParametersAreNonnullByDefault
- @ParametersAreNullableByDefault: @ParametersAreNonnullByDefault). This annotation implies the same ‘nullness’ as no annotation. However, it is different from having no annotation, as it is inherited and it can override a @ParametersAreNonnullByDefault annotation at an outer scope.
It does not seem a great stretch to suggest that most Java code is not annotated for nullability. Most importantly, the standard Java APIs are certainly not annotated. As an (important) part of this proposal, we outline a mechanism by which preexisting Java APIs may be annotated, without recompiling, and without requiring access to the source code.
The user (or Scala developers) can provide an ‘annotations’ file which overlays annotations onto an existing, compiled Java library. The annotations file is essentially minimal Scala code describing an entire package, without any method bodies. In addition, it need not duplicate anything which cannot be copied or inferred from the Java library itself.
It uses a ‘?’ prefix before type names to indicate that in Java they may be null, and in Scala they will be represented as an Option[T]. (Additionally, see Appendix A.)

For example, an annotations file covering java.util might look, in part, like this:
    // Comments are allowed
    package java.util

    class BitSet {
      // These methods’ parameters not allowed to be null:
      and(BitSet): Unit
      andNot(BitSet): Unit
      //…etc
    }

    class Date {
      this(String) // Constructor does not accept nulls
    }

    class Calendar {
      fields: Array[Int] // Non-null protected fields
      isSet: Array[Boolean]
    }

    object Currency {
      // Parameter not null; result may be null.
      getInstance(Locale): ?Currency
    }
Note that:

- import statements. The compiler looks at the original Java bytecode to determine what the candidate types are. In cases of ambiguity (if a package uses path.package1.ClassName as well as path.package2.ClassName) the annotation file may use the fully qualified type name, or as much of the end of the fully-qualified path as required to disambiguate: for example, package2.ClassName.
- @beforeConstructor (where the class parameter is taken as being the class in which it appears) and @calledByFinalize. It is anticipated that this will be rarely used.

It is anticipated that a Scala compiler would ship with annotation files for several of the core JDK packages, though perhaps not all of them. (There are a lot.)
In practice, the Scala developer will wish to write annotation files for lesser-used JDK libraries, third-party libraries (for example, Hibernate), and any internally developed, or proprietary libraries.
The Scala compiler will search for annotation files—named “packagename.scalax”—for all Java packages used, in the following locations:
Annotation files for the same Java package ‘stack’. So if the compiler ships with an annotation file for an older version of a particular Java package, the developer can augment this with a very limited annotation file which addresses only certain new classes and methods.
It may be necessary to develop rules whereby more specific, developer annotation files override more general, compiler-supplied ones. However, these rules are not specified here.
I had the chance during the last MVP summit in March 2007 to talk about non-nullable types with the C# team. […] Indeed, we agreed that something like 70% of references of C# programs are likely to end-up as non-nullable ones. —Patrick Smacchia, codebetter.com, 2007
Idiomatic Scala code is likely to benefit from the proposed changes without changing greatly, as it makes use of Option in preference to null. However, code which makes heavy use of Java libraries is likely to have to undergo significant repairs.

This blow is softened by several advantages to removing null:

- None is sorta-kinda the same thing as null. Also, the type system is now simpler.
- Option[RefType] is as space-efficient as a conventional object reference.

There is persuasive evidence to suggest that removing null from a language flushes out faults from existing code. Anecdotally, according to the creator of C#, “50% of the bugs that people run into today, coding with C# in our platform, and the same is true of Java for that matter, are probably null reference exceptions.”

Where analyses have been performed of use of null in Java, it turns out that non-nullable references predominate, approximately 70%:30%. [REFERENCE FROM JAVA]
It is clear that this change would significantly break binary and source compatibility. It would also potentially break several existing assumptions about Scala language and library behaviour, not limited to the assumption of:

- null and the type Null.
- length elements.
- Option.
- Option being a subclass of AnyRef.
- Option having an initial value of null.
- null.

It is clear that this is a significant change which, if it were to be made, would be best made before Scala had accumulated much ‘legacy’ code.
The clearest migration path would be one in which there were an intermediate form of the Scala language in which current constructs were allowed (but warnings flagged), while the new, non-nullable constructs were also allowed.
A version of the compiler enforces (or at least emits warnings to encourage) the new instance variable initialisation rules. This is an essential part of allowing a language without null, and is likely to have the greatest (potentially awkward) impact on existing code.
The Array class adds the clear method (as described above), and its use is encouraged. However, other array semantics remain unchanged.
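As an illustration of the kind of repair the stricter initialisation rules would call for, the sketch below contrasts a field that is left holding null with one that is explicitly initialised to None. The object, class and field names are hypothetical, and the code is ordinary present-day Scala rather than the proposed compiler behaviour.

```scala
object InitMigration {
  // Today: a reference-typed field may be left holding null.
  class LegacyPerson {
    var nickname: String = null
  }

  // Under the proposed rules: the absence of a value is spelled out as None.
  class Person {
    var nickname: Option[String] = None
  }

  // Callers handle the missing case explicitly instead of
  // risking a NullPointerException on dereference.
  def greeting(p: Person): String =
    p.nickname.map(n => "Hello, " + n).getOrElse("Hello")
}
```

A compiler in this migration step would presumably warn on the LegacyPerson field and accept the Person field unchanged.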
null
and None
A version of the compiler emits binaries in the new format (with unboxed Option objects), but Scala code is still allowed to manipulate nullable reference types. Array has the new semantics (except where noted below). Depending on a compiler flag, either:
null to any Scala variables, or if it passes null to any Scala methods (including storing null in arrays); Array.apply may return null;
Option objects; warnings are emitted for any use of null in Scala code. Array.apply never returns null (but throws NoSuchElementException instead).
In this step, depending upon a compiler flag:
Option objects; warnings are emitted for any use of null in Scala code. Array.apply never returns null. (As per the second option in step 2.)
null is still a reserved word, but any use of it causes a compiler error.
Though not essential to implementation of this SIP, there are three features which could make working with Options simpler and less verbose:
Option
An alias for Option[Type] would be ?Type. Similarly Option[Option[Type]] could be written ??Type, and so on, for any depth of nested Options. If Option is named without a type parameter [is this even possible?], the ‘?’ notation cannot be used.
This would be only a syntactic shortcut. Option[T] and ?T could be used interchangeably. That is, for a given type, T, Option[T] ≣ ?T
Note that this is the syntax used for nullable types in the language Nice.
It is also similar to the syntax used for nullable value types by C#, and nullable types in JavaFX Script (which place the question mark after the type name: T?).
Removing null from the Scala language means that, potentially, more class instance variables must be assigned values, whereas before they could be left to the default value, null.
It could be convenient to allow classes to specify, through their companion object, a default value. For example, imagine that the programmer writes the following code:
var someList: List[String]
Under the new requirements that all instance variables be assigned a value, this would be disallowed. In this example, an appropriate default value would be Nil, and it could be convenient to have the compiler supply this automatically. The compiler could check the companion object of the type for a no-argument method called ‘apply’. In the example above, the code would be rewritten by the compiler as:
var someList: List[String] = List()
Assuming that the types could be inferred and result type matched (in this example, ‘apply’ would need to return a value compatible with List[String]), the code would compile.
Most default values would be simple immutable constants and the call might be inlined by the compiler:
var someList: List[String] = Nil
Note that arrays of reference types would not be initialised by default, both for performance reasons, and to avoid changing array semantics too much.
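The rewrite described above can be imitated by hand in present-day Scala, which may help to show what the compiler would be generating. In this sketch, Settings is a hypothetical user-defined type whose companion object offers a no-argument apply; the initialisers are written out explicitly, since no current compiler inserts them.

```scala
object DefaultValues {
  // Hypothetical user type whose companion offers a no-argument apply.
  final class Settings(val retries: Int)
  object Settings {
    def apply(): Settings = new Settings(3)
  }

  // Under the proposal the programmer would write only:
  //   var settings: Settings
  // and the compiler would be imagined to supply the initialiser below.
  var settings: Settings = Settings()

  // The List example from the text: List() evaluates to the empty list, Nil.
  var someList: List[String] = List()
}
```

A type whose companion has no suitable apply would, on this reading, still need an explicit initial value.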
As discussed above, overloaded Scala methods would no longer be able to differ only in whether the type of a parameter is an Option[RefType] as opposed to a naked RefType. In other words, a class may not have both of the methods meth(s: String): Unit and meth(s: Option[String]): Unit.
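For comparison, the two overloads are distinct in present-day Scala, because Option[String] is a separate boxed class at the JVM level. The sketch below (object and method names are illustrative only) compiles today; with the unboxed representation proposed here, both methods would take a plain String reference in the bytecode and so would clash.

```scala
object Overloads {
  // JVM parameter type today: java.lang.String
  def meth(s: String): String = "RefType overload"

  // JVM parameter type today: scala.Option (boxed), so no clash with the above.
  def meth(s: Option[String]): String = "Option overload"
}
```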
We might relax this restriction by mangling the names of methods with RefType parameters.
The most general method (from the point of view of a Java program wishing to call it) would have an Option[RefType] in the place of every RefType, as this would allow a Java caller to call it with any value, including null. The most restrictive method would not allow Java null, and hence would be declared with (non-nullable) reference types.
Where two methods are identical in respect of JVM parameter types, but differ in Scala type, each parameter which is either a RefType or an Option[RefType] is numbered r0 to rn. Assume that ‘isOption’ is a function within the compiler which returns 1 if the given numbered parameter is of type Option[RefType], and 0 if it is of type RefType. The method is then assigned an unsigned integer number, rank, which is formed by taking isOption(r0) as the most-significant bit, isOption(r1) as the next-most-significant bit, up to isOption(rn) as the least-significant bit. We rename the method as “originalMethodName$rank” in the generated bytecode.
We add a method with the original, unmangled name, as an alias to the highest-ranked (most general) method.
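The rank calculation can be sketched in a few lines. This is only an illustration of the bit-packing described above, not compiler code: the Param type, its three cases and the function names are assumptions made for the example, and parameters that are neither RefType nor Option[RefType] are simply skipped when numbering.

```scala
object OverloadRank {
  // Hypothetical classification of a method parameter.
  sealed trait Param
  case object Ref extends Param    // naked RefType
  case object OptRef extends Param // Option[RefType]
  case object Other extends Param  // value types etc.; not numbered

  // isOption for each numbered parameter, packed with r0 as the
  // most-significant bit and rn as the least-significant bit.
  def rank(params: Seq[Param]): Int = {
    val bits = params.collect {
      case Ref    => 0
      case OptRef => 1
    }
    bits.foldLeft(0)((acc, bit) => (acc << 1) | bit)
  }

  // Name as it would appear in the generated bytecode.
  def mangled(name: String, params: Seq[Param]): String =
    name + "$" + rank(params)
}
```

With these definitions, meth(s: String) would be emitted as meth$0 and meth(s: Option[String]) as meth$1, and an overload with Option[RefType] in every numbered position has every bit set.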
this reference escape during construction”, IBM DeveloperWorks, Brian Goetz, 2002
Copyright © 2008, Andrew Forrest