It’s a truth widely acknowledged that null references in programming languages lead to faults. Newer languages, such as Apple’s Swift are controlling nulls, to avoid problems.
The .NET and C# design committees are looking at introducing non-nullable reference types in a future version of C#. This post outlines what I think are the requirements of (non-)nullable references in C#. A future post will describe a possible language design which fulfils them.
Background
Existing languages, such as C# and Java, have a legacy of allowing any reference type to be null. This has several implications:
- Null is used by programmers to mark a ‘missing value’. However,
- There’s nothing in the type system to document whether a value can potentially be null or not, so:
- Programmers are required to document nullability in code comments, and enforce rules and conventions explicitly in code (which is a source of errors).
- The language runtime must check for a null reference before every object dereference (which is inefficient).
.NET 2.0 introduced the idea of nullable value types. This allows ints, doubles, bools, and other passed-by-value structures, to be marked as optional. C# uses the same terminology of ‘nullability’ to express this but ‘nullable’ value types are significantly different from nullable reference types:
- Nullable values have a different type from non-nullable values; an optional int has the type ‘
Nullable<int>
’, which is usually abbreviated to ‘int?
’. Nullable value types are actually structures which contain a value type (or don’t, in the case of ‘null
’).
- The type system does not allow a null to be assigned to a non-nullable value type.
- Nullable numeric types can be used without checking for null. For example, ‘nullableA + nonNullableB’ has a well-defined result, and does not throw a
NullReferenceException
. Nulls propagate through maths expressions, similarly to NULL values in SQL.
Requirements
It would be undeniably useful to allow reference types which exclude null. Such types might be annotated with an ‘!
’ after the type name.
This could help guarantee that the software never faults with a NullReferenceException
. It also makes something explicit in the language which is currently implicit, thus removing a documentation and coding burden. For example, annotating a method parameter as non-nullable would mean that there is no need to document that nulls are disallowed, and no need for the method to test ‘if (p == null) throw new ArgumentNullException("p");
’ on its first line.
However, that’s only a partial improvement on what we have now. There is an opportunity to also introduce:
- Explicitly nullable reference types with the same semantics as nullable value types (i.e., you’d be able to declare a parameter as ‘string?’)
- Type uniformity between explicitly nullable value and reference types. That is to say: the type ‘
System.Nullable<T>
’ would be able to accept any T regardless of whether T is a reference type or a value type.
Why?
Explicitly-nullable reference types may require more explanation: after all, they seem not to add anything new to the language. Actually they add several things:
- Safety: Explicitly-nullable reference types require explicit testing and dereferencing. ‘
if (n.HasValue) DoSomethingWith(n.Value)
’. This removes ambiguity and one source of errors.
- Intent: Marking a parameter as explicitly nullable is a stronger indication of intent than an unannotated reference-type parameter, one which would default to allowing null.
- Uniformity: There is more uniformity between reference and value types (of which, more below).
- Potential: We raise the possibility of changing the default in future versions of the language, from reference types being nullable by default, to them being non-nullable by default.
What about type uniformity between nullable values and references? What does that add?
Uniformity simply makes writing type-generic code easier. It allows us to define, for example, an interface ‘IParser<T>
’ with method ‘T? Parse(string str)
’—regardless of whether T is a reference type or a value type.
In many cases the programmer should not have to care whether a type has value semantics or reference semantics (particularly if the type is immutable). By enforcing reference-value uniformity in nullable types, we remove a sometimes-artificial distinction between value types and reference types.
[Note that being able to use ‘Nullable<T>
’ for any type T implies that it would be possible to declare ‘Nullable<Nullable<T>>
’. This would rarely be required, but again, for reasons of uniformity, it would be a welcome addition to the language. At this point ‘Nullable<T>
’ is a misleading name, and ‘Optional<T>
’ would be a better name, but I suspect that that ship has already sailed.]
Interoperating with old .NET libraries
There is one other important requirement: interoperability.
It almost goes without saying that a new language feature must interoperate cleanly with legacy code, and not break source or binary compatibility. At a minimum:
- New, null-aware code must be able to pass parameters to non-null-aware methods in other libraries. This implies that there must be a way to pass (for example) a ‘
string!
’ or a ‘string?
’ to a method which accepts an (implicitly-nullable) ‘string’. Similarly it must be possible to accept a non-null-aware reference result, and somehow massage it into a null-aware-reference form.
- Old, non-null-aware code should be able to call code written with the new null-aware style. This implies that old code which attempts to pass a ‘string’ to a method which accepts a ‘
string!
’ may have an ‘ArgumentNullException
’ thrown by the runtime system if it attempts to pass in a null
value. Similarly, old code must be able to accept a null-aware reference type result and use it as it would a non-null-aware result.
- It would ideally be possible to gradually upgrade old code by adding ‘
?
’ and ‘!
’ annotations to method parameters and results, without breaking source or binary compatibility—even overriding base class non-null-aware methods with null-aware ones.
Summary
These are my requirements. I’ll follow with another post soon, about a design for nullability in C# and .NET which fulfils these requirements.