Nullable reference types in C#, generics

I’ve described some design requirements for implementing non-nullable and explicitly-nullable reference types in C#, and a design which meets those requirements.

However, there are two major items I’ve not yet discussed: how these null-aware types interact with .NET generic types, and how they interact with legacy code containing implicitly-nullable reference types.

In this episode, Generics:

Generic types (and methods) are in a slightly different category from the kind of code we’ve considered so far. Up until now we’ve been talking about adding constraints to our own types in our own code to disallow nulls or control them. We can update the types within our own code as we like. There is even the possibility of converting between null-aware and implicitly-nullable reference types at the boundary of old and new code.

With generic code, it’s slightly more complicated: some of the types which generic code manipulates are specified by client code. We should write generic code to be robust, and avoid NullReferenceExceptions, but our client code may want to supply implicitly nullable reference types as type parameters. (Or not.)

And from the point of view of using generic code with non-nullable types, existing generic code may be written such that it expects all types to have a default value. If it restricts a parameter to be a ‘class’, it probably expects to be able to make use of null… These and other assumptions are broken if non-nullable reference types are passed as type parameters to existing generic code.

New constraints on type parameters

To allow generic code to coexist peacefully with null-aware code, and to protect legacy code from null-aware code breaking its assumptions, we add a few new type constraints to generic type parameters:

  • default and ~default declare whether or not a type parameter needs to support a default value;
  • ! applied to a type parameter denotes that its type values will always be coerced to null-aware types.

default and ~default

For generic code to be allowed to call default(T) (where T is one of its type parameters), T must have a default value. So we introduce a new type parameter constraint ‘default’, to require this:

public class Fooo<T> where T: default {
   public T Brrrr() {
      return Something() ?? default(T);
   }
}

The trouble is, in previous versions of C#, every type had a default value, so there was never a need to declare it as a type constraint. No existing generic code declares ‘default’ constraints for type parameters.

For compatibility with existing code, ‘default’ will be assumed by the compiler for type parameters (except for type parameters on interfaces). We introduce its antithesis, the keyword ‘~default’ to denote that a type parameter does not assume a default value. For example:

public class List<T> where T: ~default {
   // Great! We can now store non-nullable ref types in generic Lists
   …
   // We’ll deal with this tricky bugger later:
   public T Find(Predicate<T> match) { … }
}

Just to be clear, that ‘~default’ is not prohibiting type T from having a default value, just declaring that it doesn’t need to have one. It’s removing a constraint, not adding one.

It means “We welcome null-aware code here! We’re happy to accept a non-nullable type parameter.” It allows the type parameter to be of any type whatsoever (subject to other type constraints of course).

Value types always have default values, so if a type parameter has a constraint of ‘struct’, the ‘default’ is implied and need not be specified.

The compiler can be set to issue warnings for generic type parameters which have not been unambiguously declared as either default or ~default (except where other constraints imply one or the other).

Obviously for ‘class’ type parameters, ‘default’ generally implies ‘nullable’.

Interfaces

Type parameters on interfaces are implicitly ‘~default’, unless declared otherwise. The reason is that we do not need to constrain implementations of the interface one way or the other. Type parameters in implementations of interfaces are still constrained to ‘default’ unless specified otherwise.

Null-aware coercion

The programmer can also declare that a generic type parameter must be null-aware, with the following syntax:

public struct Nullable<T!> { … }

Null-aware coercion is not expressed as a type constraint, but that is intentional: it doesn’t constrain the type parameter so much as modify it.

When T is already null-aware it has no effect.

When T is an implicitly-nullable reference type, it’s turned into ‘T!’. Another way of looking at it: it’s as if every reference to the type name T throughout the class (or method, or interface) is replaced with ‘T!’.

If a type parameter is coerced in this way, ‘~default’ is assumed by the compiler for that parameter (unless the programmer explicitly specifies otherwise).

Some examples

Here are some examples of how the default constraint and null-aware coercion interact:

// T1 is implicitly ~default, and coerced to a null-aware type:
public class Generic1<T1!> { … }

// T2 must be an explicitly-nullable type or a value type (or have a default value).
public class Generic2<T2!> where T2: default { … }

// T3 must be an implicitly-nullable reference type (or have a default value):
public class Generic3<T3> where T3: class, default { … }

// A non-nullable reference type with a default? Only if we allow default
// reference types, as per the previous article. It’s an unlikely combination:
public class Generic4<T4!> where T4: class, default { … }

In reality, type-generic programs should usually avoid ‘default’, because ‘default(T)’ is usually used to mean ‘no value’, and that’s much better represented by an explicitly-nullable result or parameter. (And the compiler should almost certainly produce a warning for generic code which constrains any type parameters to ‘class, default’.)

Respecting the garbage collector

When you’re finished with a (reference) object, you should let go of your reference to it; otherwise, for as long as that reference remains reachable, the garbage collector will not be able to reclaim the object.

With implicitly-nullable types, just assign null to the variable. With explicitly-nullable reference types, do the same. For value types (if you need to), you can assign default(T). However, if you don’t know the exact type (because you’re writing type-generic code), it’s trickier to ‘forget’ a value (which may be an object reference).

Boxes

One way to hold on to a value (possibly a reference) and ‘unset’ it easily, without a default being available and without incurring additional memory overhead (for struct types, T? is larger than T), is with a ‘box’:

public struct Box<V> where V: ~default {
   private readonly V _value; // By default: default(V), or ‘unassigned’
   public Box(V v) { _value = v; }
   public V Value { get { return _value; } } // Could throw FieldUnassignedEx
}

// Value we wish to be able to ‘unset’ without knowing its exact type:
Box<T> box;

box = new Box<T>(newValue); // Set the value
var v = box.Value; // Get the value
box = default(Box<T>); // Unset the value
var v2 = box.Value; // Will throw FieldUnassignedException if no default

You can use Boxes to ask for the default value of a type—in a way which will always compile (no matter the type), but will fail at runtime if the type does not have a default. We’ll use this trick later to maintain source & binary compatibility with old code:

// May fail at runtime:
var defaultIfAvailable = default(Box<T>).Value;
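The proposed Box relies on the runtime tracking ‘unassigned’ fields, which current C# cannot express. As a rough sketch only, the set/get/unset behaviour can be approximated today with an explicit flag. The names EmulatedBox and EmulatedBoxDemo are hypothetical, and InvalidOperationException stands in for the hypothetical FieldUnassignedException. Note two differences from the proposal: the flag costs extra memory, and default(EmulatedBox<int>).Value throws, whereas the proposed default(Box<int>).Value would return 0.

```csharp
using System;

// Approximation in today's C#: a flag records whether a value was ever stored.
// (The proposed Box<V> would need no flag; the runtime would track assignment.)
public readonly struct EmulatedBox<V>
{
    private readonly V _value;
    private readonly bool _assigned; // false in default(EmulatedBox<V>)

    public EmulatedBox(V value)
    {
        _value = value;
        _assigned = true;
    }

    public bool HasValue => _assigned;

    // Stand-in for the hypothetical FieldUnassignedException.
    public V Value => _assigned
        ? _value
        : throw new InvalidOperationException("Box has no value assigned.");
}

public static class EmulatedBoxDemo
{
    // Set, read, then 'unset' a value without ever writing default(T).
    public static bool SetThenUnset<T>(T newValue)
    {
        var box = new EmulatedBox<T>(newValue); // Set the value
        T v = box.Value;                        // Get the value
        box = default;                          // Unset: drops any reference held
        return !box.HasValue;
    }
}
```

Calling EmulatedBoxDemo.SetThenUnset("hello") walks through the same three steps as the snippet above, with the generic code never needing to know whether T is a reference type.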

Arrays

If you’re building a generic collection class (List<T>, for example), you might well use an array for storage. If you allow elements to be removed from your collection, you are faced with disposing of array elements somehow, in case your collection is being used to store references, and these references need to be garbage-collected. With array elements, as with single values, it’s easy enough to dispose of references if you know the type, but in generic code it’s trickier.

You could choose an implementation where the elements are stored in an array of Box<T>. To ‘forget’ an element, you would assign the value default(Box<T>) to its array slot. (This technique has other advantages, such as avoiding the implicit check for ArrayTypeMismatchException by avoiding C# array covariance.)

However, to avoid the need to implement (slightly-obscure) tricks, we redefine Array.Clear():

  • For arrays of types with default values, it behaves as it does now, and sets the elements in the given range to that default value.
  • For arrays of types without defaults, it makes the given range of array elements ‘unassigned’, the same state they were in when the array was allocated originally.
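For comparison, here is a minimal sketch of how a generic collection can already use Array.Clear to ‘forget’ a removed element in current C#. SimpleStack is a hypothetical class written for illustration; only Array.Clear(Array, int, int) and Array.Resize are real framework calls. Under the proposal, the same call would leave the slot ‘unassigned’ when T has no default.

```csharp
using System;

// A minimal generic stack that 'forgets' popped elements via Array.Clear,
// so the garbage collector can reclaim them when T is a reference type.
public class SimpleStack<T>
{
    private T[] _items = new T[4];
    private int _count;

    public int Count => _count;

    public void Push(T item)
    {
        if (_count == _items.Length)
            Array.Resize(ref _items, _items.Length * 2);
        _items[_count++] = item;
    }

    public T Pop()
    {
        if (_count == 0)
            throw new InvalidOperationException("Stack is empty.");
        T item = _items[--_count];
        // Today this resets the slot to default(T): null for references, zero for numbers.
        // Under the proposal it would mark the slot 'unassigned' when T has no default.
        Array.Clear(_items, _count, 1);
        return item;
    }
}
```

The collection code never mentions default(T) itself, which is the property the redefined Array.Clear is meant to preserve for ‘~default’ type parameters.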

Some compatibility concessions

Default default parameters

We do allow one use of the default() operator on ‘~default’ generic code type parameters: as the default value of optional parameters. For example:

public class MemoryCell<T> where T: ~default {
   // Clients can only take advantage of this default parameter when T
   // has a default value, otherwise they need to specify it explicitly:
   public MemoryCell(T initialValue = default(T)) { … }
}

public class ClientCode {
   void Method1() {
      //var m1 = new MemoryCell<FileInfo!>(); // Disallowed default parameter
      var m2 = new MemoryCell<FileInfo!>(new FileInfo(Path)); // Explicit: OK
      var m3 = new MemoryCell<int>(); // OK because default(int) is defined.
   }
   void Method2<U!>() where U: new() {
      //var m4 = new MemoryCell<U>(); // Disallowed default parameter
      var m5 = new MemoryCell<U>(new U()); // Explicit: OK
      var m6 = new MemoryCell<U?>(); // Also OK, because default(U?) is defined
   }
}

This rule exists only to conveniently maintain source compatibility with old clients of generic code. It will be flagged as a warning by the compiler, because it’s not the sort of thing we’d like to encourage.

Living without default()

Recall the example above, a null-aware version of the standard .NET collection class List<T>:

public class List<T> where T: ~default {
   // Great! We can now store non-nullable ref types in generic Lists
   …
   // Problematic, since it’s supposed to return default(T) if no match:
   public T Find(Predicate<T> match) { … }
}

The rest of the class is fine, but two methods make use of the default() operator to return an ‘unknown’ result: Find() and FindLast(). These methods could have been defined to throw an exception if they cannot find a match. And with explicitly-nullable reference types, they could be redefined to return T?… but that would break backwards compatibility.
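To see why returning default(T) is a weak way to signal ‘no match’, here is a small sketch against today’s List<T>.Find. The class and method names (FindAmbiguity, FindNegative, FindLong) are hypothetical; List<T>.Find itself is the real framework method.

```csharp
using System.Collections.Generic;

public static class FindAmbiguity
{
    // List<T>.Find returns default(T) when nothing matches.
    // For int, a miss is therefore indistinguishable from finding the element 0.
    public static int FindNegative(List<int> numbers) => numbers.Find(n => n < 0);

    // For a reference type, a miss comes back as null.
    public static string FindLong(List<string> words) => words.Find(w => w.Length > 10);
}
```

FindNegative(new List<int> { 0, 1, 2 }) returns 0 even though the list contains no negative number, and FindLong(new List<string> { "a", "bb" }) returns null. Neither result could be produced for a type with no default value.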

We can work around this, using the trick with Boxes to get the default value without using the ‘default’ keyword:

public class List<T> where T: ~default {
   [Obsolete("Use TryFind instead")]
   public T Find(Predicate<T>! match) {
      var d = DefaultOrNotSupported("Find");
      return TryFind(match) ?? d;
   }

   // A new, more useful Find method. New code should use this:
   public T? TryFind(Predicate<T> match) {
      // Return an element from the list or (T?)null.
   }

   private T DefaultOrNotSupported(string! methodName) {
      try {
         // The default value of T, or a FieldUnassignedException.
         return default(Box<T>).Value;
      } catch (FieldUnassignedException) {
         throw new NotSupportedException(
            string.Format(
               "{0} method not supported for types without a default value",
               methodName));
      }
   }
}

Newer code should use the newer TryFind() method, whose result is always ‘T?’, regardless of what ‘T’ is.

Other uses of default() operator

Looking at the source code of the current .NET List<T> implementation, there are a couple of other uses of default(T):

  • Within the list’s Enumerator struct, to reset the ‘current element’ variable for the List’s IEnumerator. To fix this, the ‘current element’ instance variable can instead be stored as a Box<T> and the ‘Current’ property changed to return ‘current.Value’. Legacy code (or any code using types with default values) will behave exactly as before. If the type does not have a default value, calling ‘Current’ when its value is undefined will throw a FieldUnassignedException. This is acceptable since the behaviour in that case is undefined, and only newer code will be affected.
  • default() is also used in various methods which take ‘object’ parameters, to test whether the type accepts nulls: “if ((object)default(T) == null)”. This would be better changed to a new property on the Box class “if (default(Box<T>).IsNull)”… which ideally the runtime can optimise away.
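The null-acceptance test from the second item works in current C# as written. A minimal sketch follows, wrapped in a hypothetical helper named NullChecks.AcceptsNull (the proposed Box<T>.IsNull property does not exist today):

```csharp
public static class NullChecks
{
    // True when T can hold null: reference types and Nullable<T>.
    // Boxing default(T) yields a null reference in exactly those cases.
    public static bool AcceptsNull<T>() => (object)default(T) == null;
}
```

AcceptsNull<string>() and AcceptsNull<int?>() are true, while AcceptsNull<int>() is false, because boxing default(int) produces a boxed zero rather than null.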

If you’ve read this far, well done.

Generic types are necessarily more complex than pretty much any other area of the language.

In my next (and penultimate) article I’m going to look at some remaining details of compatibility with existing code. The last article will sum up and compare this approach with some of the other approaches being discussed in the C# 7 feature proposals.
