Nullables Design Change
Somasegar reports in “Nulls Not Missing Anymore” on a major design change to the Nullable data type in the CLR: nullables have acquired special runtime status. For example, a Nullable is now boxed to a reference wrapper around the underlying value, or to a null reference if it has no value. This change restores the continuity between nullable value types and reference types.
To understand the change, it’s helpful to review the alternative implementations.
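A minimal sketch of the boxing behavior described above, as it appears in the shipped runtime. The class and method names (BoxingDemo, IsNull) are hypothetical and chosen only for illustration.

```csharp
using System;

static class BoxingDemo
{
    // Unconstrained generic null check; the comparison boxes the value,
    // so an empty nullable compares equal to null.
    public static bool IsNull<T>(T value)
    {
        return value == null;
    }

    // Boxes a nullable int: an empty nullable becomes a null reference,
    // a non-empty one becomes a boxed int.
    public static object Box(int? value)
    {
        return value;
    }

    static void Main()
    {
        object boxedNone = Box(null);
        object boxedFive = Box(5);

        Console.WriteLine(boxedNone == null);                   // True
        Console.WriteLine(boxedFive.GetType() == typeof(int));  // True

        int? roundTrip = (int?)boxedNone;   // unboxing null yields an empty nullable
        Console.WriteLine(roundTrip.HasValue);                  // False

        Console.WriteLine(IsNull<int?>(null));                  // True
    }
}
```

Because the box holds a plain int, the same object can be unboxed either to int or to int?.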
Nullable Types as Typed Boxed Value Types
In C#, a value can be boxed, but the underlying value of the new heap object cannot be accessed without unboxing, or modified at all except through interfaces or reflection. Without interfaces or reflection, boxed values are essentially immutable and opaque; no methods, fields or properties can be called.
In contrast, C++ does allow value types to be referenced directly on the heap and have their methods called. In managed C++, interior pointers could point to the underlying value of a boxed object in a typed way. In C++/CLI, one simply holds a handle (int^) to the value type.
The initial proposal for C# was to store nullable types on the heap as typed wrappers. In this case, the question mark syntax (e.g., int?) could be thought of as a managed pointer type. This proposal was shot down because of performance concerns from customers, according to Cyrus Blather.
Nullable Types as Generic Value Types
The actual implementation of nullable types used a generic struct Nullable<T> containing the underlying value plus a boolean to indicate null status. In C#, the null keyword acquired additional behavior when used with nullable types to check for and assign a “null” value.
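The layout described above can be sketched roughly as follows. Maybe<T> is a hypothetical, simplified stand-in written for this example; it is not the framework's Nullable<T> source and omits the conversions and language support.

```csharp
using System;

// Hypothetical stand-in for Nullable<T>: the underlying value plus a flag.
struct Maybe<T> where T : struct
{
    private readonly bool hasValue;
    private readonly T value;

    public Maybe(T value)
    {
        this.hasValue = true;
        this.value = value;
    }

    public bool HasValue { get { return hasValue; } }

    public T Value
    {
        get
        {
            if (!hasValue) throw new InvalidOperationException("No value present.");
            return value;
        }
    }
}

class MaybeDemo
{
    static void Main()
    {
        Maybe<int> empty = default(Maybe<int>);  // plays the role of "null"
        Maybe<int> seven = new Maybe<int>(7);

        Console.WriteLine(empty.HasValue);  // False
        Console.WriteLine(seven.Value);     // 7

        // An ordinary struct always boxes to a real object, even when "empty".
        object boxed = empty;
        Console.WriteLine(boxed == null);   // False
    }
}
```

The last line shows why a purely library-level struct cannot make an empty value look like a null reference once it is boxed.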
The performance of this approach was close to that of non-nullable value types. However, since nullable types were not integrated into the runtime, the new language extensions broke down when nullable values were boxed or used from within generic methods.
Any good language designer could smell the stench of bad design and realize that this defect was likely to have major ramifications and to prompt additional design reviews in the future.
One reviewer, “Dr. Pizza,” criticized it in a C# developer’s blog:
How about get rid of the nullable types abomination and start over from scratch. How that abominable "feature" ever made it into the finished product is beyond me. It should have been shouted down as a retarded piece of idiocy right from the start. It's like they actively set out to avoid a consistent orthogonal design.
I started out vehemently angry with Nullable. Then i saw how people were using it and how it made thigns so much easier for people, and i'm much happier with how it's going to ship.
I also feel that it will integrate well with the future of C# and that having it in place now so that people can start using it in thier apps/systems is essential.
My own initial reaction to the news of nullable types was one of concern. A negative gut response is usually a bad sign. Great designs like iterators tend to be instantly recognizable.
Nullable types initially didn’t appear to add to the language and seemed best left to the framework. My concern eased when I read the spec and learned more about nullable support for operator lifting, coalescing, and ternary booleans. Coalescing works, for example, across both reference and nullable types. It also appeared to integrate naturally with sequences in COmega and constraints in Spec#, both potentially new features in future versions of C#.
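A short sketch of the three features mentioned above, using the C# 2.0 syntax as shipped. The class and helper names (LiftingDemo, Add, OrDefault) are hypothetical.

```csharp
using System;

static class LiftingDemo
{
    // Lifted operator: + on int? yields null if either operand is null.
    public static int? Add(int? x, int? y)
    {
        return x + y;
    }

    // Coalescing on a nullable value type.
    public static int OrDefault(int? x, int fallback)
    {
        return x ?? fallback;
    }

    static void Main()
    {
        Console.WriteLine(Add(3, 4));              // 7
        Console.WriteLine(Add(3, null).HasValue);  // False
        Console.WriteLine(OrDefault(null, -1));    // -1

        // The same ?? operator works on reference types.
        string missing = null;
        Console.WriteLine(missing ?? "none");      // none

        // bool? follows three-valued logic for & and |.
        bool? unknown = null;
        bool? yes = true;
        bool? no = false;
        Console.WriteLine((unknown & no) == false);   // True
        Console.WriteLine((unknown | yes) == true);   // True
        Console.WriteLine((unknown & yes).HasValue);  // False
    }
}
```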
However, the design flaws that quickly emerged left me second-guessing. Some of the enhancements, like operator lifting, would be better handled by more general support for all data types.
Nullable Types as Intrinsic Generic Value Types
The new implementation preserves more of the continuity of the first approach while retaining the performance of the second approach by special-casing the boxing and other operations.
There are still differences between the new approach and the “typed reference” approach.
- Nullable types are immutable.
- Nullable types exhibit value semantics, rather than reference semantics. Values are always copied across function call boundaries.
- Methods, fields and properties of the underlying value in a nullable type cannot be called directly.
However, these problems are not serious, because it is easy to create a new generic reference type called Box<T> that takes a value type parameter, T, and removes these differences.
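One possible shape for such a type is sketched below. Box<T> here is a hypothetical class written for this post, not a framework type, and Counter is a made-up struct used only to exercise it.

```csharp
using System;

// Hypothetical reference-type wrapper around a value type.
sealed class Box<T> where T : struct
{
    // A public field (not a property) so the underlying value is reached in place.
    public T Value;

    public Box(T value)
    {
        Value = value;
    }
}

// Made-up mutable struct for the demonstration.
struct Counter
{
    public int Count;

    public void Increment()
    {
        Count++;
    }
}

class BoxDemo
{
    // The callee receives a reference, so its changes are visible to the caller.
    static void Bump(Box<Counter> box)
    {
        box.Value.Increment();
    }

    static void Main()
    {
        Box<Counter> none = null;  // "no value" is just a null reference
        Box<Counter> box = new Box<Counter>(new Counter());

        Bump(box);
        Bump(box);

        Console.WriteLine(box.Value.Count);  // 2
        Console.WriteLine(none == null);     // True
    }
}
```

The sketch covers the three differences listed above: the wrapped value is mutable, it is shared by reference across calls, and its members are called directly through the field. The cost is a heap allocation per box, which is the performance trade-off the first approach was rejected for.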
I applaud this design change and believe it to be superior to the first two approaches.