Deterministic Finalization

10/5/2003 6:30:53 PM

One of the features that developers have missed in moving to the .NET Framework is deterministic finalization. If you followed some of the internet discussion, and even some of the comments from Microsoft developers, you would be led to believe that deterministic finalization within the .NET Framework is impossible.

Well, I have long suspected that deterministic finalization is not incompatible with the implementation of garbage collection in .NET. If you look closely at the new Whidbey feature list for Managed C++ in TLS310, you will find that Managed C++ gains a new feature: deterministic finalization.

In fact, the concept of deterministic finalization (object cleanup, whether it be destruction or finalization, upon last use) is independent of the concept of garbage collection. The confusion occurs because the .NET garbage collector calls the Finalize method of an object before reclaiming its memory, and, in the previous unmanaged world, destructors, the precursors to finalizers, were called right before deletion, either as the result of an explicit delete call or the unwinding of the stack at function exit. Another cause of confusion is the belief that only reference counting can provide deterministic finalization for heap-based objects. This has led to at least one project that adds reference counting to .NET.
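The two unmanaged destruction paths mentioned above can be made visible with a minimal C++ sketch. The `Resource` type, the `events` log, and the function name are hypothetical, introduced only to record the order in which things happen.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log, used only to make the ordering observable.
static std::vector<std::string> events;

struct Resource {
    std::string name;
    explicit Resource(std::string n) : name(std::move(n)) {
        events.push_back("acquire " + name);
    }
    ~Resource() { events.push_back("release " + name); }
};

// Runs one scope and returns the recorded events in order.
std::vector<std::string> destructor_timing_demo() {
    events.clear();
    {
        Resource on_stack("stack");                 // destroyed when the block exits
        Resource* on_heap = new Resource("heap");   // lifetime is up to the programmer
        delete on_heap;                             // destructor runs here, then memory is freed
        events.push_back("end of block");
    }                                               // on_stack's destructor runs here
    return events;
}
```

Calling `destructor_timing_demo()` from a small `main` shows the heap object being released at the explicit `delete` and the stack object being released only when its block ends. In both cases cleanup and deallocation happen together, which is exactly the coupling that causes the confusion.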

(Reference counting has other problems: lack of thread safety without the use of costly locks, inability to resolve cyclic references, and complicated exception handling. Reference counting still requires an explicit null or other assignment to remove a dead reference. Also, in comparison to .NET garbage collection, reference counting forces all objects containing references to have destructors, whereas garbage collection removes the traditional and most common need for destructors in the first place, which was to free memory; the main use of .NET finalizers has now shifted towards freeing unmanaged resources.)
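The cyclic-reference problem is easy to reproduce with C++'s reference-counted `std::shared_ptr`. This is only a sketch: `Node`, `live_nodes`, and `cycle_demo` are hypothetical names, and the `std::weak_ptr` is there solely so the demo can break the cycle afterwards and clean up after itself.

```cpp
#include <cassert>
#include <memory>

static int live_nodes = 0;  // hypothetical counter of constructed-but-not-destroyed nodes

struct Node {
    std::shared_ptr<Node> next;
    Node() { ++live_nodes; }
    ~Node() { --live_nodes; }
};

struct CycleResult {
    int alive_after_scope;  // nodes still alive once both local references are gone
    int alive_after_break;  // nodes still alive after one link is cleared by hand
};

CycleResult cycle_demo() {
    live_nodes = 0;
    std::weak_ptr<Node> watcher;  // observes a node without keeping it alive
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;              // a and b now keep each other alive
        watcher = a;
    }
    int after_scope = live_nodes; // neither destructor has run

    // Reference counting needs an explicit assignment to drop the dead reference.
    if (auto a = watcher.lock()) {
        a->next.reset();
    }
    int after_break = live_nodes;
    return {after_scope, after_break};
}
```

Both nodes survive the end of their scope because each holds a count on the other; they are destroyed only once one link is cleared explicitly. A tracing garbage collector has no such difficulty, since it reasons about reachability rather than counts.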

If we decouple the concept of deterministic finalization from object deletion (or deallocation), it becomes easy to see that deterministic finalization is indeed possible in .NET and no more difficult to implement than in C++. Finalization only needs to occur sometime before an object is deallocated, preferably right after last use; it does not have to occur immediately before deallocation. We just need to accept that the object may remain allocated long after it is finalized and no longer referenced. With this decoupling, it becomes clear that this is a problem for compilers, which have more information about object lifetimes, to solve, not the runtime.
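The decoupling can be sketched even in C++, where cleanup and deallocation normally travel together. The `Connection` class and its `dispose` method below are hypothetical; the point is only that an object can be finalized at last use while its memory stays allocated until later.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource holder whose cleanup is separate from its deallocation.
class Connection {
    bool open_ = true;
public:
    bool is_open() const { return open_; }
    void dispose() { open_ = false; }  // idempotent cleanup, safe to call more than once
    ~Connection() { dispose(); }       // safety net, playing the role of a finalizer
};

bool finalized_but_still_allocated() {
    std::unique_ptr<Connection> conn(new Connection);
    conn->dispose();                   // finalization happens here, at last use
    // The object is still allocated and can still be queried safely.
    return conn != nullptr && !conn->is_open();
}                                      // the memory itself is released only here
```

In .NET the gap between the two moments is simply longer and not under the program's control: the garbage collector reclaims the memory whenever it gets around to it.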

If you take this definition of deterministic finalization, you will see that C# already has the feature available through the using keyword. The keyword is actually syntactic sugar for the .NET Dispose pattern, complete with exception handling. The .NET Dispose pattern is the official method for implementing deterministic finalization, but what developers really want is automatic compiler detection of last use and generation of the appropriate calls to the Dispose method. That way, the boilerplate code is eliminated and the chance of programmer error (or omission) is reduced to zero.
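A rough C++ analogue of what a using block guarantees is a small guard object whose destructor calls the cleanup method, so cleanup runs on every exit path, including an exception. This is a sketch, not how the C# compiler does it (C# emits a try/finally); `File`, `ScopedDispose`, and `disposed_after_exception` are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical disposable type; stands in for anything with a Dispose-like method.
struct File {
    bool disposed = false;
    void dispose() { disposed = true; }
};

// Guard that calls dispose() on its target when the enclosing scope exits.
template <typename T>
class ScopedDispose {
    T& target_;
public:
    explicit ScopedDispose(T& target) : target_(target) {}
    ~ScopedDispose() { target_.dispose(); }
    ScopedDispose(const ScopedDispose&) = delete;
    ScopedDispose& operator=(const ScopedDispose&) = delete;
};

bool disposed_after_exception() {
    File f;
    try {
        ScopedDispose<File> guard(f);  // plays the role of the using block
        throw std::runtime_error("simulated failure");
    } catch (const std::runtime_error&) {
        // By the time control arrives here the guard has already run.
    }
    return f.disposed;
}
```

The guard still has to be written out by hand at every use, which is the verbosity the wished-for compiler support would remove.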

Implementing deterministic finalization for value types is fairly straightforward. (This may already be available in Managed C++ now.) What about heap-based objects? Just as in C++, you would use the delete keyword to call the object's finalizer, because any heap-based object in any programming language has an unknown lifetime. Calling delete does in fact call the finalizer in current versions of Managed C++.

But let's go further and ask: is there a way to treat heap-based objects like stack objects? In Managed C++, all references to class objects use pointer notation, so in order for a variable to refer to a string, the user must use the type String * instead of the String type used in C#. While this pointer can seem redundant now, one possible implementation of deterministic finalization in MC++ could be to interpret a pointer-less String type, or any other reference class object, as a stack-like object that is automatically destructed, much like a value type. This is just like using the using keyword with the entire function as the using block, but without the verbose syntax. Such a stack-based variable could contain a pointer to the actual heap object but exhibit value semantics, such as invoking the copy constructor when passed to a function and the assignment operator when assigned to another variable. Objects could be passed by reference by using the String & reference notation. I don't know what the C++ guys are up to, but this seems like a very natural extension to Managed C++ to support deterministic finalization.
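Here is what such a stack-like wrapper around a heap object might look like in plain C++. It is only a sketch of the idea, not the Whidbey design: `StackString`, `live_strings`, and the helper functions are hypothetical.

```cpp
#include <cassert>
#include <string>

static int live_strings = 0;  // hypothetical counter of live heap strings

// Lives on the stack, owns a heap object, and behaves like a value.
class StackString {
    std::string* heap_;  // pointer to the actual heap object
public:
    explicit StackString(const std::string& s) : heap_(new std::string(s)) { ++live_strings; }
    StackString(const StackString& other) : heap_(new std::string(*other.heap_)) { ++live_strings; }
    StackString& operator=(const StackString& other) {
        if (this != &other) *heap_ = *other.heap_;  // copy contents, keep own storage
        return *this;
    }
    ~StackString() { delete heap_; --live_strings; }  // cleanup at end of scope
    std::string& value() { return *heap_; }
};

void shout_copy(StackString s) { s.value() += "!"; }       // by value: works on a copy
void shout_in_place(StackString& s) { s.value() += "!"; }  // by reference: same object

std::string value_semantics_demo() {
    StackString a("hi");
    shout_copy(a);       // a is unchanged
    shout_in_place(a);   // a is modified
    StackString b("other");
    b = a;               // assignment copies the contents
    b.value() += "?";    // changing b does not touch a
    return a.value() + "|" + b.value();
}
```

Passing by value leaves the caller's object alone, passing by reference modifies it, and every heap string is released as soon as its owning variable goes out of scope, with no cleanup code at the call site.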

In C#, we could clean up the using syntax by offering a new auto storage-class modifier, borrowed from C++, since C# does not use pointer notation. (In C++, auto is not very commonly used, because it essentially acts as a no-op wherever it appears.) The "auto" keyword would indicate that the object should be disposed upon leaving the current scope. In addition, we could have auto classes, which would have value semantics by default; the differences between classes and value types are much smaller in C# than in C++. This would be an interesting addition to C#, but, most likely, it would add simplexity (a simple feature with complex ramifications) to the language.

I've heard that the STL has a feature similar to what I described above, where you can declare a variable of type auto_ptr, which is a zero-overhead class that contains a pointer and automatically frees it at the end of the scope in which it is declared.
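For anyone trying this today: `std::auto_ptr` was deprecated in C++11 and removed in C++17, and `std::unique_ptr` is its replacement. A minimal sketch of the same end-of-scope behavior, with the hypothetical names `Widget`, `live_widgets`, and `smart_pointer_demo`:

```cpp
#include <cassert>
#include <memory>

static int live_widgets = 0;  // hypothetical counter of live Widget objects

struct Widget {
    Widget() { ++live_widgets; }
    ~Widget() { --live_widgets; }
};

struct ScopeCounts {
    int inside;  // live widgets while the smart pointer is in scope
    int after;   // live widgets once the scope has ended
};

ScopeCounts smart_pointer_demo() {
    live_widgets = 0;
    ScopeCounts counts{0, 0};
    {
        std::unique_ptr<Widget> w(new Widget);  // owns the heap object
        counts.inside = live_widgets;
    }                                           // w's destructor deletes the Widget here
    counts.after = live_widgets;
    return counts;
}
```

The pointer wrapper itself lives on the stack, so its destructor runs deterministically and takes the heap object with it.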

About

Net Undocumented is a blog about the internals of .NET including Xamarin implementations. Other topics include managed and web languages (C#, C++, Javascript), computer science theory, software engineering and software entrepreneurship.
