Dynamic Typing in C#
The next iteration of C# is poised to become a multi-paradigm language, addressing numerous issues in programming. Most discussion so far has focused on SQL and XML data integration and on concurrency, but new features described in a recent journal submission suggest that an assault on dynamic languages is also in preparation.
Erik Meijer and Peter Drayton recently submitted “Static Typing Where Possible, Dynamic Typing When Needed: The End of the Cold War Between Programming Languages” to a journal issue on the “Revival of Dynamic Languages.” Both authors worked at Microsoft on research projects experimenting with new language extensions on top of C#. Their article explains how dynamic features can be retrofitted cleanly into a statically typed language, using examples based on C# extensions. Most of these have been published previously in articles on COmega and Spec#. The effect is pseudo-dynamic typing: dynamic-style programming on top of a statically typed language. The introduction of pseudo-dynamic typing in C# raises the possibility that C# may become pervasive as a scripting language, stealing much of the thunder from Python, Perl and Ruby.
Is it a coincidence that this publication comes so close to the expected announcement of C# 3.0 at PDC 2005? I think not, though I am not sure all of the features mentioned will make it in. C# 2.0, which introduced four major features, is essentially a light release in preparation for the mega-release C# 3.0, which was developed in parallel. (Whidbey was originally version 1.2.) Chris Brumme, a CLR architect, previously mentioned that the Whidbey release concentrated on product maturity and on fundamentals such as performance, reliability, and security, but that the following Orcas release would embark on a lot of crazy new ideas, such as those from functional and dynamic programming languages. As an indication of this, Jim Hugunin, author of IronPython, a dynamic language for .NET, was recently hired into the CLR team. I’m just glad that Orcas will be such a feature-filled release, because the next version, Hawaii, will probably not arrive until 2009 or 2010.
I have outlined below some of the major solutions that the paper mentions. Some of these have already been incorporated into C# or introduced in one of the research languages, Spec# and COmega; in those cases, I note the source in square brackets.
- Type Inference. Type inference is already offered in a number of languages, such as Haskell. Currently, C# requires type information to be specified explicitly in each declaration. With type inference, a variable is assigned the most general type that will successfully compile within a block. It relies on a technique called unification, which comes from logic programming and theorem proving and is already used to some degree for generic type inference in C# 2.0. Erik noted that static typing with type inference could result in less verbose code than dynamic typing: within object literals, for example, constructors can be invoked without specifying the type to construct.
- Contracts. [Spec#] Contracts extend the type system to support invariants, preconditions, and postconditions. They allow stronger compile-time and run-time checks. They also open up the possibility of long-running theorem-proving tools that examine the IL after compilation and establish program correctness.
- Coercive subtyping
- Type lifting across collections, null types, discriminated unions and tuples. [COmega] This is essentially generalized member access, which I mentioned in an earlier post. I would like to see a systematic, orthogonal approach that enables lifting over any type, not just the specific ones mentioned above.
- Late binding. [VB] C# may support late binding for object variables in the same fashion that VB does today.
- (Patterns). This isn’t in the paper, but a C# developer mentioned the possibility that generic types may support patterns, in addition to interfaces, to allow calling member functions by “name” as C++ templates do. I suspect that, if this is implemented, patterns will be implemented as interfaces that are dynamically painted onto an object at runtime rather than explicitly implemented at compile time.
- Dynamic scoping. A statically typed compiler emulates dynamic scoping by passing hidden arguments into any function that uses a dynamically scoped variable. Any function that calls a dynamically scoped function with an implicit argument will itself require a hidden implicit argument.
- Covariance and Contravariance in Generics. [IL 2.0] Covariant and contravariant generic type parameters are already supported in IL in Whidbey, but have not yet been integrated into any of the mainstream languages.
- Ad hoc relationships and prototype inheritance
- Expando properties. C# could offer special syntax support for untyped objects, which are essentially variables of type Dictionary<string, object>, mimicking the behavior of Perl and Python. This support would also require reflection through the IExpando interface.
- External link tables. Link tables could help C# bridge the mismatch between relational keys and object references.
- Lazy evaluation. Erik mentions streams of objects, which I think are partially addressed by iterators in C# 2.0, with more advanced support coming in Orcas.
- Eval. The article also looks at the eval capabilities of dynamic languages. Some of the more common uses of eval can be handled through closures or standard methods for deserializing data. The more advanced uses (partial evaluation, multi-stage programming and meta-programming) would rely on support for code literals (programs within programs). No specific details were provided on this point, so it’s the least likely to be addressed within Orcas.
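To make the dynamic scoping item in the list above a little more concrete, here is a rough sketch of the hidden-argument translation, written out by hand. It is in Python rather than C# purely for brevity, and it is my own illustration, not syntax from the paper: the names log, process_order and indent are all hypothetical. The idea is that the programmer would write the functions without the indent parameter, and the compiler would thread it through every function on the call path.

```python
# Hand-written version of the rewrite a compiler could perform.
# All names (log, process_order, indent) are hypothetical.

def log(message, indent):
    # 'indent' stands in for the dynamically scoped variable. In the
    # source program it would not appear in the parameter list; the
    # compiler would add it as a hidden argument.
    return " " * indent + message


def process_order(order_id, indent):
    # process_order never reads 'indent' itself, but it calls log,
    # so it needs the hidden parameter too in order to pass it along.
    return log(f"processing {order_id}", indent)


def main():
    # Each call site binds the variable for everything it calls,
    # which is what gives the effect of dynamic scope.
    outer = process_order(1, indent=0)
    inner = process_order(2, indent=4)
    return [outer, inner]
```

The cost of the approach is visible even in this toy: the hidden parameter spreads to every function between the binding site and the eventual use, which is exactly the propagation the paper describes.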
Will these features carry over to Visual Basic? Perhaps, but recent statements by various designers on the C# and Visual Basic teams suggest a growing divergence in language approaches, reflecting differences in the underlying philosophies of each. While both VB and C# are adding data features to the language, it appears that each will take a separate approach mirroring their respective Mort and Elvis personas. These emerging differences will force developers to reconsider the substitutability of C# and VB and to take sides depending on whether their priority is rapid development or code quality. (Microsoft maintains that VB, C# and C++ emphasize RAD, language innovation, and power, respectively.)
Visual Basic is all about rapid application development. It is more focused on accessibility and productivity, and is probably more likely to trade off runtime performance for the immediacy of quick background compilation and interpreter-like interactivity. This is because VB applications are more likely to be ad hoc, internal business applications that need to be churned out quickly and inexpensively. These applications are more likely to rely on off-the-shelf components to speed development. Since they are used in a fixed manner in a fixed environment, these applications have limited testing needs.
Development in C/C++ is usually too costly for routine IT work; those languages are more cost-effective for applications developed for sale. As a descendant, mostly in spirit, of the C family of languages, C# is more focused on software engineering, and hence emphasizes explicit code and error detection at compile time at the expense of programmer interactivity. It also competes with C++/CLI in offering easier access to the platform through unmanaged pointers, which are important for managing memory-mapped files, processing images and manipulating low-level system data structures. C# is used more often in commercial applications and libraries with large external customer bases. The higher cost of testing makes it more worthwhile for C# to trade off compilation time for time-consuming code analysis and inference.
These aren’t hard guidelines, but they reflect the different emphasis each language takes. The new features for reducing code are a good example. In Visual Basic, they seem intended to promote accessibility: the My classes, Handles and WithEvents. In C#, by contrast, they involve more advanced concepts (closures and iterators) and serve to spare the programmer from entering tedious, error-prone lines.
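As a rough illustration of the kind of tedious code that iterators and closures eliminate, the sketch below contrasts a hand-written iterator with a generator, and adds a small closure. It uses Python as a stand-in, so none of this is C# syntax (C# 2.0 spells the same ideas with yield return and anonymous methods), and the names CountdownIterator, countdown and make_threshold_filter are made up for the example.

```python
class CountdownIterator:
    """Hand-written iterator: the programmer maintains the state."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        # Explicit bookkeeping that is easy to get subtly wrong.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator version: the language builds the state machine."""
    while start > 0:
        yield start
        start -= 1


def make_threshold_filter(limit):
    """Closure: the returned function remembers 'limit' without
    needing a separate class to hold it."""
    return lambda value: value > limit
```

Both versions produce the same sequence, but the generator is a third of the length and has no state for the programmer to mismanage; the closure similarly replaces a small helper class with a single line.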