6/11/2005 4:30:00 AM


Microsoft Research previously made public two experimental extensions of C#: Spec# and COmega (formerly Polyphonic C# and Xen). They are research projects, so their features will not necessarily enter the language, but the C# team may decide to incorporate some of them.

One of the sessions preannounced for PDC 2005 is called “Deep Integration of Relational Data and XML in Your .NET Language.” This strongly indicates that we will see parts of COmega (specifically, the data extensions formerly known as Xen) incorporated into C# 3.0. These data extensions include inline support for XML and SQL inside the programming language. It’s also implied that the next VB will support similar extensions. (If so, VB will also likely incorporate anonymous methods and iterators, as both play a major role in COmega. COmega may well have been the impetus behind those two C# 2.0 features.)

Depending on how well C# and VB integrate these features, the additions may cause Microsoft to be perceived as a true innovator in the language space, since these features aren't found in other languages. Microsoft also risks the perception that the XML and SQL languages will be improperly integrated into the programming language (e.g., Perl), just as browser technology was grafted onto the Windows 98 user interface.

I do think that these extensions, designed by researchers, are well thought out and have a strong theoretical foundation. Anders Hejlsberg said in an interview that the language designers were trying to capture the full expressiveness of relational algebra in the C# language rather than simply providing type-safe syntactic sugar for writing SQL queries. In other words, the new “select” statements work equally well on C# objects and the new XML data structures. The new XPath-like generalized member access may likewise work well on C# objects and relational data. It seems clear that the C# designers inherited the C++ philosophy of introducing general, orthogonal features.

C-Omega is a research project that explores language extensions in a couple domains. One is database integration, the other is XML. And C-Omega was effectively sort of conceived as, let's take C#, let's try and take SQL, let's try and take XML or XQuery and let's all sort of put them into one big bowl and stir and see what comes out of it, experimenting with integrated queries and so forth ...

And we've learned a lot from that prototype, and we are now working to apply a lot of that knowledge in C# and our other programming languages. ...

So what we're looking at is really trying to much more deeply integrate the capabilities of query languages and data into the C# programming language. And I don't specifically mean SQL, and I emphatically don't mean just take SQL and slap it into C# and have SQL in there. But rather try to understand what is it expressively that you can do in SQL and add those same capabilities to C#.

I was watching this video of a presentation on COmega by Gavin Bierman, a researcher at Microsoft Research Cambridge. I learned more from this presentation than from his papers, even though the presentation contains less information.

The researchers’ background in functional languages shows in COmega. I left with the impression that the C# 2.0 features of iterators, nullable types, and anonymous methods were added primarily to support COmega, especially since C# 3.0 was developed simultaneously with C# 2.0.

The target domain for COmega is distributed Web applications, which consist of three tiers: the data services tier (relational SQL), the middle tier (C# and objects), and the user interface tier (XML/HTML). Even though the C# and VB languages are “modern,” their support for the data in the first and last tiers is weak. Data access between C# and those tiers is weakly typed and string-based. With stronger support for data access, Gavin says that languages can provide a “better level of abstraction, make invariants and intentions more apparent, give stronger compile-time guarantees, enable different implementations and optimizations, and expose structure.” He does overlook the typed support already available in the form of object-relational mapping and XSD tools.
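To make “weakly typed and string-based” concrete, here is a minimal sketch using an in-memory DataTable as a stand-in for a query result. The class name WeakTypingDemo, the helper BuildCustomers, and the table contents are all hypothetical and chosen only for illustration; no database is involved.

```csharp
using System;
using System.Data;

public static class WeakTypingDemo
{
    // Hypothetical stand-in for the result of a relational query.
    public static DataTable BuildCustomers()
    {
        DataTable table = new DataTable("Customers");
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Age", typeof(int));
        table.Rows.Add("Ada", 36);
        return table;
    }

    public static void Main()
    {
        DataRow row = BuildCustomers().Rows[0];

        // Columns are looked up by string, and every value comes back
        // as object, so the compiler cannot check names or types.
        string name = (string)row["Name"];
        int age = (int)row["Age"];
        Console.WriteLine(name + " is " + age);

        try
        {
            // A misspelled column name compiles fine and fails only at run time.
            object oops = row["Nmae"];
        }
        catch (ArgumentException e)
        {
            Console.WriteLine("Run-time failure: " + e.Message);
        }
    }
}
```

The misspelled column and the casts are exactly the kind of mistakes that language-level data support could, in principle, turn into compile-time errors.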

COmega introduces several notions besides inline XML and SQL.

  • Streams. These are essentially IEnumerable<T> objects. There probably will be a shorthand notation introduced, but it’s unlikely to involve the pointer notation that COmega currently uses. Streams are returned by iterators, the generalized member access mechanism, and the new select statements.
    IEnumerable<string> data = select Customers from Database;
  • Anonymous structs. These are similar to tuples, except that each element within the struct can be named. (Anonymous classes in Java and anonymous methods in C# show that there is a major productivity benefit in not having to name constructs in the language. In C, each struct created required a new variable declaration, whereas temporary structs are created simply in C# using the new operator.)
  • Choices. Choices are conceptually similar to anonymous unions. I am betting the C# team sticks with the “union” keyword rather than “choice”.
  • Content Classes. These are classes that consist of a single anonymous struct. Inline XML elements are created from content classes, though I don’t really see why that should exclusively be the case, as XAML works with regular objects. You can write embedded XML like this:
    book xml = <book><title>Macbeth</title><author>Shakespeare</author></book>;
  • Generalized member access. Streams support simplified XPath-like member access to the methods of their elements.
    • stringStream.ToUpper() returns a new closure representing a stream of uppercase strings.
    • Alternatively, the same code could be written with stringStream.{ return it.ToUpper(); }
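
The stream and generalized member access ideas in the list above can be roughly approximated with C# 2.0 iterators and anonymous methods. This is only a sketch under that assumption, not COmega syntax and not a prediction of what C# 3.0 will look like; the names StreamSketch, Names, and Apply are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class StreamSketch
{
    // An iterator: the compiler turns this method into a lazily
    // evaluated IEnumerable<string>, which plays the role of a stream.
    public static IEnumerable<string> Names()
    {
        yield return "ada";
        yield return "grace";
    }

    // Hypothetical helper standing in for generalized member access:
    // it lazily applies f to each element and yields a new stream.
    public static IEnumerable<R> Apply<T, R>(IEnumerable<T> stream, Converter<T, R> f)
    {
        foreach (T it in stream)
        {
            yield return f(it);
        }
    }

    public static void Main()
    {
        // Roughly what stringStream.{ return it.ToUpper(); } expresses.
        IEnumerable<string> upper = Apply<string, string>(
            Names(),
            delegate(string it) { return it.ToUpper(); });

        foreach (string s in upper)
        {
            Console.WriteLine(s); // prints ADA, then GRACE
        }
    }
}
```

Nothing is computed until the foreach loop pulls elements, which matches the description of stringStream.ToUpper() returning a closure that represents the new stream rather than a finished collection.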

COmega also includes concurrency features in the form of chords from the join calculus, but there have been no indications that these concurrency features will be introduced into the language. Nor has there been any mention of Spec# support.





