Static Analysis

11/20/2005 10:59:37 AM

I mentioned earlier in my post “Get Your Butt Outta Bed and Build Something” that I would be shipping subproducts before my main product.

My main product (which shall remain secret) is a general-purpose desktop application and may attract an audience that requires more hand-holding and is less tolerant of the inevitable bugs in a first release. Quality fears due to the size of the product are currently driving my decision to ship different portions of my codebase earlier, as smaller subproducts.

This is my current plan:

In my first subproduct, a static analysis tool, I will focus on testing my AI backend. Subsequent subproducts will focus on my other technology, such as natural language processing (e.g., a kickass grammar checker) and document editing (e.g., graphical code editors). These subproducts will each have short release cycles; I will release a free lite version and a commercial pro version of each. In contrast to my main product, the subproducts will be targeted at developers, experienced computer users and software businesses, who require less hand-holding and of whom I am also representative. After I have tested and released the various components of my technology, I will incorporate the feedback into my main product.

As I mentioned earlier, I am working on a static analysis tool for .NET languages comparable to Spec#, Microsoft’s PREfix/PREfast, Java’s FindBugs, and PC-lint. I tentatively call it NStatic. I’m looking at a beta release just before Christmas (one month from now).

I could also add support for dynamic languages like Ruby and Boo, depending on how inexpensive such extensions are to add and how valuable such support would be to my own build process, which does incorporate dynamic languages. Building a parser for a new language is trivial for me; it’s the additional language services, such as a complicated generic type system and method disambiguation, that are time killers. On the other hand, adding more language support is a distraction from my main product.

Here’s how static analysis relates to my main product. Programming languages are much simpler to parse, analyze and work with than natural language, and they provide a natural intermediate step toward the real thing. Programs in different languages are parsed into a common universal expression language, which also makes it trivial to convert code from one language to another, a la CodeDom. I also capture syntactical elements such as comments and preprocessing instructions. This is the same representation that will be used for my natural language expressions.
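
To make the idea of a common expression language concrete, here is a minimal Python sketch. It is not the representation used in my codebase; the `Expr` class, the head names ("block", "assign", "add", "comment") and the two printers are all hypothetical, chosen only to show why translation becomes easy once every language lands in the same tree and why comments can ride along as ordinary nodes.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Expr:
    """One node of a language-neutral expression tree (hypothetical design)."""
    head: str                       # e.g. "block", "assign", "add", "comment"
    args: Tuple["Node", ...] = ()


Node = Union[Expr, str, int]        # leaves are identifiers or integer literals


def render(node: Node, comment_prefix: str, terminator: str) -> str:
    """Print a tree using the given comment marker and statement terminator."""
    if not isinstance(node, Expr):
        return str(node)
    parts = [render(arg, comment_prefix, terminator) for arg in node.args]
    if node.head == "block":
        return "\n".join(parts)
    if node.head == "assign":
        return f"{parts[0]} = {parts[1]}{terminator}"
    if node.head == "add":
        return f"({parts[0]} + {parts[1]})"
    if node.head == "comment":      # comments are kept as first-class nodes
        return f"{comment_prefix} {parts[0]}"
    raise ValueError(f"unknown head: {node.head}")


def to_csharp(node: Node) -> str:
    return render(node, "//", ";")


def to_ruby(node: Node) -> str:
    return render(node, "#", "")


tree = Expr("block", (
    Expr("comment", ("running total",)),
    Expr("assign", ("x", Expr("add", ("y", 1)))),
))
```

Calling `to_csharp(tree)` and `to_ruby(tree)` prints the same program in two surface syntaxes. A real front end would of course need a parser per language to build the tree in the first place; only the back half of the round trip is shown here.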

The tool tests my AI backend, and the time investment in building it is low on top of my existing codebase. I also have prior experience from writing CStatic, a much simpler static analysis tool for C++/C based on symbolic pattern matching, several years ago. I submitted that command-line tool to the Larkware Contest after a simple recompile under Visual Studio 2005, to see if I could win a prize by default. Unfortunately, no such luck; I think the judges had a bias for graphical interfaces. (Larkware is a good example of a tastefully done commercial blog.)

My upcoming NStatic tool is different from most other static analysis tools in several ways:

  • Static execution through multiple paths in a flowgraph, locating infinite loops, dead code, condition violations, and exception-only code paths based on the dynamic types and values of variables.
    • This is a time-consuming step, since large functions generate an exponentially increasing number of code paths.
    • There’s a blurring of compile and runtime, because, in some cases, code may actually be run, and many Framework functions are recognized intrinsically.
    • Spec# appears to transform procedural code to a functional representation, and this would be a much better long-term approach. However, since static analysis is not my main focus, it’s not a strategy I will invest in.
  • Symbolic computation. Symbols and functions, not just numeric values, can be manipulated algebraically.
  • Interprocedural analysis. Theorems are extrapolated from function bodies, so errors straddling function boundaries are caught. API calls in the .NET Framework are pre-analyzed and parameters are validated statically.
  • High-level, declarative rule language for specifying both syntactical and semantic constraints and providing some support for specifications as in Spec#. The language is designed to closely match human intent.

These are typically hard to do but, with the right backend, can be straightforward to implement. Some features could be dropped before beta.
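
The first item in the list, static execution through multiple paths, can be sketched in a few lines of Python. This is an illustration under heavy simplifying assumptions, not how NStatic works: the tuple-based mini-language, the `evaluate` and `explore` functions and the "mark" labels are all invented for the example. Known values are propagated along each path; a branch whose condition can be decided is followed one way only, and a branch that depends on an unknown value forks the path.

```python
import operator

# Statements: ("assign", var, expr), ("if", cond, then_block, else_block),
#             ("mark", label)  -- a label used to record that a spot was reached.
# Expressions: an int, a variable name, or a tuple (op, left, right).
OPS = {"+": operator.add, "-": operator.sub,
       "<": operator.lt, "==": operator.eq}


def evaluate(expr, env):
    """Return the value of expr, or None when it cannot be known statically."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env.get(expr)          # unknown inputs evaluate to None
    op, left, right = expr
    a, b = evaluate(left, env), evaluate(right, env)
    if a is None or b is None:
        return None
    return OPS[op](a, b)


def explore(block, env, reached):
    """Follow every feasible path through block; return how many there are."""
    if not block:
        return 1
    stmt, rest = block[0], block[1:]
    kind = stmt[0]
    if kind == "mark":
        reached.add(stmt[1])
        return explore(rest, env, reached)
    if kind == "assign":
        _, var, expr = stmt
        new_env = dict(env)
        new_env[var] = evaluate(expr, env)
        return explore(rest, new_env, reached)
    if kind == "if":
        _, cond, then_block, else_block = stmt
        value = evaluate(cond, env)
        paths = 0
        if value is None or value:        # branch may be taken
            paths += explore(then_block + rest, env, reached)
        if value is None or not value:    # branch may be skipped
            paths += explore(else_block + rest, env, reached)
        return paths
    raise ValueError(f"unknown statement: {kind}")


program = [
    ("assign", "x", 5),
    ("if", ("<", "n", 0), [("mark", "negative")], [("mark", "non_negative")]),
    ("if", ("<", "x", 3), [("mark", "unreachable")], [("mark", "always")]),
]
labels = {"negative", "non_negative", "unreachable", "always"}
reached = set()
path_count = explore(program, {}, reached)
dead = labels - reached
```

Here `n` is an unknown input, so the first `if` forks, while `x < 3` is decidable and its then-branch ends up in `dead`. Each undecidable branch doubles the number of paths, which is the exponential growth mentioned in the list.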

There aren’t any good static analysis tools for .NET, and what I am offering should be smarter than many commercial implementations. I haven’t checked out Team System, but the free version of FxCop appears to be mostly a style checker. Given that static analysis tools are already commercially successful, I should be able to command a good price.

There are many options for delivering a free lite and a paid pro version of the tool. For instance, my tool can produce good results in seconds, but, if it is allowed to run continuously for days or weeks, it would explore more of the abstract state space of the program and uncover more difficult-to-find bugs.
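
One way to picture this trade-off is a search that stops after a fixed budget of states. The Python sketch below is only an illustration: the `search` function, the integer state space and the choice of where the bug sits are all made up, and splitting lite from pro by budget is just one of the options, not a decision.

```python
from collections import deque


def search(start, successors, is_bug, budget):
    """Breadth-first exploration that gives up after visiting `budget` states."""
    seen = {start}
    queue = deque([start])
    found = []
    visited = 0
    while queue and visited < budget:
        state = queue.popleft()
        visited += 1
        if is_bug(state):
            found.append(state)
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return found, visited


def successors(n):
    """Toy state space: a binary tree of integers, cut off at 1000."""
    return [2 * n + 1, 2 * n + 2] if n < 1000 else []


def is_bug(n):
    """Pretend a defect only shows up in one deep state."""
    return n == 500


quick_bugs, quick_visited = search(0, successors, is_bug, budget=100)
long_bugs, long_visited = search(0, successors, is_bug, budget=1000)
```

The quick run visits 100 states and reports nothing; the longer run reaches state 500 and reports it. The same code serves both cases, and only the budget changes.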
