Bugs versus Limitations

1/17/2006 11:26:34 AM

In my code analysis tool, I capture just about every feature of a source code file into an internal memory representation. The information that I collect will be more useful in future products and internal tools.

However, when it comes to interpreting the code, I ignore much of that same information that a compiler would use when processing code. I don’t analyze code in exactly the same way it will eventually be executed. Some may call that a bug, but I call it a limitation.

Dynamic Typing

The interpreter is mostly dynamically typed, using runtime information to decide which methods to call. Static type information is used in some cases to mark nulls or unknown objects returned by functions or stored in variables. This frees me from the complexity introduced by the CLR’s type system, especially generics. It also makes it easier to analyze individual code files in isolation, and some languages that I may support later may not be statically typed. I also suspect that the brain is dynamically typed, so I may be headed in the right direction.
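To make the idea concrete, here is a minimal sketch (in Python, purely for illustration; the `Value` wrapper and its `call` method are my own invented names, not the tool’s actual implementation) of dispatching a method call on a value’s runtime type rather than any declared static type:

```python
class Value:
    """Wraps a runtime object; method lookup uses the actual runtime type."""
    def __init__(self, obj):
        self.obj = obj

    def call(self, method_name, *args):
        # Resolve the method from the runtime type of the wrapped object,
        # not from whatever static type the source code declared.
        method = getattr(type(self.obj), method_name)
        return method(self.obj, *args)

v = Value("hello")
print(v.call("upper"))      # dispatches to str.upper at runtime
v = Value([1, 2, 3])
print(v.call("count", 2))   # same call site, different runtime type
```

The same call site works for any runtime type that happens to have the method, which is exactly what frees the interpreter from tracking the full static type system.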

One problem is that method resolution can behave differently when there are multiple overloads; but if the overloads are semantically equivalent, this should not be a problem.
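Here is a hypothetical illustration of that divergence (the function names and the dispatch table are invented for this sketch). A static compiler picks an overload from the declared type, while a dynamic interpreter picks from the runtime type; the difference only shows when the overloads behave differently:

```python
# Two "overloads", as a statically typed language might declare:
# Describe(object) and Describe(string).
def describe_object(x):
    return "object"

def describe_string(x):
    return "string"

# Runtime-type resolution: most specific match for the actual type.
OVERLOADS = {str: describe_string}

def describe_dynamic(x):
    return OVERLOADS.get(type(x), describe_object)(x)

# A variable declared `object o = "hi"` would statically resolve to
# describe_object, but dynamic resolution sees the runtime string.
print(describe_dynamic("hi"))  # "string"
print(describe_dynamic(42))    # "object"
```

If both overloads did the same thing, the two resolution strategies would be indistinguishable, which is why semantically equivalent overloads make this limitation harmless.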

Numeric Format

For simplicity, I represent all numbers using double-precision floating-point. Unfortunately, doubles cannot precisely represent all the values of a long integer or decimal data type, but the values that can be represented are the most common and useful ones. About one percent (~1.07%) of long values can be represented precisely in a double; the other 99% are rarely encountered in practice, and even in those cases, the double representation may be accurate enough.
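A quick way to see where the loss begins: every integer up to 2^53 survives a round trip through a double, and beyond that, gaps appear. The helper name below is my own, for illustration:

```python
def survives_double(n):
    # True if the long integer n can be stored in a double without loss.
    return int(float(n)) == n

print(survives_double(2**53))      # True: exactly representable
print(survives_double(2**53 + 1))  # False: rounds to a neighboring double
print(survives_double(2**53 + 2))  # True: spacing between doubles is 2 here
```

Above 2^53, only every second integer lands on a double, and the gaps widen as the magnitude grows, which is why most 64-bit long values are not exactly representable.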

Using decimals has some advantages over doubles, such as fewer rounding errors, more reliable comparison testing, and the ability to represent all long integers; however, decimals are about 400 times slower and permit a smaller range than doubles. For now, I have limited myself to doubles, although I could revisit this issue in the future.
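The rounding-error advantage is easy to demonstrate with a classic example (shown here with Python's `decimal` module as a stand-in for a base-10 decimal type):

```python
from decimal import Decimal

# Binary doubles cannot represent 0.1 or 0.2 exactly, so the sum
# misses 0.3 by a tiny rounding error.
print(0.1 + 0.2 == 0.3)  # False

# A base-10 decimal type represents these values exactly.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```

This is the kind of comparison-testing surprise that a decimal representation avoids, at the cost of the speed and range noted above.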

Parsing

My tool assumes that the code to be processed has already compiled successfully in the target language and doesn’t attempt to replicate all the additional post-parse checks performed by a traditional compiler. As long as the syntax matches the language grammar, my tool accepts the code whether or not it would actually successfully compile.
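The distinction is between passing the grammar and passing the compiler. As an analogy (using Python's own `ast` module, not my tool), the snippet below is syntactically valid and parses cleanly even though its names are undefined and it could never run:

```python
import ast

# Grammatically valid, but references names that don't exist anywhere.
source = "total = undefined_helper(missing_arg) + 1"

tree = ast.parse(source)    # succeeds: only the grammar is checked
print(type(tree).__name__)  # "Module"
```

My tool applies the same standard: if the syntax matches the grammar, the code is accepted, and the post-parse checks a real compiler would perform are left to the compiler.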

Since this is a diagnostic tool that focuses on finding logical errors rather than compiler errors, I think these are acceptable limitations. Similar limitations are probably common in other code-analysis and model-checking tools.

 


About

Net Undocumented is a blog about the internals of .NET including Xamarin implementations. Other topics include managed and web languages (C#, C++, Javascript), computer science theory, software engineering and software entrepreneurship.
