Floating-Point Numbers

8/14/2006 5:32:57 AM

Working with floating-point numbers can be a pain. I have written several posts on floating-point arithmetic including Numbers in .NET, Floating Point Arithmetic I and II.

There is a good article, “What Every Computer Scientist Should Know About Floating-Point Arithmetic.” There’s also another great article that describes what I believe is the best overall algorithm for quickly and accurately comparing two floating-point numbers. The algorithm reinterprets the floating-point numbers as integral values, after some adjustment for negative values, and compares those integers.
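The integer-reinterpretation idea can be sketched roughly as follows. This is a minimal illustration in Python rather than C#, not the article's own code; the function names and the default tolerance of 4 units in the last place are assumptions made for the example.

```python
import math
import struct


def ordered_int(x: float) -> int:
    """Map a double to an integer whose ordering matches the float ordering."""
    # Reinterpret the 8 bytes of the double as a signed 64-bit integer.
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative doubles are stored as sign plus magnitude, so their raw bit
    # patterns sort backwards. Negating the magnitude fixes the ordering
    # and also maps -0.0 and +0.0 to the same integer.
    if bits < 0:
        bits = -(bits & 0x7FFFFFFFFFFFFFFF)
    return bits


def almost_equal(a: float, b: float, max_ulps: int = 4) -> bool:
    """Treat a and b as equal if they are at most max_ulps representable
    doubles apart. max_ulps=4 is an arbitrary choice for this sketch."""
    if math.isnan(a) or math.isnan(b):
        return False  # NaN never compares equal to anything
    return abs(ordered_int(a) - ordered_int(b)) <= max_ulps


if __name__ == "__main__":
    print(0.1 + 0.2 == 0.3)              # exact comparison
    print(almost_equal(0.1 + 0.2, 0.3))  # tolerance-based comparison
```

The difference between the two mapped integers counts how many representable doubles lie between the two values, which is why a small integer tolerance works across magnitudes. Values very close to zero and infinities may still need separate handling.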

The decimal data type introduced in the .NET Framework is meant to alleviate the problems of binary-based representations. A decimal literal is written with an “m” suffix (e.g., 3.0m). The decimal data type can represent fractional decimal values exactly up to a certain level of precision, whereas the standard double nearly always represents such values imprecisely. For example, 0.5 can be represented exactly as a double, but none of the other values of the form 0.x, where x is a single digit other than 5 or 0, can be. However, division remains problematic for the decimal data type because of its finite precision. For instance, the value (1m / 3m) * 3m will not compare equal to 1m. In fact, that value prints as “0.9999999999…” with 28 nines, the maximum precision of a decimal; in the CLR, .999… doesn’t equal one.
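The same behavior can be reproduced outside .NET. The sketch below uses Python's decimal module as a stand-in for the CLR decimal type, with the context precision set to 28 significant digits to mirror the figure above; it is an analogue for illustration, not .NET code.

```python
from decimal import Decimal, localcontext


def one_third_times_three() -> Decimal:
    """Compute (1 / 3) * 3 with 28 significant decimal digits."""
    with localcontext() as ctx:
        ctx.prec = 28  # assumed stand-in for the precision of a .NET decimal
        third = Decimal(1) / Decimal(3)  # rounded to 28 threes
        return third * 3                 # 28 nines, not 1


product = one_third_times_three()
print(product)
print(product == Decimal(1))

# Converting a double to Decimal exposes the value the double really stores.
print(Decimal(0.5) == Decimal("0.5"))  # 0.5 is exact in binary
print(Decimal(0.1) == Decimal("0.1"))  # 0.1 is not
```

The division is where the information is lost: once 1/3 has been rounded to 28 digits, no later multiplication can recover the missing tail.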

These problems won’t go away with either the BigInteger or BigDecimal types that are rumored to be introduced (or, more accurately, ported from J#) into a future version of the .NET Base Class Library; representing a value like 1/3 exactly in decimal digits would require infinite precision. One of these days, someone will invent a BigRational, which is a ratio of two BigIntegers, or an extended BigDecimal, which can encode repeating decimals; we would then only have to worry about irrational numbers like square roots and pi.
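To illustrate what a ratio of two arbitrary-precision integers buys, here is a small sketch using Python's fractions.Fraction, which behaves like the BigRational imagined above. It is an analogue only; no such .NET type is assumed to exist.

```python
import math
from fractions import Fraction

# A rational keeps 1/3 as the pair (1, 3), so nothing is rounded away.
third = Fraction(1, 3)
round_trip = third * 3
print(round_trip == 1)

# Sums of decimal fractions are exact as well.
tenths = Fraction(1, 10) + Fraction(2, 10)
print(tenths == Fraction(3, 10))

# Irrational values remain out of reach: the square root has to be
# computed as a double first, so squaring it does not give back 2.
root = math.sqrt(2)
print(root * root == 2.0)
```

The cost is that numerators and denominators can grow without bound over a long chain of operations, which is the performance concern with arbitrarily long numbers.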

Currently, I am adding a new numeric data type, which I call Number, to resolve some issues I have with my NStatic tool. Previously, I relied on doubles and used the comparison method mentioned above. However, I decided that I wanted to represent long integers exactly (in addition to double values) and also to be able to represent most fractional values obtained from division exactly. So, my new Number type is a 16-byte data type that basically combines doubles, long integers, and rational numbers (up to a certain limit). I try to represent numbers as long integers if possible, then transition to rational numbers, and fall back on doubles. I’m not quite ready for arbitrarily long numbers, which are much slower and whose only application seems to be cryptography. The Number type is a compromise that balances performance and exactness. It still won’t be able to represent the full set of values of a decimal, but programs that use the decimal data type are relatively rare and, even then, almost never need more than 64 bits of precision.
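The fallback order described above (long integer, then rational, then double) might look something like the following. This is a hypothetical Python sketch, not the actual Number type: the function names are invented, and the assumption that numerator and denominator are each limited to the signed 64-bit range is mine, since the real limits are not given here.

```python
import operator
from fractions import Fraction

# Assumed limits for this sketch: a signed 64-bit range.
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def fits_int64(n: int) -> bool:
    return INT64_MIN <= n <= INT64_MAX


def normalize(frac: Fraction):
    """Pick the narrowest representation: int, then Fraction, then float."""
    if frac.denominator == 1 and fits_int64(frac.numerator):
        return frac.numerator  # exact long integer
    if fits_int64(frac.numerator) and fits_int64(frac.denominator):
        return frac            # exact rational
    return float(frac)         # inexact fallback


def _combine(a, b, op):
    # Once either operand is a double, exactness is already lost.
    if isinstance(a, float) or isinstance(b, float):
        return op(float(a), float(b))
    return normalize(op(Fraction(a), Fraction(b)))


def divide(a, b):
    return _combine(a, b, operator.truediv)


def multiply(a, b):
    return _combine(a, b, operator.mul)


if __name__ == "__main__":
    third = divide(1, 3)
    print(type(third).__name__, third)
    print(multiply(third, 3) == 1)
    print(type(multiply(2**62, 4)).__name__)
```

Under these assumptions, (1 / 3) * 3 comes back as the exact integer 1, while a product that overflows the 64-bit range quietly degrades to a double.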






