

Formatting Floating Point Numbers: Page 2





Presenting the Problem
The inspiration for writing this article comes from a question that was recently posted on the DevX C++ Forum. The poster asked how to write a function that accepts a long double argument and converts it to a string. The resulting string should contain up to two decimal digits in the fraction part. For example, the floating-point value 123.45678 should yield the string "123.45". Seemingly, this is a trivial programming task. However, to make this function truly useful, the design has to be flexible enough to allow the caller to specify a different number of fractional digits. In addition, the same function must handle various special cases gracefully. For example, it should be able to cope with integral values such as 123.0 or 123.

Before we tackle this task, it's important to remember two design maxims that hold true for every state-of-the-art C++ code base:

  • Maxim #1: Whenever you need to format a numeric value, convert the value to a string. This way, you guarantee that each digit occupies exactly one character.
  • Maxim #2: When you need to convert something to a string, use the <sstream> library.
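As a minimal sketch of Maxim #2, the helper below streams a value into an ostringstream and returns the resulting string. The name to_str is hypothetical and chosen only for illustration; it is not part of the function developed in this article.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (name chosen for illustration): converts any
// streamable value to a string using the <sstream> library.
template <typename T>
std::string to_str(const T& value)
{
    std::ostringstream out;
    out << value;      // format the value with the stream's default settings
    return out.str();  // extract the accumulated characters
}
```

For instance, to_str(42) yields "42" and to_str(123.5) yields "123.5"; each digit, and the decimal point, occupies exactly one character of the result.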
The interface of the conversion function is straightforward: the first parameter is the value that needs to be formatted. The second parameter represents the number of decimal digits that should appear after the decimal point. The latter will have a default value. The return value is of type string:

string do_fraction(long double value, int decplaces=3);

Note: The number of decimal places includes the decimal point itself, so a default value of 3 is needed for two fractional digits.

Precision and Order
Naturally, your first step is to convert the long double value to a string. Using the standard C++ <sstream> library, this task is a cinch. However, there is one quirk of which you should be aware: stream objects, stringstreams included, have a default precision of 6. Many programmers mistakenly assume that "precision" refers to the number of digits in the fraction part. This isn't true in the default floating-point notation, where precision is the maximum total number of significant digits. Thus, the number 1234.56 can be represented safely with the default precision of 6. However, the number 12345.67 will be rounded to 12345.7. Worse, a larger number, say 1234567.8, will be converted silently to scientific notation: 1.23457e+06, which is certainly not what you want. To avoid this nuisance, set the stream's precision to the maximum before performing any conversion.
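The behavior described above can be observed with a short sketch. The helper name with_default_precision is hypothetical; it leaves the stream's settings untouched so that the default precision of 6 applies.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats a value with the stream's default
// settings (precision 6, default floating-point notation).
std::string with_default_precision(long double value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}

// with_default_precision(1234.56L)   -> "1234.56"      (six significant digits fit)
// with_default_precision(12345.67L)  -> "12345.7"      (rounded to six digits)
// with_default_precision(1234567.8L) -> "1.23457e+06"  (falls back to scientific notation)
```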

To obtain the maximum number of digits that long double can represent, use the <limits> library:

string do_fraction(long double value, int decplaces=3)
{
  int prec = numeric_limits<long double>::digits10; // 18 with an 80-bit long double; platform-dependent
  ostringstream out;
  out.precision(prec); // override the default
  out << value;
  string str = out.str(); // extract string from stream

The numeric value is now stored in str, waiting to be formatted.
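To see the effect of raising the precision in isolation, here is a sketch that packages this first step as a standalone helper. The name full_precision_string is hypothetical; the article's do_fraction goes on to format str rather than returning it at this point.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: converts a long double to a string using the
// maximum number of decimal digits the type can represent reliably.
std::string full_precision_string(long double value)
{
    std::ostringstream out;
    out.precision(std::numeric_limits<long double>::digits10); // override the default of 6
    out << value;
    return out.str();
}
```

With this precision, full_precision_string(1234567.8L) should produce "1234567.8" rather than scientific notation, and an integral value such as 123.0L produces "123" with no decimal point, which is one of the cases the formatting step must handle. Note the L suffix: a plain double literal carries less precision than long double on many platforms, so its rounding error may surface as extra trailing digits.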
