Conversion of Floating Point Numbers from Binary to Decimal

After introducing floating point numbers and sharing a function to convert a floating point number to its binary representation in the first two posts of this series, I would like to provide a function that converts a binary string to a floating point number. In addition, I will convert between different types of binary representations and discuss their merits.

Conversion from binary to decimal

Conversion from a floating point binary representation to decimal can be performed with several different methods. The first method involves converting the significand and exponent to decimal, multiplying them, and then applying the sign bit. Control statements for special numbers (e.g., NaNs, Infs, and denormals) are also necessary in this method. As I mentioned in the last post, this would be instructive but slow. The second method, demonstrated below, involves the use of typecast. A call to bin2dec converts the binary string into a decimal integer, which is cast to a 64-bit unsigned integer using uint64 and then reinterpreted as a double with typecast. However, as the following example shows, this method does not preserve precision, and the least significant bits are lost. The reason is that bin2dec returns a double, which has only 53 bits of precision and therefore cannot represent every 64-bit integer exactly. In this example, the last 9 bits are lost, but the number of bits lost depends on the number.

>> b = float2bin(0.1)
b = 0011111110111001100110011001100110011001100110011001100110011010
>> f = typecast(uint64(bin2dec(float2bin(0.1))),'double')
f =  0.0999999999999943
>> g = float2bin(f)
g = 0011111110111001100110011001100110011001100110011001100000000000
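
One possible workaround for the precision loss shown above, sketched here rather than taken from the original package, is to convert the two 32-bit halves of the string separately. Each half fits well within the 53 bits that a double can hold exactly, so nothing is dropped before the typecast. The word order depends on the byte order of the machine, which the third output of computer reports; the variable names are chosen for this example only.

```matlab
% Sketch: typecast without precision loss by splitting the string in half.
b = '0011111110111001100110011001100110011001100110011001100110011010'; % 0.1

hi = uint32(bin2dec(b(1:32)));   % upper 32 bits, exactly representable
lo = uint32(bin2dec(b(33:64)));  % lower 32 bits, exactly representable

[~, ~, endian] = computer;       % 'L' = little-endian, 'B' = big-endian
if endian == 'L'
  words = [lo hi];               % least significant word is stored first
else
  words = [hi lo];               % most significant word is stored first
end

f = typecast(words, 'double');   % reinterpret the 8 bytes as one double
disp(num2hex(f))                 % hexadecimal form of the recovered value
```

Comparing num2hex(f) with num2hex(0.1) is a quick way to confirm that every bit survived.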

I recommend a third method, presented below, that performs the steps of float2bin in reverse order.

function f = bin2float(b)
%This function converts a binary string to a floating point number.

if ~ischar(b)
  %Raise an error so that the output f is never left unassigned
  error('bin2float:invalidInput', 'Input must be a character string.');
end

hex = '0123456789abcdef'; %Hex characters

%Reshape into 4x(L/4) character array
bins = reshape(b,4,numel(b)/4).'; 

%Convert to numbers in range of (0-15)
nums = bin2dec(bins); 

%Convert to hex characters
hc = hex(nums + 1); 

%Convert from hex to float
f = hex2num(hc);

This method simply reshapes the binary string into a character array of “0”s and “1”s with rows of length 4, then converts each row to an integer from 0 to 15. The corresponding hex characters (0 – f) are obtained by indexing into the hex array, with 1 added to each integer because indexing is one-based. Finally, hex2num interprets the resulting 16 hex characters as a double. The length of the input string must therefore be a multiple of 4, and 64 characters are needed for a double. The following example shows that this function preserves all of the digits of the binary representation.

>> b = float2bin(0.1)
b = 0011111110111001100110011001100110011001100110011001100110011010
>> f = bin2float(b)
f =  0.100000000000000
>> g = float2bin(f)
g = 0011111110111001100110011001100110011001100110011001100110011010
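
For comparison, the first method described earlier (decoding the sign, exponent, and significand by hand) could be sketched as follows. This is an illustration only, assuming a 64-character IEEE 754 double-precision string; it is not part of the submitted package, and the variable names are made up for the example.

```matlab
% Sketch: decode a 64-bit IEEE 754 double field by field.
b = '0011111110111001100110011001100110011001100110011001100110011010'; % 0.1
bits = (b == '1');                         % logical vector, sign bit first

s = bits(1);                               % sign bit
e = sum(bits(2:12)  .* 2.^(10:-1:0));      % biased exponent (0 to 2047)
m = sum(bits(13:64) .* 2.^(-(1:52)));      % fraction field, in [0, 1)

if e == 2047 && m == 0
  f = Inf;                                 % infinity
elseif e == 2047
  f = NaN;                                 % not a number
elseif e == 0
  f = m * 2^(-1022);                       % zero and denormals: no implicit 1
else
  f = (1 + m) * 2^(e - 1023);              % normalized numbers
end

if s
  f = -f;                                  % apply the sign last
end
```

The branches are the control statements for special numbers mentioned at the start of the post, and they are why this route needs more code than the hex-based bin2float.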

Conversion between binary formats

Depending on the application, it may be preferable to hold the binary representation of a floating point number in a certain data type. For this reason, I will now discuss conversions among binary logical vectors, numerical vectors, and character strings. In the interest of readability, float2bin outputs binary numbers in the form of a character string, and converting that string to a numerical vector is quite simple. If it is necessary to perform calculations with the binary representation of a float, a numerical or logical format is preferable. Additionally, if there are tight constraints on memory usage in a program, it may be necessary to use a format other than a vector of doubles to store your bit vector.
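
To illustrate the kind of calculation that is easier on a logical vector than on a string, the sketch below compares the two bit strings from the typecast example earlier in this post and lists the positions where they differ. It relies on the string-to-logical comparison explained later in this section; the variable names are chosen for this example only.

```matlab
% Sketch: find which bits changed between two 64-character bit strings.
b = '0011111110111001100110011001100110011001100110011001100110011010'; % 0.1
g = '0011111110111001100110011001100110011001100110011001100000000000'; % after the lossy typecast

lb = (b == '1');                 % binary string to logical vector
lg = (g == '1');

changed   = xor(lb, lg);         % true wherever the two strings disagree
positions = find(changed);       % indices counted from the sign bit (position 1)

fprintf('%d bits differ; the first is at position %d of 64\n', ...
        nnz(changed), positions(1));
```

All of the differing positions fall within the final nine bits, which matches the loss seen in that example.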

The commands presented below show how to convert between various types of bit vectors, including character strings, logical vectors, and various numerical vectors. As shown in the whos table, the various data types occupy different amounts of memory. In order of increasing memory consumption, we have the original floating point number (8 bytes); the hexadecimal string (16 bytes); a tie at 64 bytes among the character string, the logical vector, and the int8 vector; the int16 vector (128 bytes); the single vector (256 bytes); and the double vector (512 bytes). Thus, representing a double as one of these bit vectors carries a memory cost of a factor of 8 to 64, and the most efficient way to store floating point data is in its original format, as either a double or single.

>> f = 0.1;
>> h = num2hex(f); %Convert to hex
>> bitstr = float2bin(f);  %Floating-point decimal to binary string
>> logvec_bs = (bitstr == '1');  %Binary string to logical vector
>> numvec_bs = bitstr - '0';  %Binary string to numerical vector
>> logvec_nv = logical(numvec_bs);  %Numerical vector to logical vector
>> doubvec_lv = double(logvec_bs);  %Logical vector to double vector
>> singvec_lv = single(logvec_bs);  %Logical vector to single vector
>> int16vec_lv = int16(logvec_bs);  %Logical vector to int16 vector
>> int8vec_lv = int8(logvec_bs);  %Logical vector to int8 vector
>> bitstr_lv = char(logvec_bs + 48); %Logical vector to binary string

>> whos 
Variables in the current scope:

   Attr Name             Size                     Bytes  Class
   ==== ====             ====                     =====  ===== 
        f                1x1                          8  double
        h                1x16                        16  char
        bitstr           1x64                        64  char
        logvec_bs        1x64                        64  logical
        numvec_bs        1x64                       512  double
        logvec_nv        1x64                        64  logical
        doubvec_lv       1x64                       512  double
        singvec_lv       1x64                       256  single
        int16vec_lv      1x64                       128  int16
        int8vec_lv       1x64                        64  int8

Total is 512 elements using 1664 bytes

Conversions between the various formats require several tricks. Converting from a string of binary characters to a logical vector involves the use of a comparison statement: each “1” in the character string becomes true in the logical vector, and each “0” becomes false. Surprisingly, the logical data type requires one byte per element, even though a single bit would be sufficient for a boolean value. Thus, using a logical vector will not save memory. Converting from a binary string to a numerical vector involves subtracting the character “0”. This takes the difference between the character code of each element of the bit string and that of “0”, which results in a double vector of 0s and 1s. An explicit type cast is not even necessary. Converting back to a binary string from a logical or numerical vector requires an addition of 48 and a cast to the character data type. This method takes advantage of the fact that the ASCII value of “0” is 48. Note that logical data are implicitly cast to the data type of the other operand in an expression. Conversions between numerical and logical vectors are simpler, as they only require casts to those classes, as shown in the examples above.
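
If memory really is the constraint, one option (not covered by the functions in this post, and offered only as a sketch) is to pack the 64 bits into eight uint8 values, which brings the footprint back down to 8 bytes. The example uses only reshape, bin2dec, and dec2bin; the variable names are illustrative.

```matlab
% Sketch: pack a 64-character bit string into 8 bytes and unpack it again.
bitstr = '0011111110111001100110011001100110011001100110011001100110011010'; % 0.1

% Pack: one row of 8 characters per byte, each row converted to 0..255.
packed = uint8(bin2dec(reshape(bitstr, 8, 8).'));    % 8x1 uint8

% Unpack: dec2bin pads each byte to 8 characters; transpose before
% flattening so that the bits come back in their original order.
unpacked = reshape(dec2bin(double(packed), 8).', 1, 64);

disp(isequal(unpacked, bitstr))   % 1 when the round trip is lossless
```

The price is that individual bits are no longer directly indexable, so this layout suits storage better than calculation.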

The functions described in this post and the last post have been submitted to the MATLAB Central File Exchange:

http://www.mathworks.com/matlabcentral/fileexchange/39113-floating-point-number-conversion

Please feel free to download this package and use it. If you have any comments or suggestions, please leave them on the File Exchange website.

Hopefully, the techniques discussed in this post will be useful for you. In the next post, the pitfalls of using floating point numbers in comparison statements and the solutions to these pitfalls will be discussed.
