Floating Point
Objectives
We would like to:
Gain a basic understanding of the floating point type.
Discuss floating point operations.
Understand common problems with floating point.
Discuss the Math library.
Notes
I like making long notes, but short videos.
So I will probably turn this into 4-5 videos.
Sources
Again, we will use section 2.5 of the book.
And the Oracle documentation.
Floating Point type.
Reference.
This is quite technical.
Floating point numbers are numbers with
A decimal point (3.14)
Or in scientific notation
Very small: 2.9834 x 10^-12
Very large: 9.856 x 10^34
There are two floating point types in Java:
float, a 32 bit number.
double, a 64 bit number.
Both conform to IEEE 754, standard number 754 of the Institute of Electrical and Electronics Engineers (IEEE).
This is an old technical standard, but works well.
And we will NOT discuss it here.
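A minimal sketch of the difference between the two types. The class and method names (FloatVsDouble, floatLosesOddInteger, doubleKeepsOddInteger) are made up for illustration. It relies on one fact from the standard: a float has 24 significant bits, so 2^24 + 1 = 16777217 does not fit, while a double has room to spare.

```java
// Illustrative sketch: the names in this class are hypothetical, not from the book.
class FloatVsDouble {
    // True when float cannot tell 16777217 (2^24 + 1) apart from 16777216.
    static boolean floatLosesOddInteger() {
        float f = 16777217;          // needs 25 significant bits; float has 24
        return f == 16777216f;
    }

    // True when double stores 16777217 exactly.
    static boolean doubleKeepsOddInteger() {
        double d = 16777217;
        return d == 16777217.0 && d != 16777216.0;
    }

    public static void main(String[] args) {
        System.out.println("float bits:  " + Float.SIZE);   // 32
        System.out.println("double bits: " + Double.SIZE);  // 64
        System.out.println(floatLosesOddInteger());         // true
        System.out.println(doubleKeepsOddInteger());        // true
    }
}
```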
Literals
All floating point literals are double unless they end with an f (or F).
0.1f, 1.0e-1f or 1.0E-1f are all acceptable.
So is 3e23f.
3f is acceptable as well.
I can place a d at the end to form a double.
There are more rules for literals, but we will skip them.
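The literal forms above can be sketched as follows. LiteralDemo and tenthLiterals are names made up for this example. The commented-out line shows what happens when the f is forgotten.

```java
// Illustrative sketch: class and method names are hypothetical.
class LiteralDemo {
    // Three spellings of the same float literal.
    static float[] tenthLiterals() {
        return new float[] { 0.1f, 1.0e-1f, 1.0E-1f };
    }

    public static void main(String[] args) {
        double a = 0.1;      // no suffix, so this is a double
        float  b = 3e23f;    // scientific notation with the f suffix
        float  c = 3f;       // no decimal point needed when the f is present
        double d = 3d;       // the d suffix forms a double
        // float bad = 0.1;  // does not compile: 0.1 is a double literal

        float[] t = tenthLiterals();
        System.out.println(t[0] == t[1] && t[1] == t[2]);  // true
        System.out.println(d == 3.0);                      // true
        System.out.println(c == 3.0f);                     // true
    }
}
```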
Printing
Remember, %3.2f prints two digits after the decimal point.
And the value will be rounded.
%E and %e are used for scientific notation.
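A short sketch of these format specifiers. PrintDemo and its helper methods are made-up names. Locale.ROOT is passed only so the decimal separator is always a dot, whatever the machine's language settings are.

```java
import java.util.Locale;

// Illustrative sketch: class and method names are hypothetical.
class PrintDemo {
    // Fixed notation, two digits after the decimal point.
    static String fixed(double x) {
        return String.format(Locale.ROOT, "%3.2f", x);
    }

    // Scientific notation with a lowercase e.
    static String sci(double x) {
        return String.format(Locale.ROOT, "%e", x);
    }

    // Scientific notation with an uppercase E.
    static String sciUpper(double x) {
        return String.format(Locale.ROOT, "%E", x);
    }

    public static void main(String[] args) {
        System.out.println(fixed(3.14159));      // 3.14
        System.out.println(fixed(2.718));        // 2.72 (rounded)
        System.out.println(sci(12345.678));      // 1.234568e+04
        System.out.println(sciUpper(12345.678)); // 1.234568E+04
    }
}
```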
Floating point operations.
+,-,*,/ as you would expect.
Don't divide by 0.
Produces Infinity.
% exists but I would not use it.
++ and -- as with integers.
+=, -=, ...
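The operators above can be sketched like this. FloatOps and divide are names made up for the example, and the values are chosen so every result is exact. Note that only floating point division by zero gives Infinity; the integer version throws an exception instead.

```java
// Illustrative sketch: class and method names are hypothetical.
class FloatOps {
    static double divide(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        double x = 7.5;
        System.out.println(x + 2.0);   // 9.5
        System.out.println(x - 2.0);   // 5.5
        System.out.println(x * 2.0);   // 15.0
        System.out.println(x / 2.0);   // 3.75
        System.out.println(x % 2.0);   // 1.5 (remainder)

        x++;                           // now 8.5
        x += 0.5;                      // now 9.0
        System.out.println(x);         // 9.0

        System.out.println(divide(1.0, 0.0));   // Infinity
        System.out.println(divide(-1.0, 0.0));  // -Infinity
        System.out.println(divide(0.0, 0.0));   // NaN

        int zero = 0;
        try {
            System.out.println(1 / zero);        // integer division by zero
        } catch (ArithmeticException e) {
            System.out.println("integer division by zero throws");
        }
    }
}
```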
The Math Library
There is a mathematics library for Java.
Reference
Some constant definitions (Math.PI, Math.E)
MANY functions.
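A small sample of the library, as a sketch. MathDemo, circleArea and hypotenuse are made-up names; the Math calls themselves (PI, E, sqrt, pow, abs, floor, ceil) are standard.

```java
// Illustrative sketch: class and helper names are hypothetical.
class MathDemo {
    // Area of a circle with radius r.
    static double circleArea(double r) {
        return Math.PI * r * r;
    }

    // Length of the hypotenuse of a right triangle with legs a and b.
    static double hypotenuse(double a, double b) {
        return Math.sqrt(a * a + b * b);
    }

    public static void main(String[] args) {
        System.out.println(Math.PI);
        System.out.println(Math.E);
        System.out.println(circleArea(2.0));
        System.out.println(hypotenuse(3.0, 4.0));   // 5.0
        System.out.println(Math.pow(2.0, 10.0));    // 1024.0
        System.out.println(Math.abs(-2.5));         // 2.5
        System.out.println(Math.floor(2.7));        // 2.0
        System.out.println(Math.ceil(2.1));         // 3.0
    }
}
```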
Problems:
We have seen that 1.0/0.0 produces Infinity.
There is also negative infinity.
And 0.0f/0.0f produces NaN (not a number).
We can't represent all numbers between any two floats.
Look at Math.pow(2.0, 63.0);
double y;
y = Math.pow(2.0, 63.0);
System.out.printf("%.0f\n", y);
Real: 9223372036854775808
Comp: 9223372036854776000
Why?
This is the scoreboard problem all over again.
We just ran out of digits, but this time we "put the error on the right."
This is a precision error: the difference between a computed approximation and the exact result.
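A sketch that pokes at the example above. PrecisionDemo and its helpers are made-up names. Math.ulp gives the gap between a double and the next larger one; near 2^63 that gap is 2048, so adding 1 changes nothing. BigDecimal is used here only to show every digit of the stored value: 2^63 itself happens to be stored exactly, and the trailing zeros come from printf showing only as many digits as are needed to identify the double.

```java
import java.math.BigDecimal;
import java.util.Locale;

// Illustrative sketch: class and method names are hypothetical.
class PrecisionDemo {
    // What printf shows for y with no digits after the decimal point.
    static String printed(double y) {
        return String.format(Locale.ROOT, "%.0f", y);
    }

    // Every decimal digit of the value actually stored in y.
    static String stored(double y) {
        return new BigDecimal(y).toPlainString();
    }

    // Distance from y to the next larger double.
    static double gap(double y) {
        return Math.ulp(y);
    }

    public static void main(String[] args) {
        double y = Math.pow(2.0, 63.0);
        System.out.println(printed(y));     // 9223372036854776000
        System.out.println(stored(y));      // 9223372036854775808
        System.out.println(gap(y));         // 2048.0
        System.out.println(y + 1.0 == y);   // true: the 1 is lost
        System.out.println(0.1 + 0.2 == 0.3); // false: neither side is exact
    }
}
```

The same thing happens at every scale: the larger the number, the wider the gap between neighbouring doubles.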
In conclusion:
Choose double.
It reduces the chance and impact of overflow and precision errors.
Floats might be slightly faster than doubles.
But you are working in Java, so use a double.
Always be careful, however, especially when dealing with very precise quantities.
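A last sketch of why double is the safer default. ChooseDouble, scaleFloat and scaleDouble are made-up names. The same multiplication overflows to Infinity as a float but stays finite as a double, and 0.1 stored as a float is a coarser approximation than 0.1 stored as a double.

```java
// Illustrative sketch: class and method names are hypothetical.
class ChooseDouble {
    static float scaleFloat(float x) {
        return x * 10f;
    }

    static double scaleDouble(double x) {
        return x * 10.0;
    }

    public static void main(String[] args) {
        // Overflow: the largest float is about 3.4e38.
        System.out.println(Float.isInfinite(scaleFloat(1e38f)));   // true
        System.out.println(Double.isInfinite(scaleDouble(1e38)));  // false

        // Precision: the float nearest 0.1 is not the double nearest 0.1.
        double fromFloat = 0.1f;
        System.out.println(fromFloat == 0.1);                      // false
    }
}
```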