floating point numbers

Upload: nero

Post on 22-Feb-2016

TRANSCRIPT

Page 1: Floating point numbers

Floating point numbers

Page 2: Floating point numbers

Computable reals

“computable numbers may be described briefly as the real numbers whose expressions as a decimal are calculable by finite means.”
(A. M. Turing, On Computable Numbers, with an Application to the Entscheidungsproblem, Proc. London Mathematical Soc., Ser. 2, Vol. 42, pages 230-265, 1936-7.)

Page 3: Floating point numbers

Look first at decimal reals

A real number may be approximated by a decimal expansion with a determinate decimal point.

As more digits are added to the decimal expansion, the precision rises.

Any effective calculation is always finite; if it were not, the calculation would go on for ever.

There is thus a limit to the precision with which the reals can be represented.

Page 4: Floating point numbers

Transcendental numbers

In principle, irrational numbers, whether transcendental such as Pi or algebraic such as root 2, have no finite decimal representation.

We are always dealing with approximations to them.

We can still treat Pi as a real rather than a rational because there is always an algorithmic step by which we can add another digit to its expansion.
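The idea of an algorithmic step that adds further digits can be sketched for root 2, which is simpler to compute than Pi. This is only an illustration: the function name sqrt2_digits is hypothetical, and it relies on Python's exact integer square root rather than on any method named in the slides.

```python
from math import isqrt

def sqrt2_digits(n: int) -> str:
    """Return root 2 truncated (not rounded) to n >= 1 decimal places."""
    # floor(sqrt(2) * 10**n) equals isqrt(2 * 10**(2n)); all arithmetic is exact.
    scaled = isqrt(2 * 10 ** (2 * n))
    text = str(scaled)
    return text[0] + "." + text[1:]

# Asking for more digits only ever extends the earlier expansion.
for places in (5, 10, 20):
    print(sqrt2_digits(places))
```

Each call is a finite calculation, yet there is no largest n for which it works, which is the sense in which the number is computable.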

Page 5: Floating point numbers

First solution

Store the numbers in memory just as they are printed, as a string of characters.

249.75 would be stored as 6 bytes, as shown below.

32 34 39 2E 37 35

Note that the decimal digits 0 to 9 have the ASCII codes 30H to 39H. Here 2E is the full stop character, and 32, 34, 39, 37 and 35 are the characters for 2, 4, 9, 7 and 5.
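The byte layout can be checked with a short Python sketch. The helper name ascii_hex is hypothetical; it simply prints the ASCII codes of a number written as text.

```python
def ascii_hex(text: str) -> str:
    """Return the ASCII codes of text as space-separated hexadecimal bytes."""
    return text.encode("ascii").hex(" ").upper()

print(ascii_hex("249.75"))      # 32 34 39 2E 37 35
print(len("249.75".encode("ascii")), "bytes")
```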

Page 6: Floating point numbers

Implications

The number strings can be of variable length. This allows arbitrary precision.

Variable-length representations of this kind are used in systems like Mathematica, which require very high accuracy.

Page 7: Floating point numbers

Example with Mathematica

In[1]:=5!
Out[1]=120
In[2]:=10!
Out[2]=3628800
In[3]:=50!
Out[3]=30414093201713378043612608166064768844377641568960512000000000000

Page 8: Floating point numbers

Decimal byte arithmetic

“9” + “8” = “17” decimal
39H + 38H = 71H (hexadecimal ASCII codes)
57 + 56 = 113 (decimal ASCII codes)
Adjust by taking 30H = 48 away, giving 41H = 65.
If the result is greater than “9” = 39H = 57, take away 10 = 0AH and carry 1.
Thus 41H - 0AH = 65 - 10 = 55 = 37H, so the answer would be 31H, 37H = “17”.
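The adjustment rule above extends to whole digit strings by working from the rightmost byte and propagating the carry. The following is a minimal sketch under the assumption that both operands are non-negative integers with no decimal point; add_ascii_decimal is a hypothetical name.

```python
def add_ascii_decimal(a: bytes, b: bytes) -> bytes:
    """Add two non-negative integers held as ASCII digit strings."""
    width = max(len(a), len(b))
    a = a.rjust(width, b"0")          # pad the shorter number with "0" = 30H
    b = b.rjust(width, b"0")
    result = bytearray()
    carry = 0
    for x, y in zip(reversed(a), reversed(b)):
        total = x + y + carry - 0x30  # e.g. 39H + 38H - 30H = 41H
        if total > 0x39:              # greater than "9": take away 0AH, carry 1
            total -= 0x0A
            carry = 1
        else:
            carry = 0
        result.append(total)
    if carry:
        result.append(0x31)           # final carry becomes a leading "1"
    result.reverse()
    return bytes(result)

print(add_ascii_decimal(b"9", b"8"))  # b'17'
```

Because the strings can be any length, the same loop handles numbers far larger than a machine word, at the cost of one byte operation per digit.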

Page 9: Floating point numbers

Representing variables

Variables are represented as pointers to character strings in this system.

A=249.75

A → 32 34 39 2E 37 35

Page 10: Floating point numbers

Advantages
Arbitrarily precise
Needs no special hardware

Disadvantages
Slow
Needs complex memory management

Page 11: Floating point numbers

Binary Coded Decimal (BCD) or calculator style floating point

Note that 249.75 can be represented as 2.4975 x 10^2.

Store this two digits to a byte, to fixed precision, as follows:

24 97 50 | 02
mantissa | exponent

32 bits overall. Each digit uses 4 bits.
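The packing can be sketched as follows. This is an illustrative sketch only: the name to_bcd is hypothetical, and it assumes a positive finite number, a six-digit mantissa that is truncated rather than rounded, and an exponent between 0 and 99, since the slide does not say how signs or negative exponents are stored.

```python
from decimal import Decimal

def to_bcd(text: str, mantissa_digits: int = 6) -> bytes:
    """Pack a positive decimal number as BCD mantissa bytes plus one exponent byte."""
    sign, digits, exp = Decimal(text).as_tuple()
    if sign or not any(digits):
        raise ValueError("sketch handles positive numbers only")
    exponent = len(digits) + exp - 1          # power of ten after normalising
    if not 0 <= exponent <= 99:
        raise ValueError("sketch handles exponents 0..99 only")
    # Pad with zeros (or truncate) to the fixed mantissa length.
    padded = (list(digits) + [0] * mantissa_digits)[:mantissa_digits]
    packed = bytearray()
    for i in range(0, mantissa_digits, 2):
        packed.append(padded[i] << 4 | padded[i + 1])   # two digits per byte
    packed.append((exponent // 10) << 4 | exponent % 10)
    return bytes(packed)

print(to_bcd("249.75").hex(" "))  # 24 97 50 02
```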

Page 12: Floating point numbers

Normalise

Convert N (assumed positive) to a format with one digit in front of the decimal point as follows:
1. If N>=10 then whilst N>=10 divide by 10 and add 1 to the exponent
2. Else whilst N<1 multiply by 10 and decrement the exponent
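The two loops can be sketched directly in Python. The function name normalise is hypothetical, and Decimal is used here only so that the divisions by 10 stay exact. The test is N >= 10, so that N = 10 itself becomes 1 x 10^1, and zero is rejected because the second loop would never terminate on it.

```python
from decimal import Decimal

def normalise(n: Decimal):
    """Return (mantissa, exponent) with 1 <= mantissa < 10 for positive n."""
    if n <= 0:
        raise ValueError("sketch handles positive numbers only")
    exponent = 0
    while n >= 10:       # too large: divide by 10, add 1 to the exponent
        n /= 10
        exponent += 1
    while n < 1:         # too small: multiply by 10, decrement the exponent
        n *= 10
        exponent -= 1
    return n, exponent

print(normalise(Decimal("249.75")))
```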

Page 13: Floating point numbers

Add floating point

1. Denormalise the smaller number so that the exponents are equal
2. Perform the addition
3. Renormalise

Eg 949.75 + 52.0 = 1001.75

 9.49750 E02 →  9.49750 E02
 5.20000 E01 →  0.52000 E02 +
               10.01750 E02 → 1.00175 E03
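The three steps can be sketched for a six-digit decimal format. This sketch assumes that a number is a pair (mantissa, exponent), with the mantissa held as a six-digit integer so that 9.49750 E02 is (949750, 2). The name fp_add and this layout are illustrative assumptions, and only positive normalised operands are handled.

```python
DIGITS = 6
LIMIT = 10 ** DIGITS        # a normalised mantissa is below this value

def fp_add(a, b):
    """Add two positive normalised (mantissa, exponent) pairs."""
    (ma, ea), (mb, eb) = a, b
    if ea < eb:                          # make a the operand with the larger exponent
        (ma, ea), (mb, eb) = (mb, eb), (ma, ea)
    mb //= 10 ** (ea - eb)               # 1. denormalise: low digits fall off the end
    total = ma + mb                      # 2. perform the addition
    exponent = ea
    if total >= LIMIT:                   # 3. renormalise after a carry out
        total //= 10
        exponent += 1
    return total, exponent

print(fp_add((949750, 2), (520000, 1)))  # (100175, 3), i.e. 1.00175 E03
```

The integer division in step 1 is where digits are discarded, so an operand that is small enough relative to the other contributes nothing to the sum.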

Page 14: Floating point numbers

Note loss of accuracy

Compare Octave, which uses floating point numbers, with Mathematica, which uses full precision arithmetic.

Octave prints only 5 significant figures by default; the double precision floating point numbers behind the display carry about 15 to 16 significant decimal digits.

Octave
fact(5)
ans = 120
fact(10)
ans = 3628800
fact(50)
ans = 3.0414e+64

Mathematica
In[1]:=5!
Out[1]=120
In[2]:=10!
Out[2]=3628800
In[3]:=50!
Out[3]=30414093201713378043612608166064768844377641568960512000000000000
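The same comparison can be reproduced in Python, whose integers are exact and whose float type is an IEEE double. This is a sketch of the effect, not of how Octave or Mathematica work internally.

```python
from math import factorial

exact = factorial(50)        # exact arbitrary-precision integer
approx = float(exact)        # nearest IEEE double precision value

print(exact)
print(f"{approx:.4e}")       # five significant figures: 3.0414e+64
print(int(approx) == exact)  # False: the double cannot hold all 65 digits
```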

Page 15: Floating point numbers

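Loss of precision continued

When there is a big difference in magnitude between the numbers, the smaller one is lost in the floating point addition if it falls below the precision of the larger.

Octave (result shown to 5 significant figures)
325000000 + 108
ans = 3.2500D+08

Mathematica
In[1]:=325000000 + 108
Out[1]=325000108
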
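How much is lost depends on the precision of the format. The sketch below rounds the sum to IEEE single precision with Python's struct module and compares it with double precision; to_float32 is a hypothetical helper name.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (IEEE double) to the nearest IEEE single value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

double_sum = 325000000.0 + 108.0
single_sum = to_float32(double_sum)

print(double_sum == 325000108.0)   # True: a double holds all nine digits
print(single_sum == 325000108.0)   # False: a single has only 24 significant bits
print(1e16 + 1.0 == 1e16)          # True: even a double loses a small enough addend
```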

Page 16: Floating point numbers

IEEE floating point numbers

Institute of Electrical and Electronics Engineers

Page 17: Floating point numbers

Single Precision

S (1 bit) | E (8 bits) | F (23 bits)

Page 18: Floating point numbers

Definition

N = (-1)^s x 1.F x 2^(E-127)

Example 3.25

In fixed point binary = 11.01 = 1.101 x 2^1

The leading 1 in front of the binary point is not stored: delete this bit.

In IEEE format this is s=0, E=1+127=128, F=10100…, thus in IEEE it is

S E         F
0|1000 0000|1010 0000 0000 0000 0000 000

Page 19: Floating point numbers

Example 2

-0.375 = -3/8

In fixed point binary = -0.011 = (-1)^1 x 1.1 x 2^-2

In IEEE format, where the stored exponent is the true exponent plus 127, this is s=1, E=-2+127=125, F=1000…, thus in IEEE it is

S E         F
1|0111 1101|1000 0000 0000 0000 0000 000
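The field values can be checked by unpacking the bits of a single precision number. The helper names float32_fields and show are hypothetical. The sketch relies on the standard IEEE 754 single precision layout: 1 sign bit, 8 exponent bits with a bias of 127, and 23 fraction bits.

```python
import struct

def float32_fields(x: float):
    """Return (s, E, F) for x rounded to IEEE 754 single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31             # 1 sign bit
    e = (bits >> 23) & 0xFF    # 8 exponent bits, stored with bias 127
    f = bits & 0x7FFFFF        # 23 fraction bits (hidden leading 1 not stored)
    return s, e, f

def show(x: float) -> str:
    s, e, f = float32_fields(x)
    return f"{s}|{e:08b}|{f:023b}"

print(show(3.25))     # 0|10000000|10100000000000000000000
print(show(-0.375))   # 1|01111101|10000000000000000000000
```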

Page 20: Floating point numbers

Range

Format  Smallest normalised  Largest
IEEE32  1.17 * 10^-38        3.40 * 10^38
IEEE64  2.23 * 10^-308       1.79 * 10^308
80bit   3.36 * 10^-4932      1.18 * 10^4932
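The IEEE32 and IEEE64 limits can be recovered from the bit patterns themselves. In this sketch float32_from_bits is a hypothetical helper; the 80 bit format is omitted because Python's float is an IEEE double and cannot hold those values.

```python
import struct
import sys

def float32_from_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single precision number."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

f32_min = float32_from_bits(0x00800000)   # E=1, F=0: smallest normalised value
f32_max = float32_from_bits(0x7F7FFFFF)   # E=254, F all ones: largest finite value
f64_min = sys.float_info.min              # smallest normalised double
f64_max = sys.float_info.max              # largest finite double

print(f"IEEE32 {f32_min:.3e} to {f32_max:.3e}")
print(f"IEEE64 {f64_min:.3e} to {f64_max:.3e}")
```

Printed to four significant figures, the values show that the table entries are a mixture of rounded and truncated figures.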
