Intermezzo: The Nature of Numbers
When it comes to numbers, programming languages mediate the gap between the
underlying hardware and true mathematics. The typical computer hardware
represents numbers with fixedsize chunks of data;These
chunks are called; bits, bytes, and words.
they also come with processors that work on just such chunks. In
paperandpencil calculations, we don’t worry about how many digits we
process; in principle, we can deal with numbers that consist of one digit,
10 digits, or 10,000 digits. Thus, if a programming language uses the
numbers from the underlying hardware, its calculations are as efficient as
possible. If it sticks to the numbers we know from mathematics, it must
translate those into chunks of hardware data and back—
This intermezzo explains the hardware representation of numbers as an exercise in data representation. Concretely, the first subsection introduces a concrete fixedsize data representation for numbers, discusses how to map numbers into this representation, and hints at how calculations work on such numbers. The second and third section illustrate the two most fundamental problems of this choice: arithmetic overflow and underflow, respectively. The last one sketches how arithmetic in the teaching languages works; its number system generalizes what you find in most of today’s programming languages. The final exercises show how bad things can get when programs compute with numbers.
Fixedsize Number Arithmetic
Suppose we can use four digits to represent numbers. If we represent natural numbers, one representable range is [0,10000). For real numbers, we could pick 10,000 fractions between 0 and 1 or 5,000 between 0 and 1 and another 5,000 between 1 and 2, etc. In either case, four digits can represent at most 10,000 numbers for some chosen interval, while the number line for this interval contains an infinite number of numbers.
a mantissa, which is a base number, and
an exponent, which is used to determine a 10based factor.
; N Number N > Inex ; make an instance of Inex after checking the arguments (define (createinex m s e) (cond [(and (<= 0 m 99) (<= 0 e 99) (or (= s 1) (= s 1))) (makeinex m s e)] [else (error 'inex "(<= 0 m 99), s in {+1,1}, (<= 0 e 99) expected")])) ; Inex > Number ; convert an inex into its numeric equivalent (define (inex>number aninex) (* (inexmantissa aninex) (expt 10 (* (inexsign aninex) (inexexponent aninex)))))
(definestruct inex [mantissa sign exponent]) ; An Inex is a structure: ; (makeinex N99 S N99) ; An S is one of: ; – 1 ; – 1 ; An N99 is an N between 0 and 99 (inclusive).
(createinex 12 1 2)
> (createinex 120 1 1) inex: (<= 0 m 99), s in {+1,1}, (<= 0 e 99) expected
> (createinex 50 1 20) (makeinex 50 1 20)
> (createinex 5 1 19) (makeinex 5 1 19)
(createinex 12 1 2)
(createinex 13 1 2)
(inex+ (createinex 1 1 0) (createinex 2 1 0)) == (createinex 3 1 0)
(inex+ (createinex 55 1 0) (createinex 55 1 0)) == (createinex 11 1 1)
(inex+ (createinex 56 1 0) (createinex 56 1 0)) == (createinex 11 1 1)
(inex* (createinex 2 1 4) (createinex 8 1 10)) == (createinex 16 1 14)
(inex* (createinex 20 1 1) (createinex 5 1 4)) == (createinex 10 1 6)
(inex* (createinex 27 1 1) (createinex 7 1 4)) == (createinex 19 1 4)
Exercise 413. Design inex+. The function adds two Inex representations of numbers that have the same exponent. The function must be able to deal with inputs that increase the exponent. Furthermore, it must signal its own error if the result is out of range, not rely on createinex for error checking.
(equal? (inex+ (createinex 1 1 0) (createinex 1 1 1)) (createinex 11 1 1))
Exercise 414. Design inex*. The function multiplies two Inex representations of numbers, including inputs that force an additional increase of the output’s exponent. Like inex+, it must signal its own error if the result is out of range, not rely on createinex to perform error checking.
Exercise 415. As this section illustrates, gaps in the data representation lead to roundoff errors when numbers are mapped to Inexes. The problem is, such roundoff errors accumulate across computations.
Design add, a function that adds up n copies of #i1/185. Use 0 and 1 for your examples; for the latter, use a tolerance of 0.0001. What is the result for (add 185)? What would you expect? What happens if you multiply the result with a large number?
Design sub. The function counts how often 1/185 can be subtracted from the argument until it is 0. Use 0 and 1/185 for your examples. What are the expected results? What are the results for (sub 1) and (sub #i1.0)? What happens in the second case? Why?
Overflow
(inex+ (createinex 50 1 99) (createinex 50 1 99)) == (createinex 100 1 99)
When overflow takes place, some language implementations signal an error and stop the computation. Others designate some symbolic value, called infinity, to represent such numbers and propagate it through arithmetic operations.
Note If Inexes had a sign field for the mantissa, then two negative numbers can add up to one that is so negative that it can’t be represented either. This is called overflow in the negative direction. End
Exercise 416. ISL+ uses +inf.0 to deal with overflow. Determine the integer n such that (expt #i10.0 n) is an inexact number while (expt #i10. (+ n 1)) is approximated with +inf.0. Hint Design a function to compute n.
Underflow
(inex* (createinex 1 1 10) (createinex 1 1 99)) == (createinex 1 1 109)
When underflow occurs, some language implementations signal an error; others use 0 to approximate the result. Using 0 to approximate underflow is qualitatively different from picking an approximate representation of a number in Inex. Concretely, approximating 1250 with (createinex 12 1 2) drops significant digits from the mantissa, but the result is always within 10% of the number to be represented. Approximating on underflow, however, means dropping the entire mantissa, meaning the result is not within a predictable percentage range of the true result.
Exercise 417. ISL+ uses #i0.0 to approximate underflow. Determine the smallest integer n such that (expt #i10.0 n) is still an inexact ISL+ number and (expt #i10. ( n 1)) is approximated with 0. Hint Use a function to compute n. Consider abstracting over this function and the solution of exercise 416.
*SL Numbers
Most programming languages support only approximate number representations and arithmetic for numbers. In the context of such languages, inexact real representations come in several flavors: float, double, extflonum, etc. It is beyond a first course on programming to explain all possibilities. A typical language limits its integers to an interval that is related to the size of the chunks of the hardware on which it runs. Its representation of real numbers is loosely based on the sketch in the preceding sections, though with larger chunks than the four digits Inex uses and using digits from the 2based number system, because that is how computers work.
The teaching languages support both exact and inexact numbers. Their integers and rationals are arbitrarily large and precise, limited only by the absolute size of the computer’s entire memory. For calculations on these numbers, our teaching languages use the underlying hardware as long as the involved rationals fit into the supported chunks of data; it automatically switches to a different representation and to different version of the arithmetic operations for numbers outside of this interval. Their real numbers come in two flavors: exact and inexact. An exact number truly represents a real number; an inexact one approximates a real number in the spirit of the preceding sections. Arithmetic operations preserve exactness when possible; they produce an inexact result when necessary. Thus, sqrt returns an inexact number for both the exact and inexact representation of 2. In contrast, sqrt produces an exact 2 when given exact 4 and #i2.0 for an input of #i4.0. Finally, a numeric constant in a teaching program is understood as an exact rational, unless it is prefixed with #i.
Plain Racket interprets all decimal numbers as inexact numbers; it also renders all real numbers as decimals, regardless of whether they are exact or inexact. The implication is that all such numbers are dangerous, because they are likely to be inexact approximations of the true number. A programmer can force Racket to interpret numbers with a dot as exact by prefixing numerical constants with #e.
At this point, it is natural to ask how much a program’s results may differ
from the true results if it uses these inexact numbers. For
an accessible introduction—
(expt 1.001 1e12)
(define JANUS (list 31.0 #i2e+34 #i1.2345678901235e+80 2749.0 2939234.0 #i2e+33 #i3.2e+270 17.0 #i2.4e+270 #i4.2344294738446e+170 1.0 #i8e+269 0.0 99.0))
> (expt 2 #i53.0) #i9007199254740992.0
> (sum (list #i1.0 (expt 2 #i53.0))) #i9007199254740992.0
> (sum (list #i1.0 #i1.0 (expt 2 #i53.0))) #i9007199254740994.0
(exact>inexact (sum (map inexact>exact JANUS)))
(define (oscillate n) (local ((define (oscillate i) (cond [(> i n) '()] [else (cons (expt 0.99 i) (oscillate (add1 i)))]))) (oscillate 1)))

> (sum (oscillate #i1000.0)) #i0.49746596003269394
> (sum (reverse (oscillate #i1000.0))) #i0.4974659600326953
The question is which numbers programmers should use in their programs if they are given a choice. The answer depends on the context of course. In the context of a financial statement, numerical constants should be interpreted as exact numbers, and computational manipulations of financial statements ought to be able to rely on the exactnesspreserving nature of mathematical operations. After all, the law cannot accommodate the serious errors that come with inexact numbers and their operations. In the context of scientific computations, however, the extra time needed to produce exact results might impose too much of a burden. Scientists therefore tend to use inexact numbers but carefully analyze their programs to make sure that the numerical errors are tolerable for their uses of the outputs of programs.