Wednesday, December 29, 2010

SICP 2.19: Counting Change Revisited

From SICP section 2.2.1 Representing Sequences

Exercise 2.19 asks us to modify the change counting procedure from section 1.2.2 (which we looked at in depth in exercise 1.14). We need to modify the program so that it takes a list of coins to be used instead of having the denominations hard coded as they were originally. The following lists of coins are provided:
(define us-coins (list 50 25 10 5 1))
(define uk-coins (list 100 50 20 10 5 2 1 0.5))

We're also provided with the following modified version of the cc procedure:
(define (cc amount coin-values)
(cond ((= amount 0) 1)
((or (< amount 0) (no-more? coin-values)) 0)
(else
(+ (cc amount
(except-first-denomination coin-values))
(cc (- amount
(first-denomination coin-values))
coin-values)))))

All that's left for us to do is define the procedures first-denomination, except-first-denomination, and no-more? in terms of primitive list operations. These procedures are very straightforward.
; return true if passed an empty list
(define (no-more? coin-values)
(if (null? coin-values) true false))

; return all coin values except the first
(define (except-first-denomination coin-values)
(cdr coin-values))

; return only the first coin value
(define (first-denomination coin-values)
(car coin-values))

> (cc 100 us-coins)
292
> (cc 100 uk-coins)
104561

Finally, we're asked if the order of the list coin-values affects the answer produced by cc. We can find out quickly enough by experiment.
; reversed list of us coins
(define su-coins (list 1 5 10 25 50))

> (cc 100 us-coins)
292
> (cc 100 su-coins)
292

We can see that the order doesn't matter. This is because the procedure recursively evaluates every sub-list after subtracting the value of the first coin from the target amount. It doesn't matter if that value is the smallest, largest, or even if the values are shuffled.


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Monday, December 27, 2010

SICP 2.18: Reversing a List

From SICP section 2.2.1 Representing Sequences

Exercise 2.18 asks us to define a procedure reverse that takes a list as its argument and returns a list with the same elements in reverse order.

We'll be using the append procedure given in the text which takes two lists and appends the second list to the end of the first:
(define (append list1 list2)
(if (null? list1)
list2
(cons (car list1) (append (cdr list1) list2))))

We can reverse a list by appending the car of the list to the reverse of the cdr of the list. (See how easy it is to start thinking in Scheme?) In other words, we just move the first item to the end of the list after reversing the remaining items. This means reverse will make a recursive call to itself, so we also need a base case to make the recursion stop. We can just stop when we reach an empty list.
(define (reverse items)
(if (null? items)
items
(append (reverse (cdr items)) (list (car items)))))

Once again, we can test it out with a few example lists from the text.
> (reverse (list 1 2 3 4))
(4 3 2 1)
> (reverse (list 1 4 9 16 25))
(25 16 9 4 1)
> (reverse (list 1 3 5 7))
(7 5 3 1)
> (reverse (list 23 72 149 34))
(34 149 72 23)


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

SICP 2.17: Last Item in a List

From SICP section 2.2.1 Representing Sequences

Exercise 2.17 asks us to define a procedure last-pair that returns a list that contains only the last element of a given (non-empty) list.

Earlier in SICP section 2.2.1 we saw that a list is represented in Scheme as a chain of pairs.


Our new procedure needs to return a list, so we need to be careful not to simply return the last value in the list, or even the last pair.

There are several correct ways to do this, but here's a recursive solution (based on the length example from the text) that uses the method of "cdring down" the list.
(define (last-pair items)
(if (null? (cdr items))
(list (car items))
(last-pair (cdr items))))

The first line checks to see if the cdr of the list of items passed in is nil. If it is, the second line creates a new list from the car of the items. If not, the last line makes a recursive call to last-pair with the remaining items in the list.

We can verify that it works by testing it with several of the example lists from the chapter.
> (last-pair (list 1 2 3 4))
(4)
> (last-pair (list 1 4 9 16 25))
(25)
> (last-pair (list 1 3 5 7))
(7)
> (last-pair (list 23 72 149 34))
(34)


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Saturday, December 18, 2010

SICP 2.12 - 2.16: Extended Exercise: Interval Arithmetic (Part 2)

From SICP section 2.1.4 Extended Exercise: Interval Arithmetic

The solutions in the first half of this extended exercise, SICP 2.7 - 2.11: Extended Exercise: Interval Arithmetic (Part 1), are only one possible way to approach the problem of implementing interval arithmetic. Another way to represent intervals is as a center value and an added tolerance. The following alternate constructor and selectors are supplied in the text:
; Represent intervals as a center value and a width
(define (make-center-width c w)
(make-interval (- c w) (+ c w)))

(define (center i)
(/ (+ (lower-bound i) (upper-bound i)) 2))

(define (width i)
(/ (- (upper-bound i) (lower-bound i)) 2))

Here's a test case to show how it works:
> (define a (make-center-width 5 1))
> a
(4 . 6)
> (center a)
5
> (width a)
1

A third way of representing intervals, often used by engineers, is as a center value and a percent tolerance measured as the ratio of the width of the interval to its center value.



Exercise 2.12 asks us to define a constructor make-center-percent that takes a center and a percent tolerance and creates the desired interval. We'll also need to define a selector percent that extracts the percentage tolerance for a given interval.

We can define our new constructor in terms of make-center-width. We'll make our procedure take a whole number percentage p.
(define (make-center-percent c p)
(make-center-width c (* c (/ p 100.0))))

Since the percent tolerance of an interval is the ratio of the interval's width to its center, we can implement percent in terms of the provided width and center procedures.
(define (percent i)
(* 100.0 (/ (width i) (center i))))

Here's how we'd define the same interval as before with the new constructor:
> (define a (make-center-percent 5 20))
> a
(4.0 . 6.0)
> (center a)
5.0
> (width a)
1.0
> (percent a)
20.0



Exercise 2.13 asks us to show that there is, for small tolerances, a simple formula for the approximate percentage tolerance of the product of two intervals in terms of the tolerances of the factors. We can assume that all numbers are positive.

Recall the old definition of interval multiplication that we used was:

[a,b] × [c,d] = [min (ac, ad, bc, bd), max (ac, ad, bc, bd)]

Since we're assuming all positive numbers for this exercise, we can change the definition to a simpler form:

[a,b] × [c,d] = [ac, bd]

Now we need to complicate things again and look at what it means to represent an interval in terms of its center (ci) and percent tolerance (pi).

i = [ci - ci(pi/100), ci + ci(pi/100)]
i = [ci(1 - pi/100), ci(1 + pi/100)]

Multiplying two intervals x and y together would be:

xy = [cxcy(1 - px/100)(1 - py/100), cxcy(1 + px/100)(1 + py/100)]

We can use the FOIL method to combine some of the terms above.

(1 - px/100)(1 - py/100) = (1 - py/100 - px/100 + pxpy/10,000)
= 1 - (px + py)/100 + pxpy/10,000

(1 + px/100)(1 + py/100) = 1 + py/100 + px/100 + pxpy/10,000
= 1 + (px + py)/100 + pxpy/10,000

Inserting those back into the interval notation we get:

xy = [cxcy(1 - (px + py)/100 + pxpy/10,000), cxcy(1 + (px + py)/100 + pxpy/10,000)]

Since the exercise lets us assume that both px and py are small, and because we're only looking for an approximation, we can ignore the pxpy/10,000 terms because they'll be very small.

xy = [cxcy(1 - (px + py)/100), cxcy(1 + (px + py)/100)]

Now we have things back in terms of an interval's center and percent tolerance. The center of the product of the two intervals is cxcy, and the percent tolerance is (px + py). The approximate percentage tolerance of the product of the two intervals is the sum of the tolerances of the two factors.

We can check this with a quick example.
> (define a (make-center-percent 5 2))
> (define b (make-center-percent 10 3))
> (define c (mul-interval a b))
> (percent c)
4.997001798920647



Electrical resistor values are often expressed as a center value and a percent tolerance, so an engineer writing programs that work with resistances would be likely to make use of an interval arithmetic library. The formula for parallel resistors can be written in two algebraically equivalent ways:

and
The following two programs implement the two different computations:
(define (par1 r1 r2)
(div-interval (mul-interval r1 r2)
(add-interval r1 r2)))

(define (par2 r1 r2)
(let ((one (make-interval 1 1)))
(div-interval one
(add-interval (div-interval one r1)
(div-interval one r2)))))


Exercise 2.14 asks us to demonstrate that the two programs above give different answers for the same inputs by investigating the behavior of the system on a variety of arithmetic expressions.

First we'll make some intervals and look at the results of interval division in terms of center and percent.
> (define a (make-center-percent 100 5))
> (define b (make-center-percent 200 2))
> (define aa (div-interval a a))
> aa
(0.9047619047619049 . 1.1052631578947367)
> (define ab (div-interval a b))
> ab
(0.46568627450980393 . 0.5357142857142857)
> (center aa)
1.0050125313283207
> (center ab)
0.5007002801120448
> (percent aa)
9.97506234413964
> (percent ab)
6.993006993006991

What's notable here is that the center of the result of dividing an interval by itself is not 1, but just an approximation to it. This will be important when we explain what's wrong with our interval system in the next exercise. For now we're just trying to show that something isn't right.

We can use the same intervals that we defined above to illustrate the difference between the two parallel resistance procedures.
> (define apb1 (par1 a b))
> (define apb2 (par2 a b))
> apb1
(60.25889967637541 . 73.6082474226804)
> apb2
(63.986254295532646 . 69.32038834951456)

> (define apa1 (par1 a a))
> (define apa2 (par2 a a))
> apa1
(42.97619047619048 . 58.026315789473685)
> apa2
(47.5 . 52.49999999999999)

This verifies that there is a significant difference between the two procedures.


Exercise 2.15 points out that a formula to compute with intervals using the library we've been developing will produce tighter error bounds if it can be written in such a form that no variable that represents an uncertain number (an interval) is repeated. The conclusion is that par2 is a "better" program for parallel resistances than par1. We need to explain if this is correct, and why.

First, this is a good time to show that the two formulas used to develop par1 and par2 are algebraically equivalent. We're trying to show that


We'll start with the formula on the right hand side and derive the formula on the left. To do so, all we really need to do is multiply by R1/R1 and R2/R2. Since both of these fractions equal 1, these are valid transformations.


So the formulas are algebraically equivalent, but they don't give the same answer. Why could that be? The answer lies in the trick we used just now to show equivalence. We used the ratios R1/R1 and R2/R2 to change the formula and said that it was okay because that's just like multiplying by 1. But R1 and R2 represent resistor values, which are intervals, and we saw in exercise 2.14 that dividing an interval by itself doesn't equal 1, it just approximates it. Transforming the equation in this way introduces error. That's why the observation that we can get tighter error bounds if we avoid repeating variables that represent uncertain numbers is correct.



Exercise 2.16 asks us to explain why equivalent algebraic expressions may lead to different answers. It goes on to ask if we can devise an interval-arithmetic package that does not have this shortcoming, or if the task is impossible.

As the text points out, this is a very difficult problem. Wikipedia to the rescue.

The so-called dependency problem is a major obstacle to the application of interval arithmetic. Although interval methods can determine the range of elementary arithmetic operations and functions very accurately, this is not always true with more complicated functions. If an interval occurs several times in a calculation using parameters, and each occurrence is taken independently then this can lead to an unwanted expansion of the resulting intervals.

...

In general, it can be shown that the exact range of values can be achieved, if each variable appears only once. However, not every function can be rewritten this way.


In short, no we cannot design an interval arithmetic package that does not have this shortcoming in the general case. The best we can do, as was indicated in the previous exercise, is to try and write formulas that avoid repeating variables that represent intervals. This is not always possible.


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Saturday, December 4, 2010

SICP 2.7 - 2.11: Extended Exercise: Interval Arithmetic (Part 1)

From SICP section 2.1.4 Extended Exercise: Interval Arithmetic

Section 2.1.4 is a small project that has us design and implement a system for working with intervals (objects that represent the range of possible values of an inexact quantity). We need to implement interval arithmetic as a set of arithmetic operations for combining intervals. The result of adding, subtracting, multiplying, or dividing two intervals is itself an interval, representing the range of the result. We're provided with the procedures for adding, multiplying, and dividing two intervals.

The minimum value the sum could be is the sum of the two lower bounds and the maximum value it could be is the sum of the two upper bounds:
(define (add-interval x y)
(make-interval (+ (lower-bound x) (lower-bound y))
(+ (upper-bound x) (upper-bound y))))

The product of two intervals is computed by finding the minimum and the maximum of the products of the bounds and using them as the bounds of the resulting interval.
(define (mul-interval x y)
(let ((p1 (* (lower-bound x) (lower-bound y)))
(p2 (* (lower-bound x) (upper-bound y)))
(p3 (* (upper-bound x) (lower-bound y)))
(p4 (* (upper-bound x) (upper-bound y))))
(make-interval (min p1 p2 p3 p4)
(max p1 p2 p3 p4))))

We can divide two intervals in terms of the mul-interval procedure by multiplying the first interval by the reciprocal of the second. Note that the bounds of the reciprocal interval are the reciprocal of the upper bound and the reciprocal of the lower bound, in that order.
(define (div-interval x y)
(mul-interval x
(make-interval (/ 1.0 (upper-bound y))
(/ 1.0 (lower-bound y)))))


Exercise 2.7 gives us the implementation for make-interval and asks us to complete the interval abstraction by implementing the selectors.
; by convention the bounds are added in lower-upper order
(define (make-interval a b) (cons a b))

(define (upper-bound intrvl)
(cdr intrvl))

(define (lower-bound intrvl)
(car intrvl))

Now that we have most of our definitions in place, we can start testing some of the given procedures.
> (define a (make-interval 5 10))
> (define b (make-interval 10 20))
> (add-interval a b)
(15 . 30)
> (mul-interval a b)
(50 . 200)
> (div-interval a b)
(0.25 . 1.0)


Exercise 2.8 asks us to implement a procedure for computing the difference of two intervals, sub-interval, using reasoning similar to that used to implement add-interval. The smallest value possible when subtracting interval y from interval x is to subtract the upper bound of y from the lower bound of x. The largest result of subtracting interval y from interval x is to subtract the lower bound of y from the upper bound of x.
; interval subtraction: [a,b] − [c,d] = [a − d, b − c]
(define (sub-interval x y)
(make-interval (- (lower-bound x) (upper-bound y))
(- (upper-bound x) (lower-bound y))))

We can test this with a few different intervals to show that it works whether the intervals overlap or not.
> (define a (make-interval 1 10))
> (define b (make-interval 50 100))
> (define c (make-interval 5 20))
> (sub-interval b a)
(40 . 99)
> (sub-interval a b)
(-99 . -40)
> (sub-interval a c)
(-19 . 5)
> (sub-interval c a)
(-5 . 19)


Exercise 2.9 defines the width of an interval as half the difference between its upper and lower bounds. For some arithmetic operations the width of the result of combining two intervals is a function only of the widths of the argument intervals, whereas for others the width of the combination is not a function of the widths of the argument intervals. We are to show that the width of the sum (or difference) of two intervals is a function only of the widths of the intervals being added (or subtracted).

I think this is best proven mathematically by showing that the width of the sum of two intervals is the same as the sum of the widths of two intervals. We start with a couple of definitions for interval addition and computing the width of an interval:

[a,b] + [c,d] = [a + c, b + d]
width([x, y]) = (y - x) / 2

Now we can combine these to come up with a formula for the width of the sum of two intervals.

width([a, b] + [c, d]) = width([a + c, b + d])
= ((b + d) - (a + c)) / 2

Finally, we can derive a formula for the sum of the widths of two intervals.

width([a, b]) + width([c, d]) = (b - a) / 2 + (d - c) / 2
= ((b - a) + (d - c)) / 2
= ((b + d) - (a + c)) / 2

This is the same formula we derived above. This proves that if we add any two intervals with widths x and y, the width of the resulting interval will always be z, no matter what the bounds of the intervals are.

To prove that this is not the case for interval multiplication we can use a simple counterexample.
> (define a (make-interval 2 4))
> (define b (make-interval 5 10))
> (define c (make-interval 10 15))
> (mul-interval a b)
(10 . 40)
> (mul-interval a c)
(20 . 60)

The intervals b and c have the same width, but when we multiply each of them by interval a, the resulting intervals have different widths. This means that the width of the product of two intervals cannot be a function of only the widths of the operands.


Exercise 2.10 points out that it is not clear what it means to divide by an interval that spans zero. We need to modify the provided div-interval routine to check for this condition and signal an error if it occurs.

Before we make the required modifications, let's take a closer look at the div-interval procedure to see what the problem is.
(define (div-interval x y)
(mul-interval x
(make-interval (/ 1.0 (upper-bound y))
(/ 1.0 (lower-bound y)))))

If the interval y spans zero, then there's a small problem with this portion of the code:
(make-interval (/ 1.0 (upper-bound y))
(/ 1.0 (lower-bound y))

If y is the interval [-10, 10], for example, then the snippet above would produce the interval [0.1, -0.1], which has a lower bound that is higher than its upper bound.

To fix the problem we can just do as the exercise suggests and signal an error if a zero-spanning interval is found in the divisor.
; Exercise 2.10
(define (spans-zero? y)
(and (<= (lower-bound y) 0)
(>= (upper-bound y) 0)))

(define (div-interval x y)
(if (spans-zero? y)
(error "Error: The denominator should not span 0.")
(mul-interval x
(make-interval (/ 1.0 (upper-bound y))
(/ 1.0 (lower-bound y))))))

Run an example in your interpreter to verify.
> (define a (make-interval 2 5))
> (define b (make-interval -2 2))
> (div-interval a b)
. . Error: The denominator should not span 0.


Exercise 2.11 suggests that by testing the signs of the endpoints of the intervals, we can break mul-interval into nine cases, only one of which requires more than two multiplications. We are to rewrite mul-interval using this suggestion.

The suggestion is based on the result of multiplication of two numbers with the same or opposite signs. For each interval there are three possibilities, both signs are positive, both are negative, or the signs are opposite. (Note that an interval with the signs [+, -] is not allowed, since the lower bound would be higher than the upper bound.) Since there are two intervals to check, that makes nine possibilities. All nine possibilities are listed below.

[+, +] * [+, +]
[+, +] * [-, +]
[+, +] * [-, -]

[-, +] * [+, +]
[-, +] * [-, +]
[-, +] * [-, -]

[-, -] * [+, +]
[-, -] * [-, +]
[-, -] * [-, -]

For most of the combinations above, we can see directly which pairs need to be multiplied to form the resulting interval. For example, if all values are positive, then multiplying the two upper bounds and two lower bounds are the only two products we need to find. The only case where we need to do more than two multiplications is in the case [-, +] * [-, +], since the product of the two lower bounds could possibly be larger than the product of the two upper bounds.

Note that the following code is much less readable and would be harder to maintain than the original procedure. In addition to that, without benchmarking both procedures we aren't even sure which one is faster. The trade-off in maintainability certainly doesn't seem to be worth any potential savings you might get from eliminating two multiplications from most cases. The developer who suggested this enhancement should probably have their commit access revoked (and be shot out of a cannon).
; Exercise 2.11
(define (mul-interval x y)
(let ((xlo (lower-bound x))
(xhi (upper-bound x))
(ylo (lower-bound y))
(yhi (upper-bound y)))
(cond ((and (>= xlo 0)
(>= xhi 0)
(>= ylo 0)
(>= yhi 0))
; [+, +] * [+, +]
(make-interval (* xlo ylo) (* xhi yhi)))
((and (>= xlo 0)
(>= xhi 0)
(<= ylo 0)
(>= yhi 0))
; [+, +] * [-, +]
(make-interval (* xhi ylo) (* xhi yhi)))
((and (>= xlo 0)
(>= xhi 0)
(<= ylo 0)
(<= yhi 0))
; [+, +] * [-, -]
(make-interval (* xhi ylo) (* xlo yhi)))
((and (<= xlo 0)
(>= xhi 0)
(>= ylo 0)
(>= yhi 0))
; [-, +] * [+, +]
(make-interval (* xlo yhi) (* xhi yhi)))
((and (<= xlo 0)
(>= xhi 0)
(<= ylo 0)
(>= yhi 0))
; [-, +] * [-, +]
(make-interval (min (* xhi ylo) (* xlo yhi))
(max (* xlo ylo) (* xhi yhi))))
((and (<= xlo 0)
(>= xhi 0)
(<= ylo 0)
(<= yhi 0))
; [-, +] * [-, -]
(make-interval (* xhi ylo) (* xlo ylo)))
((and (<= xlo 0)
(<= xhi 0)
(>= ylo 0)
(>= yhi 0))
; [-, -] * [+, +]
(make-interval (* xlo yhi) (* xhi ylo)))
((and (<= xlo 0)
(<= xhi 0)
(<= ylo 0)
(>= yhi 0))
; [-, -] * [-, +]
(make-interval (* xlo yhi) (* xlo ylo)))
((and (<= xlo 0)
(<= xhi 0)
(<= ylo 0)
(<= yhi 0))
; [-, -] * [-, -]
(make-interval (* xhi yhi) (* xlo ylo))))))

We'll need several test cases to make sure we get good coverage of all possible conditions.
> (define a (make-interval 2 4))
> (define b (make-interval -2 4))
> (define c (make-interval -4 -2))
> (mul-interval a a)
(4 . 16)
> (mul-interval a b)
(-8 . 16)
> (mul-interval a c)
(-16 . -4)
> (mul-interval b a)
(-8 . 16)
> (mul-interval b b)
(-8 . 16)
> (mul-interval b c)
(-16 . 8)
> (mul-interval c a)
(-16 . -4)
> (mul-interval c b)
(-16 . 8)
> (mul-interval c c)
(4 . 16)


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.