Wednesday, December 29, 2010

Exercise 2.19 asks us to modify the change counting procedure from section 1.2.2 (which we looked at in depth in exercise 1.14). We need to modify the program so that it takes a list of coins to be used instead of having the denominations hard coded as they were originally. The following lists of coins are provided:
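As a sketch of what the modified program could look like: the two coin lists and the body of cc below follow the exercise text, while the three helper definitions are just one straightforward way to write them using primitive list operations.

```scheme
; Coin lists given in the exercise text.
(define us-coins (list 50 25 10 5 1))
(define uk-coins (list 100 50 20 10 5 2 1 0.5))

; Helpers the exercise asks us to supply (one possible definition each).
(define (first-denomination coin-values) (car coin-values))
(define (except-first-denomination coin-values) (cdr coin-values))
(define (no-more? coin-values) (null? coin-values))

; Count the ways to make change for amount using the given coins.
(define (cc amount coin-values)
  (cond ((= amount 0) 1)
        ((or (< amount 0) (no-more? coin-values)) 0)
        (else
         ; ways that skip the first coin + ways that use it at least once
         (+ (cc amount (except-first-denomination coin-values))
            (cc (- amount (first-denomination coin-values))
                coin-values)))))
```

Because the helpers only wrap car, cdr, and null?, any list of denominations can be passed in without touching cc itself.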

Finally, we're asked if the order of the list coin-values affects the answer produced by cc. We can find out quickly enough by experiment.

; reversed list of us coins
(define su-coins (list 1 5 10 25 50))

> (cc 100 us-coins)
292
> (cc 100 su-coins)
292

We can see that the order doesn't matter. At each step the procedure adds the number of ways to make change without the first coin in the list to the number of ways to make change after using that coin once. Every combination of coins is counted exactly once, so it doesn't matter if the first value is the smallest, the largest, or even if the values are shuffled.

Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

We can reverse a list by appending the car of the list to the reverse of the cdr of the list. (See how easy it is to start thinking in Scheme?) In other words, we just move the first item to the end of the list after reversing the remaining items. This means reverse will make a recursive call to itself, so we also need a base case to make the recursion stop. We can just stop when we reach an empty list.
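A minimal sketch of that idea follows. The name my-reverse is hypothetical, chosen here only to avoid shadowing the built-in reverse that most Scheme implementations provide.

```scheme
; Reverse a list by reversing its cdr and appending the car at the end.
(define (my-reverse items)
  (if (null? items)
      '()                                   ; base case: nothing left to reverse
      (append (my-reverse (cdr items))      ; reversed remainder ...
              (list (car items)))))         ; ... followed by the first item
```

Note that append walks its first argument on every call, so this version does quadratic work on long lists; it is meant to mirror the description rather than to be fast.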

The first line checks to see if the cdr of the list of items passed in is nil. If it is, the second line creates a new list from the car of the items. If not, the last line makes a recursive call to last-pair with the remaining items in the list.

We can verify that it works by testing it with several of the example lists from the chapter.
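The procedure described above, together with one such check, might look like the following sketch. It assumes a non-empty list, since taking the cdr of an empty list is an error.

```scheme
; Return a list containing only the last element of a non-empty list.
(define (last-pair items)
  (if (null? (cdr items))
      (list (car items))          ; only one item left: wrap it in a new list
      (last-pair (cdr items))))   ; otherwise keep walking down the list

; One of the example lists from the chapter.
(define result (last-pair (list 23 72 149 34)))
```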

Saturday, December 18, 2010

The solutions in the first half of this extended exercise, SICP 2.7 - 2.11: Extended Exercise: Interval Arithmetic (Part 1), are only one possible way to approach the problem of implementing interval arithmetic. Another way to represent intervals is as a center value and an additive tolerance. The following alternate constructor and selectors are supplied in the text:

A third way of representing intervals, often used by engineers, is as a center value and a percent tolerance measured as the ratio of the width of the interval to its center value.

Exercise 2.12 asks us to define a constructor make-center-percent that takes a center and a percent tolerance and creates the desired interval. We'll also need to define a selector percent that extracts the percentage tolerance for a given interval.

We can define our new constructor in terms of make-center-width. We'll make our procedure take a whole number percentage p.
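Here is one way this could be sketched. The pair-based make-interval, lower-bound, and upper-bound are assumed from the first half of the exercise, make-center-width, center, and width are the definitions supplied in the text, and the sketch assumes a positive center so that the computed width is positive.

```scheme
; Assumed representation from the first half of the exercise.
(define (make-interval a b) (cons a b))
(define (lower-bound i) (car i))
(define (upper-bound i) (cdr i))

; Center/width constructor and selectors supplied in the text.
(define (make-center-width c w)
  (make-interval (- c w) (+ c w)))
(define (center i)
  (/ (+ (lower-bound i) (upper-bound i)) 2))
(define (width i)
  (/ (- (upper-bound i) (lower-bound i)) 2))

; p is a whole-number percentage, e.g. 10 means plus or minus 10%.
(define (make-center-percent c p)
  (make-center-width c (* c (/ p 100.0))))

; Recover the percentage tolerance from an interval.
(define (percent i)
  (* 100.0 (/ (width i) (center i))))
```

Taking p as a whole number keeps calls readable, so (make-center-percent 100 10) describes 100 plus or minus 10 percent.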

Exercise 2.13 asks us to show that there is, for small tolerances, a simple formula for the approximate percentage tolerance of the product of two intervals in terms of the tolerances of the factors. We can assume that all numbers are positive.

Recall the old definition of interval multiplication that we used was:

[a,b] × [c,d] = [min (ac, ad, bc, bd), max (ac, ad, bc, bd)]

Since we're assuming all positive numbers for this exercise, we can change the definition to a simpler form:

[a,b] × [c,d] = [ac, bd]

Now we need to complicate things again and look at what it means to represent an interval in terms of its center ($c_i$) and percent tolerance ($p_i$).
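Written out under those definitions, with x and y naming the two positive intervals being multiplied, the bounds and their product expand as follows:

```latex
\begin{aligned}
x &= \left[c_x\left(1 - \tfrac{p_x}{100}\right),\; c_x\left(1 + \tfrac{p_x}{100}\right)\right] \\
y &= \left[c_y\left(1 - \tfrac{p_y}{100}\right),\; c_y\left(1 + \tfrac{p_y}{100}\right)\right] \\
xy &= \left[c_x c_y\left(1 - \tfrac{p_x}{100}\right)\left(1 - \tfrac{p_y}{100}\right),\;
            c_x c_y\left(1 + \tfrac{p_x}{100}\right)\left(1 + \tfrac{p_y}{100}\right)\right] \\
   &= \left[c_x c_y\left(1 - \tfrac{p_x + p_y}{100} + \tfrac{p_x p_y}{10{,}000}\right),\;
            c_x c_y\left(1 + \tfrac{p_x + p_y}{100} + \tfrac{p_x p_y}{10{,}000}\right)\right]
\end{aligned}
```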

Since the exercise lets us assume that both $p_x$ and $p_y$ are small, and because we're only looking for an approximation, we can ignore the $p_x p_y / 10{,}000$ terms because they'll be very small.

$xy = [c_x c_y (1 - (p_x + p_y)/100),\ c_x c_y (1 + (p_x + p_y)/100)]$

Now we have things back in terms of an interval's center and percent tolerance. The center of the product of the two intervals is $c_x c_y$, and the percent tolerance is $(p_x + p_y)$. The approximate percentage tolerance of the product of the two intervals is the sum of the tolerances of the two factors.
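The approximation can be checked numerically with a small sketch. The sample values (10 with 1% tolerance, 20 with 2% tolerance) are arbitrary choices for illustration, and the helper definitions repeat the ones assumed earlier in this exercise.

```scheme
(define (make-interval a b) (cons a b))
(define (lower-bound i) (car i))
(define (upper-bound i) (cdr i))
(define (center i) (/ (+ (lower-bound i) (upper-bound i)) 2))
(define (width i) (/ (- (upper-bound i) (lower-bound i)) 2))
(define (make-center-percent c p)
  (let ((w (* c (/ p 100.0))))
    (make-interval (- c w) (+ c w))))
(define (percent i) (* 100.0 (/ (width i) (center i))))

; Interval multiplication as defined in the text.
(define (mul-interval x y)
  (let ((p1 (* (lower-bound x) (lower-bound y)))
        (p2 (* (lower-bound x) (upper-bound y)))
        (p3 (* (upper-bound x) (lower-bound y)))
        (p4 (* (upper-bound x) (upper-bound y))))
    (make-interval (min p1 p2 p3 p4) (max p1 p2 p3 p4))))

; Arbitrary sample intervals with small tolerances.
(define x (make-center-percent 10 1))
(define y (make-center-percent 20 2))

; Should be close to, but not exactly, 1 + 2 = 3.
(define product-percent (percent (mul-interval x y)))
```

The result falls slightly short of 3 because the center of the product also shifts by the small cross term that the approximation drops.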

Electrical resistor values are often expressed as a center value and a percent tolerance, so an engineer writing programs that work with resistances would be likely to make use of an interval arithmetic library. The formula for parallel resistors can be written in two algebraically equivalent ways:

$R_1 R_2 / (R_1 + R_2)$

and

$1 / (1/R_1 + 1/R_2)$

The following two programs implement the two different computations:
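For reference, here is a self-contained sketch of par1 and par2 as given in the exercise, along with the interval operations they rely on. The two resistor intervals r1 and r2 are hypothetical values picked for illustration, not the ones used in this post.

```scheme
(define (make-interval a b) (cons a b))
(define (lower-bound i) (car i))
(define (upper-bound i) (cdr i))
(define (width i) (/ (- (upper-bound i) (lower-bound i)) 2))

(define (add-interval x y)
  (make-interval (+ (lower-bound x) (lower-bound y))
                 (+ (upper-bound x) (upper-bound y))))
(define (mul-interval x y)
  (let ((p1 (* (lower-bound x) (lower-bound y)))
        (p2 (* (lower-bound x) (upper-bound y)))
        (p3 (* (upper-bound x) (lower-bound y)))
        (p4 (* (upper-bound x) (upper-bound y))))
    (make-interval (min p1 p2 p3 p4) (max p1 p2 p3 p4))))
(define (div-interval x y)
  (mul-interval x
                (make-interval (/ 1.0 (upper-bound y))
                               (/ 1.0 (lower-bound y)))))

; R1*R2 / (R1 + R2): each resistor appears twice.
(define (par1 r1 r2)
  (div-interval (mul-interval r1 r2)
                (add-interval r1 r2)))

; 1 / (1/R1 + 1/R2): each resistor appears once.
(define (par2 r1 r2)
  (let ((one (make-interval 1 1)))
    (div-interval one
                  (add-interval (div-interval one r1)
                                (div-interval one r2)))))

; Hypothetical resistors: 100 and 200 ohms, each within 1%.
(define r1 (make-interval 99 101))
(define r2 (make-interval 198 202))
(define result1 (par1 r1 r2))
(define result2 (par2 r1 r2))
```

With these inputs par1 yields a noticeably wider interval than par2, and the par2 result sits entirely inside the par1 result.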

What's notable here is that the center of the result of dividing an interval by itself is not 1, but just an approximation to it. This will be important when we explain what's wrong with our interval system in the next exercise. For now we're just trying to show that something isn't right.
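A small sketch makes the point concrete. The interval a below (99 to 101) is a hypothetical example, and div-interval is the definition from the text.

```scheme
(define (make-interval a b) (cons a b))
(define (lower-bound i) (car i))
(define (upper-bound i) (cdr i))
(define (center i) (/ (+ (lower-bound i) (upper-bound i)) 2))

(define (mul-interval x y)
  (let ((p1 (* (lower-bound x) (lower-bound y)))
        (p2 (* (lower-bound x) (upper-bound y)))
        (p3 (* (upper-bound x) (lower-bound y)))
        (p4 (* (upper-bound x) (upper-bound y))))
    (make-interval (min p1 p2 p3 p4) (max p1 p2 p3 p4))))
(define (div-interval x y)
  (mul-interval x
                (make-interval (/ 1.0 (upper-bound y))
                               (/ 1.0 (lower-bound y)))))

; Hypothetical interval: 100 plus or minus 1.
(define a (make-interval 99 101))

; Bounds are 99/101 and 101/99, so the interval straddles 1
; and its center lands slightly above 1.
(define a-over-a (div-interval a a))
```

The library has no way of knowing that both operands are the same quantity, so it treats them as two independent uncertain values.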

We can use the same intervals that we defined above to illustrate the difference between the two parallel resistance procedures.

This verifies that there is a significant difference between the two procedures.

Exercise 2.15 points out that a formula to compute with intervals using the library we've been developing will produce tighter error bounds if it can be written in such a form that no variable that represents an uncertain number (an interval) is repeated. The conclusion is that par2 is a "better" program for parallel resistances than par1. We need to explain if this is correct, and why.

First, this is a good time to show that the two formulas used to develop par1 and par2 are algebraically equivalent. We're trying to show that

$R_1 R_2 / (R_1 + R_2) = 1 / (1/R_1 + 1/R_2)$

We'll start with the formula on the right hand side and derive the formula on the left. To do so, all we really need to do is multiply by R1/R1 and R2/R2. Since both of these fractions equal 1, these are valid transformations.
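Treating $R_1$ and $R_2$ as ordinary numbers for the moment, those steps can be written out like this:

```latex
\begin{aligned}
\frac{1}{\dfrac{1}{R_1} + \dfrac{1}{R_2}}
  &= \frac{1}{\dfrac{1}{R_1}\cdot\dfrac{R_2}{R_2} + \dfrac{1}{R_2}\cdot\dfrac{R_1}{R_1}} \\
  &= \frac{1}{\dfrac{R_2}{R_1 R_2} + \dfrac{R_1}{R_1 R_2}} \\
  &= \frac{1}{\dfrac{R_1 + R_2}{R_1 R_2}} \\
  &= \frac{R_1 R_2}{R_1 + R_2}
\end{aligned}
```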

So the formulas are algebraically equivalent, but they don't give the same answer. Why could that be? The answer lies in the trick we used just now to show equivalence. We used the ratios R1/R1 and R2/R2 to change the formula and said that it was okay because that's just like multiplying by 1. But R1 and R2 represent resistor values, which are intervals, and we saw in exercise 2.14 that dividing an interval by itself doesn't equal 1, it just approximates it. Transforming the equation in this way introduces error. That's why the observation that we can get tighter error bounds if we avoid repeating variables that represent uncertain numbers is correct.

Exercise 2.16 asks us to explain why equivalent algebraic expressions may lead to different answers. It goes on to ask if we can devise an interval-arithmetic package that does not have this shortcoming, or if the task is impossible.

The so-called dependency problem is a major obstacle to the application of interval arithmetic. Although interval methods can determine the range of elementary arithmetic operations and functions very accurately, this is not always true with more complicated functions. If an interval occurs several times in a calculation using parameters, and each occurrence is taken independently then this can lead to an unwanted expansion of the resulting intervals.

...

In general, it can be shown that the exact range of values can be achieved, if each variable appears only once. However, not every function can be rewritten this way.

In short, no: we cannot design an interval arithmetic package that does not have this shortcoming in the general case. The best we can do, as was indicated in the previous exercise, is to try to write formulas that avoid repeating variables that represent intervals. This is not always possible.

Saturday, December 4, 2010

Section 2.1.4 is a small project that has us design and implement a system for working with intervals (objects that represent the range of possible values of an inexact quantity). We need to implement interval arithmetic as a set of arithmetic operations for combining intervals. The result of adding, subtracting, multiplying, or dividing two intervals is itself an interval, representing the range of the result. We're provided with the procedures for adding, multiplying, and dividing two intervals.

The minimum value the sum could be is the sum of the two lower bounds and the maximum value it could be is the sum of the two upper bounds:

We can divide two intervals in terms of the mul-interval procedure by multiplying the first interval by the reciprocal of the second. Note that the bounds of the reciprocal interval are the reciprocal of the upper bound and the reciprocal of the lower bound, in that order.
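The three provided procedures can be sketched as follows. The pair-based constructor and selectors at the top are an assumption here (they are what exercise 2.7 asks for), while add-interval, mul-interval, and div-interval follow the text.

```scheme
; Assumed representation: an interval is a pair (lower . upper).
(define (make-interval a b) (cons a b))
(define (lower-bound i) (car i))
(define (upper-bound i) (cdr i))

; Sum: add the lower bounds and add the upper bounds.
(define (add-interval x y)
  (make-interval (+ (lower-bound x) (lower-bound y))
                 (+ (upper-bound x) (upper-bound y))))

; Product: take the min and max of all four bound products.
(define (mul-interval x y)
  (let ((p1 (* (lower-bound x) (lower-bound y)))
        (p2 (* (lower-bound x) (upper-bound y)))
        (p3 (* (upper-bound x) (lower-bound y)))
        (p4 (* (upper-bound x) (upper-bound y))))
    (make-interval (min p1 p2 p3 p4) (max p1 p2 p3 p4))))

; Quotient: multiply x by the reciprocal of y. The reciprocal's bounds
; are 1/upper and 1/lower, in that order.
(define (div-interval x y)
  (mul-interval x
                (make-interval (/ 1.0 (upper-bound y))
                               (/ 1.0 (lower-bound y)))))
```

For example, dividing the interval 1 to 2 by the interval 4 to 5 gives roughly 0.2 to 0.5.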

Exercise 2.8 asks us to implement a procedure for computing the difference of two intervals, sub-interval, using reasoning similar to that used to implement add-interval. The smallest possible result of subtracting interval y from interval x comes from subtracting the upper bound of y from the lower bound of x. The largest possible result comes from subtracting the lower bound of y from the upper bound of x.
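That reasoning translates almost directly into code. A minimal sketch, again assuming the pair-based representation of intervals:

```scheme
(define (make-interval a b) (cons a b))
(define (lower-bound i) (car i))
(define (upper-bound i) (cdr i))

; Difference: the smallest result is lower(x) - upper(y),
; the largest result is upper(x) - lower(y).
(define (sub-interval x y)
  (make-interval (- (lower-bound x) (upper-bound y))
                 (- (upper-bound x) (lower-bound y))))

; Example: [1, 2] - [3, 5] spans from 1 - 5 to 2 - 3.
(define difference (sub-interval (make-interval 1 2) (make-interval 3 5)))
```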

Exercise 2.9 defines the width of an interval as half the difference between its upper and lower bounds. For some arithmetic operations the width of the result of combining two intervals is a function only of the widths of the argument intervals, whereas for others the width of the combination is not a function of the widths of the argument intervals. We are to show that the width of the sum (or difference) of two intervals is a function only of the widths of the intervals being added (or subtracted).

I think this is best proven mathematically by showing that the width of the sum of two intervals is the same as the sum of the widths of two intervals. We start with a couple of definitions for interval addition and computing the width of an interval:

[a,b] + [c,d] = [a + c, b + d]
width([x, y]) = (y - x) / 2

Now we can combine these to come up with a formula for the width of the sum of two intervals.
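Substituting the sum into the width definition and regrouping the terms gives:

```latex
\begin{aligned}
\operatorname{width}([a,b] + [c,d])
  &= \operatorname{width}([a + c,\; b + d]) \\
  &= \frac{(b + d) - (a + c)}{2} \\
  &= \frac{b - a}{2} + \frac{d - c}{2} \\
  &= \operatorname{width}([a,b]) + \operatorname{width}([c,d])
\end{aligned}
```

The same regrouping works for subtraction, where $[a,b] - [c,d] = [a - d,\; b - c]$ and the width is again $\frac{b - a}{2} + \frac{d - c}{2}$.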

This is the same formula we derived above. This proves that if we add any two intervals with widths x and y, the width of the resulting interval will always be x + y, no matter what the bounds of the intervals are.

To prove that this is not the case for interval multiplication we can use a simple counterexample.

The intervals b and c have the same width, but when we multiply each of them by interval a, the resulting intervals have different widths. This means that the width of the product of two intervals cannot be a function of only the widths of the operands.
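A counterexample of this shape can be sketched as below. The names a, b, and c match the discussion above, but the specific bounds are hypothetical values chosen so that b and c share a width.

```scheme
(define (make-interval a b) (cons a b))
(define (lower-bound i) (car i))
(define (upper-bound i) (cdr i))
(define (width i) (/ (- (upper-bound i) (lower-bound i)) 2))

(define (mul-interval x y)
  (let ((p1 (* (lower-bound x) (lower-bound y)))
        (p2 (* (lower-bound x) (upper-bound y)))
        (p3 (* (upper-bound x) (lower-bound y)))
        (p4 (* (upper-bound x) (upper-bound y))))
    (make-interval (min p1 p2 p3 p4) (max p1 p2 p3 p4))))

; Hypothetical intervals: b and c both have width 5/2.
(define a (make-interval 2 4))
(define b (make-interval 5 10))
(define c (make-interval 10 15))

; a*b is [10, 40] and a*c is [20, 60], so the product widths differ.
(define width-ab (width (mul-interval a b)))
(define width-ac (width (mul-interval a c)))
```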

Exercise 2.10 points out that it is not clear what it means to divide by an interval that spans zero. We need to modify the provided div-interval routine to check for this condition and signal an error if it occurs.

Before we make the required modifications, let's take a closer look at the div-interval procedure to see what the problem is.
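For reference, here is a sketch of div-interval with one possible form of the requested check added. The helper name spans-zero? is hypothetical, and error is assumed to be available, as it is in most Scheme implementations used for SICP.

```scheme
(define (make-interval a b) (cons a b))
(define (lower-bound i) (car i))
(define (upper-bound i) (cdr i))

(define (mul-interval x y)
  (let ((p1 (* (lower-bound x) (lower-bound y)))
        (p2 (* (lower-bound x) (upper-bound y)))
        (p3 (* (upper-bound x) (lower-bound y)))
        (p4 (* (upper-bound x) (upper-bound y))))
    (make-interval (min p1 p2 p3 p4) (max p1 p2 p3 p4))))

; Hypothetical helper: true when zero lies within the interval,
; including the case where a bound is exactly zero.
(define (spans-zero? i)
  (and (<= (lower-bound i) 0)
       (>= (upper-bound i) 0)))

; div-interval takes the reciprocal of each bound of y, which is
; undefined or changes sign when y contains zero.
(define (div-interval x y)
  (if (spans-zero? y)
      (error "Cannot divide by an interval that spans zero" y)
      (mul-interval x
                    (make-interval (/ 1.0 (upper-bound y))
                                   (/ 1.0 (lower-bound y))))))
```

Treating a bound of exactly zero as an error is a design choice in this sketch, since the reciprocal of that bound would be a division by zero.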

Exercise 2.11 suggests that by testing the signs of the endpoints of the intervals, we can break mul-interval into nine cases, only one of which requires more than two multiplications. We are to rewrite mul-interval using this suggestion.

The suggestion is based on the result of multiplying two numbers with the same or opposite signs. For each interval there are three possibilities: both bounds are positive, both are negative, or the signs are opposite. (Note that an interval with the signs [+, -] is not allowed, since the lower bound would be higher than the upper bound.) Since there are two intervals to check, that makes nine possibilities. All nine possibilities are listed below.

[+, +] * [+, +]
[+, +] * [-, +]
[+, +] * [-, -]

[-, +] * [+, +]
[-, +] * [-, +]
[-, +] * [-, -]

[-, -] * [+, +]
[-, -] * [-, +]
[-, -] * [-, -]

For most of the combinations above, we can see directly which pairs need to be multiplied to form the resulting interval. For example, if all values are positive, then the product of the two lower bounds and the product of the two upper bounds are the only two products we need to find. The only case where we need to do more than two multiplications is [-, +] * [-, +], since the product of the two lower bounds could be larger than the product of the two upper bounds, and either of the two mixed products could be the minimum.

Note that the following code is much less readable and would be harder to maintain than the original procedure. In addition to that, without benchmarking both procedures we aren't even sure which one is faster. The trade-off in maintainability certainly doesn't seem to be worth any potential savings you might get from eliminating two multiplications from most cases. The developer who suggested this enhancement should probably have their commit access revoked (and be shot out of a cannon).
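One way the nine-case version could be sketched is shown below. This is an illustration rather than the original listing: the local names xl, xu, yl, and yu are hypothetical, and a bound of exactly zero is treated as belonging to the positive (or negative) case, which still gives correct results.

```scheme
(define (make-interval a b) (cons a b))
(define (lower-bound i) (car i))
(define (upper-bound i) (cdr i))

(define (mul-interval x y)
  (let ((xl (lower-bound x)) (xu (upper-bound x))
        (yl (lower-bound y)) (yu (upper-bound y)))
    (cond ((>= xl 0)                                  ; x is [+, +]
           (cond ((>= yl 0) (make-interval (* xl yl) (* xu yu)))   ; y [+, +]
                 ((<= yu 0) (make-interval (* xu yl) (* xl yu)))   ; y [-, -]
                 (else      (make-interval (* xu yl) (* xu yu))))) ; y [-, +]
          ((<= xu 0)                                  ; x is [-, -]
           (cond ((>= yl 0) (make-interval (* xl yu) (* xu yl)))
                 ((<= yu 0) (make-interval (* xu yu) (* xl yl)))
                 (else      (make-interval (* xl yu) (* xl yl)))))
          (else                                       ; x is [-, +]
           (cond ((>= yl 0) (make-interval (* xl yu) (* xu yu)))
                 ((<= yu 0) (make-interval (* xu yl) (* xl yl)))
                 ; Both intervals span zero: four products are needed.
                 (else (make-interval (min (* xl yu) (* xu yl))
                                      (max (* xl yl) (* xu yu)))))))))
```

A reasonable sanity check is to compare this against the original four-product mul-interval on one sample pair from each of the nine cases.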