Saturday, December 26, 2009

Math visualization: (x + 1)^2

You may remember this from high-school algebra (or perhaps earlier for some). Expand (x + 1)^2 using the FOIL method.

(x + 1)^2 = (x + 1)(x + 1)
= x^2 + x + x + 1
= x^2 + 2x + 1

A neat way to visualize this equality, and hopefully help remember the factorization of the resulting polynomial, is to look at how the pieces fit together.


When they're arranged in this way, it's easy to see that the pieces squeeze together to form a larger square that has a length of (x + 1) on a side, proving the equality.
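
We can also spot-check the identity numerically. Here's a minimal sketch in Scheme (my own addition, not from the post) that compares both sides for a handful of sample values:

; Compare (x + 1)^2 against x^2 + 2x + 1 for sample values.
(define (lhs x) (* (+ x 1) (+ x 1)))
(define (rhs x) (+ (* x x) (* 2 x) 1))

; Evaluates to (#t #t #t #t #t #t) since the identity holds for each sample.
(map (lambda (x) (= (lhs x) (rhs x)))
     '(0 1 2 3 10 -7))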


Related posts:

Six Visual Proofs
Multiplying Two Binomials

Thursday, December 24, 2009

SICP Exercise 1.15: Calculating sines

From SICP section 1.2.3 Orders of Growth

Exercise 1.15 shows an interesting procedure for computing the sine of an angle (in radians) by combined use of the approximation sin x ≈ x (if x is sufficiently small) and the trigonometric identity

sin r = 3 * sin(r/3) - 4 * sin^3(r/3)

which is used to reduce the argument of sin. The code provided implements the computation:
(define (cube x) (* x x x))

(define (p x) (- (* 3 x) (* 4 (cube x))))

(define (sine angle)
  (if (not (> (abs angle) 0.1))
      angle
      (p (sine (/ angle 3.0)))))

The exercise goes on to ask the following two questions:

a. How many times is the procedure p applied when (sine 12.15) is evaluated?

b. What is the order of growth in space and number of steps (as a function of a) used by the process generated by the sine procedure when (sine a) is evaluated?

Looking at how the process executes will help us answer both questions.
(sine 12.15)
(p (sine 4.05))
(p (p (sine 1.35)))
(p (p (p (sine 0.45))))
(p (p (p (p (sine 0.15)))))
(p (p (p (p (p (sine 0.05))))))
(p (p (p (p (p 0.05)))))
(p (p (p (p 0.1495))))
(p (p (p 0.4351345505)))
(p (p 0.9758465331678772))
(p -0.7895631144708228)
-0.39980345741334

As long as the angle stays greater than 0.1, the process keeps recursing. By the time the angle drops below that threshold, five calls to the procedure p have been deferred, so p is applied five times when (sine 12.15) is evaluated.
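
We can confirm the count by instrumenting the code. Here's a minimal sketch (my own, using set! to keep a counter; counted-p and counted-sine are names I made up) that tallies the applications of p:

(define p-count 0)

(define (counted-p x)
  (set! p-count (+ p-count 1))          ; count each application of p
  (- (* 3 x) (* 4 (cube x))))

(define (counted-sine angle)
  (if (not (> (abs angle) 0.1))
      angle
      (counted-p (counted-sine (/ angle 3.0)))))

Evaluating (counted-sine 12.15) leaves p-count at 5, matching the trace above.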

Complexity

The sine procedure is recursive, but it makes only a single call to itself, so it isn't tree recursive and its complexity isn't nearly as bad as some of the procedures we've seen in previous exercises. The complexity for this procedure in both space and number of steps is actually better than linear.

Space - Only the calls to the procedure p are deferred, and the input angle must be tripled before one additional deferred call is added. Growing by a constant amount each time the input triples is logarithmic growth.

Number of steps - Likewise, we can triple the value of the starting input angle and only one more step will be required by this procedure.

The complexity of both the space and the number of steps required by this procedure can be described as O(log n).


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Tuesday, December 22, 2009

SICP Exercise 1.14: Counting change

From SICP section 1.2.3 Orders of Growth

Exercise 1.14 asks us to draw the tree illustrating the process generated by the count-change procedure presented in section 1.2.2 in making change for 11 cents.
(define (count-change amount)
  (cc amount 5))

(define (cc amount kinds-of-coins)
  (cond ((= amount 0) 1)
        ((or (< amount 0) (= kinds-of-coins 0)) 0)
        (else (+ (cc amount
                     (- kinds-of-coins 1))
                 (cc (- amount
                        (first-denomination kinds-of-coins))
                     kinds-of-coins)))))

(define (first-denomination kinds-of-coins)
  (cond ((= kinds-of-coins 1) 1)
        ((= kinds-of-coins 2) 5)
        ((= kinds-of-coins 3) 10)
        ((= kinds-of-coins 4) 25)
        ((= kinds-of-coins 5) 50)))

Running the code with the specified input shows us how many ways there are to make change for 11 cents with five kinds of coins.
> (count-change 11)
4
>

A quick scan of the code shows that the cc procedure contains two recursive calls to itself, resulting in the tree-shaped process structure mentioned in the question. Leaf nodes of the tree are reached when the base cases are met, that is when the amount is less than or equal to zero, or when the kinds of coins remaining reaches zero.

When a node splits, its left-hand child is passed the same amount and one fewer kinds of coins. The right-hand child is passed an amount that is reduced by the highest-denomination coin available to its parent node, and the same number of kinds of coins.


I've color-coded the nodes of the tree. White nodes are leaf nodes that evaluate to 0; blue nodes are leaf nodes that evaluate to 1 and contribute to the final answer. The yellow nodes are those that branch recursively.


Orders of growth

The exercise also asks us to find the orders of growth of the space and number of steps used by the count-change process as the amount to be changed increases. (Note that we are not really concerned with the orders of growth as the number of kinds of coins grows, only the amount to be changed.)

Space - As we saw in the Fibonacci procedure in section 1.2.2, the space required by the count-change procedure grows linearly with the input. The space required is proportional to the maximum depth of the tree. At any point in the computation we only need to keep track of the nodes above the current node.

Since the height of the tree is proportional to the amount to be changed, the order of growth of the space required by the count-change procedure is O(n).

Number of steps - The diagram illustrates that this procedure, much like the recursive procedure for computing Fibonacci numbers that we saw before, is not very efficient. Much of the computation being done is repeated.

If we look closely at a call to cc where kinds-of-coins is equal to one, we can see each call generates two more calls until we reach a terminal node.


If we define the function T(n, k) to be the number of calls to cc generated by calling (cc n k), where n is the amount and k is the number of kinds of coins, then

T(n, 1) = 2n + 1

or in Big-O notation,

T(n, 1) = O(n)
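
We can check this formula empirically. The sketch below is my own instrumentation, not part of the exercise (counted-cc and count-cc-calls are names I made up); it wraps cc with a mutable counter:

(define cc-calls 0)

(define (counted-cc amount kinds-of-coins)
  (set! cc-calls (+ cc-calls 1))        ; count every call, including leaves
  (cond ((= amount 0) 1)
        ((or (< amount 0) (= kinds-of-coins 0)) 0)
        (else (+ (counted-cc amount
                             (- kinds-of-coins 1))
                 (counted-cc (- amount
                                (first-denomination kinds-of-coins))
                             kinds-of-coins)))))

(define (count-cc-calls amount kinds)
  (set! cc-calls 0)
  (counted-cc amount kinds)
  cc-calls)

For example, (count-cc-calls 11 1) returns 23, which is 2(11) + 1 as predicted.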

Now let's take a look at a partial graph of a call to (cc 50 2).


There are two interesting things to point out here. First, there are n/5 calls to cc made where k = 2 kinds of coins (look down the right-hand side of the diagram above). This is because each call is made with 5 cents less in the amount argument until a terminal node is reached. The second interesting thing is that each of these n/5 calls to (cc n 2) generates an entire (cc n 1) subtree.

That means that

T(n, 2) = (n/5) * (2n + 1)

and because of the multiplication of terms

T(n, 2) = O(n^2)

Similarly, if we look at the graph of (cc 50 3) we can see n/10 calls to cc where k = 3, each of which generates its own (cc n 2) subtree.


From this we find that

T(n, 3) = O(n^3)

Finally, it's easy enough to show that we'll get the same pattern when k = 4 and k = 5. The n/50 nodes generated when k = 5 each generate n/25 nodes with k = 4, each of which generates its own subtree where k = 3.

And so the final answer we arrive at for the time complexity of the count-change procedure with five kinds of coins is

T(n, 5) = O(n^5)


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Friday, December 11, 2009

Unsolved: Goldbach conjecture

The Goldbach conjecture is one of the oldest and easiest-to-understand math problems that remains unsolved. The problem was originally posed to Leonhard Euler in a letter from amateur mathematician Christian Goldbach in 1742. The original form of the conjecture, now known as the weak Goldbach conjecture says:
Every whole number greater than five is the sum of three prime numbers.

Euler restated the problem in an equivalent form, the strong Goldbach conjecture (or just the Goldbach conjecture):
Every even number greater than two is the sum of two primes.

So,
4 = 2 + 2
6 = 3 + 3
8 = 3 + 5
10 = 5 + 5
12 = 5 + 7
...
100 = 47 + 53
...
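
It's easy to search for these decompositions with a naive checker. Here's a minimal sketch (my own addition, using trial-division primality; prime? and goldbach-pair are names I made up) that returns the first prime pair summing to an even n, or #f if none exists:

(define (prime? n)
  (define (try d)
    (cond ((> (* d d) n) #t)            ; no divisor up to the square root
          ((= 0 (remainder n d)) #f)
          (else (try (+ d 1)))))
  (and (> n 1) (try 2)))

(define (goldbach-pair n)
  (define (search p)
    (cond ((> p (quotient n 2)) #f)     ; checked all pairs, none found
          ((and (prime? p) (prime? (- n p))) (list p (- n p)))
          (else (search (+ p 1)))))
  (search 2))

(goldbach-pair 100) returns (3 97); the pair 47 + 53 shown above is just one of several valid decompositions of 100.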


Progress

In 1966, Chinese mathematician Chén Jǐngrùn showed that every sufficiently large even integer is the sum of a prime and a semiprime (a number that has at most two prime factors).

Brute force methods have been used to show that the Goldbach conjecture is true for even integers up to about 1.5 × 10^18* (1,500,000,000,000,000,000, or 1.5 billion billion).


Notes:


* Brute force results are as of July 24, 2009. See Goldbach conjecture verification for up-to-date results.

Wednesday, December 9, 2009

SICP Exercise 1.13: Fibonacci and the Golden ratio

From SICP section 1.2.2 Tree Recursion

Exercise 1.13 asks us to prove that Fib(n) is the closest integer to φ^n/√5, where

φ = (1 + √5)/2.

The exercise also gives us the following hint:
Let ψ = (1 - √5)/2. Use induction and the definition of Fibonacci numbers to prove that

Fib(n) = (φ^n - ψ^n) / √5

Some may recognize that this question is closely related to the closed form expression for Fibonacci numbers, also known as Binet's formula.

You may also recognize the quantity (1 + √5)/2, denoted by the Greek letter φ (phi), as the Golden ratio.

This mathematical constant has many interesting properties, a couple of which will be useful in our proof.
φ^2 = φ + 1
1/φ + 1 = φ

The second constant introduced in the hint (1 - √5)/2, denoted by the Greek letter ψ (psi), shares the same properties.

ψ^2 = ψ + 1
1/ψ + 1 = ψ


Before tackling the problem, let's take a look at the sequences in question to try and get a feel for what we're trying to prove. It's safe to assume that everyone is familiar with the Fibonacci sequence:

0, 1, 1, 2, 3, 5, 8, 13, 21, ...

Each number is the sum of the preceding two.

We can use the following procedures to compute φ^n/√5 for several values of n.
(define phi
  (/ (+ 1 (sqrt 5)) 2))

(define (^ base exponent)
  (define (*^ exponent acc)
    (if (= exponent 0)
        acc
        (*^ (- exponent 1) (* acc base))))
  (*^ exponent 1))

(define (f n)
  (/ (^ phi n) (sqrt 5)))

The following output shows that this function does track closely with the Fibonacci sequence.
> (f 0)
0.4472135954999579
> (f 1)
0.7236067977499789
> (f 2)
1.1708203932499368
> (f 3)
1.8944271909999157
> (f 4)
3.0652475842498528
> (f 5)
4.959674775249769
> (f 6)
8.024922359499621
> (f 7)
12.984597134749391
> (f 8)
21.009519494249012

Each term is within 1/2 of the corresponding term of Fib(n), which is what we're asked to prove. This is only a small amount of empirical evidence, a far cry from proof, but it does show that what we're asked to prove is at least reasonable. (It's always a good idea to check that an assertion is plausible before you go about trying to prove it.)
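
In fact, if rounding φ^n/√5 to the nearest integer always gives Fib(n), we can turn the claim itself into code. A quick sketch (my own addition, reusing phi and ^ from above; fib-binet is a name I made up):

(define (fib-binet n)
  (inexact->exact (round (/ (^ phi n) (sqrt 5)))))

(map fib-binet '(0 1 2 3 4 5 6 7 8)) evaluates to (0 1 1 2 3 5 8 13 21), the start of the Fibonacci sequence.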

The inductive proof

As the hint recommends, let's start by proving inductively that

Fib(n) = (φ^n - ψ^n) / √5

For the base cases we'll show that the left-hand side of the equation above is equal to the right-hand side for n = 0, n = 1, and n = 2.

On the left-hand side we have:

Fib(0) = 0
Fib(1) = 1
Fib(2) = Fib(1) + Fib(0)
= 1 + 0
= 1

On the right-hand side we have:

for n = 0
(φ^n - ψ^n) / √5 = (φ^0 - ψ^0) / √5
= (1 - 1) / √5
= 0 / √5
= 0

for n = 1
(φ^n - ψ^n) / √5 = (φ^1 - ψ^1) / √5
= (((1 + √5)/2) - ((1 - √5)/2)) / √5
= ((1/2 + √5/2) - (1/2 - √5/2)) / √5
= (1/2 + √5/2 - 1/2 + √5/2) / √5
= (√5/2 + √5/2) / √5
= (2 * √5/2) / √5
= √5 / √5
= 1

for n = 2
(φ^n - ψ^n) / √5 = (φ^2 - ψ^2) / √5
= ((φ + 1) - (ψ + 1)) / √5
= (((1 + √5)/2 + 1) - ((1 - √5)/2 + 1)) / √5
= ((1 + √5)/2 + 1 - (1 - √5)/2 - 1) / √5
= ((1 + √5)/2 - (1 - √5)/2) / √5
= ((1 + √5) - (1 - √5)) / 2 / √5
= (1 + √5 - 1 + √5) / 2 / √5
= (√5 + √5) / 2 / √5
= (2 * √5) / 2 / √5
= √5 / √5
= 1

So far, so good. Now for the inductive step. If we assume that both of the following are true:

Fib(n) = (φ^n - ψ^n) / √5
Fib(n-1) = (φ^(n-1) - ψ^(n-1)) / √5

does it follow that

Fib(n+1) = (φ^(n+1) - ψ^(n+1)) / √5

is true also? Let's find out!

We'll start from the defining recurrence relation of the Fibonacci sequence and see if the two assumptions above can lead us to the correct conclusion. Remember that we also have the properties of φ and ψ that were introduced at the beginning at our disposal.

Fib(n+1) = Fib(n) + Fib(n-1)
= (φ^n - ψ^n) / √5 + (φ^(n-1) - ψ^(n-1)) / √5
= ((φ^n - ψ^n) + (φ^(n-1) - ψ^(n-1))) / √5
= (φ^n - ψ^n + φ^(n-1) - ψ^(n-1)) / √5
= (φ^n + φ^(n-1) - ψ^n - ψ^(n-1)) / √5
= ((φ^n + φ^(n-1)) - (ψ^n + ψ^(n-1))) / √5
= (φ^(n+1) * (φ^(-1) + φ^(-2)) - ψ^(n+1) * (ψ^(-1) + ψ^(-2))) / √5
= (φ^(n+1) * φ^(-1) * (1 + φ^(-1)) - ψ^(n+1) * ψ^(-1) * (1 + ψ^(-1))) / √5
= (φ^(n+1) * 1/φ * (1 + 1/φ) - ψ^(n+1) * 1/ψ * (1 + 1/ψ)) / √5
= (φ^(n+1) * 1/φ * φ - ψ^(n+1) * 1/ψ * ψ) / √5
= (φ^(n+1) - ψ^(n+1)) / √5

In the second-to-last step of the transformation above I used the following properties of φ and ψ to do substitutions:

1/φ + 1 = φ
1/ψ + 1 = ψ

This concludes the inductive proof from the hint, but where does that leave us? How does that help us prove that Fib(n) is the closest integer to φn/√5?

One proof leads to another

Let's rearrange the equation just a bit before continuing on.

Fib(n) = (φ^n - ψ^n) / √5
Fib(n) = φ^n/√5 - ψ^n/√5
Fib(n) - φ^n/√5 = -ψ^n/√5
|Fib(n) - φ^n/√5| = |ψ|^n/√5

I did this bit of manipulation because we're trying to prove something about the relationship between Fib(n) and φ^n/√5. Specifically, we're trying to prove that the difference between them is always less than 1/2. In its current form, all that we have left to prove is that

|ψ|^n/√5 ≤ 1/2
or
|ψ|^n ≤ √5/2

Since ψ is defined to be (1 - √5)/2, we can simply evaluate it to find that

ψ = -0.6180339887...

Since the n in Fib(n) is always a non-negative integer, and

|ψ| < 1

we know that

|ψ|^n ≤ 1

for all non-negative integer values of n.

We can also simply evaluate √5/2.

√5/2 = 1.118...

Since

|ψ|^n ≤ 1
and
√5/2 > 1

it's pretty clear that

|ψ|^n ≤ √5/2

and therefore

Fib(n) is the closest integer to φ^n/√5.

QED


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Wednesday, November 25, 2009

SICP Exercise 1.12: Pascal's Triangle

From SICP section 1.2.2 Tree Recursion

Exercise 1.12 asks us to write a procedure that computes the elements of Pascal's triangle by means of a recursive process.



The numbers along the left and right edges of the triangle are all 1, and each number inside the triangle is the sum of the two numbers directly above it.

We'll design our procedure to take the row and column numbers as arguments, and return the value of the element in that position.
(pascal row col)

We start counting both rows and columns at 0, as pictured in the diagram. So the procedure call
(pascal 4 2)
should return a value of 6, since that is the value in the 4th row and 2nd column.

The first thing we should note is that the nth row of the triangle contains exactly n + 1 entries, in columns 0 through n. This means that we shouldn't expect to call our procedure with a column argument greater than the row argument. So, the following call would be expected to return nonsense information:
(pascal 3 4)

We should also note that the 0th and nth column of any row should return 1. This is enough to get us started.
(define (pascal row col)
  (cond ((< row col) #f)
        ((or (= 0 col) (= row col)) 1)
        (else #t)))

Here we're just defining the initial logic of the procedure. If the row argument is less than the column, we simply return #f (probably not the best error message I've ever written, but it's not the focus of the exercise). If the column is the 0th or nth column, we return 1, otherwise we return a value of #t. This last value is just a placeholder which we'll need to replace with a calculation of the correct Pascal number.

To finish up the procedure, we need to recursively calculate values from the interior of the triangle. An interior value is defined as the sum of the two values directly above it in the triangle. So, for example, to calculate Pascal(4, 2) we would sum Pascal(3, 1) and Pascal(3, 2). Each of these values (since they are both interior values) would be calculated in the same manner, but the recursive property of our procedure will take care of that for us.




The two values to sum in order to calculate Pascal(row, column) are always going to be the value in the previous row and same column, Pascal(row-1, column), and the value in the previous row and previous column, Pascal(row-1, column-1). Translating this information to Scheme and adding it to our earlier procedure, we get our final answer to the exercise:
(define (pascal row col)
  (cond ((< row col) #f)
        ((or (= 0 col) (= row col)) 1)
        (else (+ (pascal (- row 1) col)
                 (pascal (- row 1) (- col 1))))))

Try several examples from the Pascal's triangle illustration above to verify that the results are correct. Note that since this procedure generates a tree-recursive process, calculating larger values can be extremely time consuming. It takes a noticeable amount of time (a few seconds) on my machine to calculate values in the 25th row.
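
If the slowness bothers you, the standard fix is to cache results so each position is computed only once. Here's a sketch of that idea — my own addition, assuming PLT Scheme's mutable hash tables (make-hash, hash-ref, hash-set!); pascal-memo and pascal-cache are names I made up:

(define pascal-cache (make-hash))

(define (pascal-memo row col)
  (cond ((< row col) #f)
        ((or (= 0 col) (= row col)) 1)
        (else
         (hash-ref pascal-cache (list row col)
                   (lambda ()                  ; on a cache miss, compute and store
                     (let ((value (+ (pascal-memo (- row 1) col)
                                     (pascal-memo (- row 1) (- col 1)))))
                       (hash-set! pascal-cache (list row col) value)
                       value))))))

With the cache in place, values deep in the triangle come back immediately.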

For more fun with Pascal's triangle see Pascal's Triangle and its Patterns.


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Saturday, November 21, 2009

Unsolved: Collatz conjecture

First proposed in 1937 by the German mathematician Lothar Collatz, the Collatz conjecture is also known as the 3n + 1 problem, Kakutani's problem, the Syracuse problem, Thwaites' conjecture, and Ulam's conjecture. The conjecture starts with the following simple procedure:
  1. Let n be any natural number (n > 0).
  2. If n is odd, triple it and add 1 (n = 3n + 1).
  3. If n is even, divide it in half (n = n/2).
  4. Stop if n = 1, otherwise go back to step 2.
The Collatz conjecture asks:
Does the above process always terminate (end with n = 1) for any starting value of n?
For example, starting with a value of n = 42, we get the following sequence

{ 42, 21, 64, 32, 16, 8, 4, 2, 1 }

The sequences produced by the Collatz procedure are known as hailstone sequences.
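
Generating hailstone sequences in Scheme is a one-procedure job. A minimal sketch (my own addition, using standard Scheme; hailstone is a name I made up):

(define (hailstone n)
  (cond ((= n 1) '(1))                          ; stop when we reach 1
        ((even? n) (cons n (hailstone (quotient n 2))))
        (else (cons n (hailstone (+ (* 3 n) 1))))))

(hailstone 42) evaluates to (42 21 64 32 16 8 4 2 1), the sequence above.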

The conjecture remains unanswered, although it has been verified that the process terminates for all values of n up to 5.764 × 10^18. In 1972, John Conway proved that problems of this general type can be undecidable, so it is not known whether a proof is even possible.

Wednesday, November 18, 2009

SICP Exercise 1.11

From SICP section 1.2.2 Tree Recursion

Exercise 1.11 gives us a function defined by the rules that

f(n) = n if n < 3
f(n) = f(n - 1) + 2f(n - 2) + 3f(n - 3) if n ≥ 3

and asks us to write both recursive and iterative procedures for the function.

Since the function is defined recursively, it's an easy matter of straight translation into Scheme to come up with the recursive definition.
(define (f n)
  (if (< n 3)
      n
      (+ (f (- n 1))
         (* 2 (f (- n 2)))
         (* 3 (f (- n 3))))))

Here are the results of several sample runs:
> (f 0)
0
> (f 1)
1
> (f 2)
2
> (f 3)
4
> (f 4)
11
> (f 5)
25
> (f 6)
59
> (f 7)
142

The procedure above generates a recursive process because the calculation of some of the state information is deferred until a later call of the procedure. Defining a procedure that generates an iterative process means we have to remove this property. That means the iterative version has to keep track of all state information at each step of the calculation in explicit variables.

We can draw inspiration from the iterative definition of the Fibonacci procedure from the text.
(define (fib n)
  (fib-iter 1 0 n))

(define (fib-iter a b count)
  (if (= count 0)
      b
      (fib-iter (+ a b) a (- count 1))))

Here the fib procedure makes a call to the fib-iter procedure with initial values passed in for the first two parameters, along with a counter. We can use the same pattern for our iterative definition of f and f-iter.
(define (f n)
  (if (< n 3)
      n
      (f-iter 2 1 0 n)))

(define (f-iter a b c count)
  (if (< count 3)
      a
      (f-iter (+ a (* 2 b) (* 3 c))
              a
              b
              (- count 1))))

In this definition of (f n), we start with the same check as before. If n is less than 3, then f(n) = n, so we need go no further. Otherwise, we iterate for as long as necessary.

f-iter takes three initial values a, b, and c, plus a counter. When count drops below 3, a will be equal to f(n), b to f(n-1), and c to f(n-2). Each time we iterate, the value of a is recalculated, the old values of a and b are passed into the procedure as the arguments for b and c, and the counter is decremented by 1.
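
To make the state threading concrete, here's how the process for (f 5) unfolds (each line shows the current values of a, b, c, and count):

(f 5)
(f-iter 2 1 0 5)
(f-iter 4 2 1 4)
(f-iter 11 4 2 3)
(f-iter 25 11 4 2)
25

Notice that no operations are deferred; every line carries the complete state of the computation.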

You can copy and paste the code above into your Scheme interpreter to verify that we get the same results as the sample runs above.


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Wednesday, November 11, 2009

Russian Roulette

Here's a nice little probability problem that I first saw in David Darling's Universal Book of Mathematics:

You're in a game of Russian Roulette (with only one other player), but instead of one bullet the gun (a six-shooter revolver) is loaded with three bullets in three consecutive chambers. The barrel is spun only once. Each player points the gun at his head and pulls the trigger. If he is still alive, the gun is passed to the other player. The game ends when one player dies. Would you stand a better chance of surviving if you shot first or second, or does it make a difference?

This puzzle, or a variation of it, is often used as a brain teaser interview question. To avoid spoiling it for anyone, I'll add the solution as a comment to this post (so don't scroll down until you think you've worked it out).

Tuesday, November 10, 2009

Unsolved: The Perfect Cuboid Problem

In Tomorrow's Math, C. Stanley Ogilvy asks
Does there exist a rectangular parallelepiped (box) all of whose edges and face diagonals are of integral length, and the length of whose main diagonal is also an integer?
Put another way, can you construct a rectangular solid where all three edge lengths, all three face diagonals, and the long space diagonal have integer values (or prove that such a box cannot be constructed)?



This deceptively simple-sounding problem is known as the Perfect Cuboid Problem (among other names). It is closely related to the problem of finding an Euler Brick, which has many solutions, the smallest of which is a box with edges 125, 244, and 267.

No perfect cuboids were found during an exhaustive search of the integers up to 10 billion (10^10). This lends evidence (but is far from a proof) to the suspicion that no perfect cuboids exist. One near miss is the cuboid that has edges with lengths 44, 117, and 240, and face diagonals with lengths 125, 244, and 267.
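
We can verify the near miss with a few lines of Scheme (a sketch of my own, not part of the source; perfect-square?, euler-brick?, and perfect-cuboid? are names I made up):

(define (perfect-square? n)
  (let ((r (inexact->exact (round (sqrt n)))))
    (= (* r r) n)))

(define (euler-brick? a b c)
  (and (perfect-square? (+ (* a a) (* b b)))    ; three face diagonals
       (perfect-square? (+ (* b b) (* c c)))
       (perfect-square? (+ (* a a) (* c c)))))

(define (perfect-cuboid? a b c)
  (and (euler-brick? a b c)
       (perfect-square? (+ (* a a) (* b b) (* c c))))) ; plus the space diagonal

(euler-brick? 44 117 240) returns #t, but (perfect-cuboid? 44 117 240) returns #f: the space diagonal √73225 is not an integer.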

Sunday, November 8, 2009

SICP Exercises 1.9 and 1.10

Exercises from section 1.2.1 Linear Recursion and Iteration

Exercise 1.9 asks us to use the substitution model to illustrate the processes generated by the following two procedures:
(define (+ a b)
   (if (= a 0)
       b
       (inc (+ (dec a) b))))

(define (+ a b)
   (if (= a 0)
       b
       (+ (dec a) (inc b))))
You may recognize these procedures as the two Peano Arithmetic examples from Lecture 1B. As we saw in the lecture video, the first procedure produces a linear recursive process, while the second produces an iterative process (the two procedures are given here in the reverse order from the lecture).

Remember also from that lecture that the steps for evaluating an application are:
  • Evaluate the operator to get procedure
  • Evaluate the operands to get arguments
  • Apply the procedure to the arguments
  • Copy the body of the procedure substituting the arguments supplied for the formal parameters of the procedure.
  • Evaluate the resulting new body.

If we are to evaluate (+ 4 5) using the first procedure we'll see that the increment operations are deferred.
(+ 4 5)
Evaluate the operator to get the procedure
(if (= a 0)
    b
    (inc (+ (dec a) b)))

Evaluate the operands to get arguments
(if (= 4 0)
    5
    (inc (+ (dec 4) 5)))

Since 4 is in fact not equal to 0, this expression evaluates to
(inc (+ (dec 4) 5))

This is where things begin to get interesting. We can evaluate (dec 4), but then we need to evaluate the recursive call to + before we're able to evaluate the inc operation. That means the inc operation is deferred until we return from the recursive call.

Greatly simplified, the expansion of this expression proceeds as follows:
(inc (+ 3 5))
(inc (inc (+ (dec 3) 5)))
(inc (inc (+ 2 5)))
(inc (inc (inc (+ (dec 2) 5))))
(inc (inc (inc (+ 1 5))))
(inc (inc (inc (inc (+ (dec 1) 5)))))
(inc (inc (inc (inc (+ 0 5)))))

At this point, the innermost expression, (+ 0 5), can be evaluated because the first operand is 0, which is the base case of the recursive definition. The process will now contract until we have our answer.
(inc (inc (inc (inc 5))))
(inc (inc (inc 6)))
(inc (inc 7))
(inc 8)
9

Evaluating (+ 4 5) with the iterative procedure is much more straightforward, as we should expect. No operations are deferred, so we don't see the expand-and-contract process shape that is the signature of a recursive process.
(+ 4 5)
(+ (dec 4) (inc 5))
(+ 3 6)
(+ (dec 3) (inc 6))
(+ 2 7)
(+ (dec 2) (inc 7))
(+ 1 8)
(+ (dec 1) (inc 8))
(+ 0 9)
9


Exercise 1.10 gives the following definition for Ackermann's function (note that this is a different definition than the one given in the Wikipedia entry for the Ackermann function. The analysis that follows is for the procedure given in the text.):
(define (A x y)
   (cond ((= y 0) 0)
         ((= x 0) (* 2 y))
         ((= y 1) 2)
         (else (A (- x 1)
                  (A x (- y 1))))))

The next part of the exercise asks us to evaluate a few expressions that use the function. Our Scheme interpreter is the best tool for this job.
> (A 1 10)
1024
> (A 2 4)
65536
> (A 3 3)
65536

Next, we're asked to give concise mathematical definitions for the following functions defined in terms of A:
(define (f n) (A 0 n))

(define (g n) (A 1 n))

(define (h n) (A 2 n))

I'll take them one at a time. If we plug a few values in for (f n) we can see how the function grows.
> (f 0)
0
> (f 1)
2
> (f 2)
4
> (f 3)
6
> (f 4)
8

It's pretty clear from these results that

f(n) = 2n

which is what we should expect from examining the given procedures.
(define (f n) (A 0 n))
(f n) just calls A with the x argument set to 0. The procedure for A has a condition defined just for this input.
((= x 0) (* 2 y))

Next, plugging in a few values for (g n) we get:
> (g 0)
0
> (g 1)
2
> (g 2)
4
> (g 3)
8
> (g 4)
16

This sequence is growing exponentially.

g(n) = 2^n, when n > 0
g(n) = 0, when n = 0


Finally, plugging in values for (h n) we get:
> (h 0)
0
> (h 1)
2
> (h 2)
4
> (h 3)
16
> (h 4)
65536

This one had me scratching my head for a few minutes. If you evaluate (h 5) you'll see that it's a number so large that I won't even try to describe it. I didn't see a pattern in these results until I wrote them out in a slightly different way.

h(0) = 0
h(1) = 2 = 2^1
h(2) = 4 = 2^2
h(3) = 16 = 2^4
h(4) = 65536 = 2^16

From these results I noticed that

h(n) = 2^h(n-1), when n > 1
h(1) = 2
h(0) = 0
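
In other words, h computes a tower of n 2s (iterated exponentiation). We can express the recurrence directly — a sketch of my own using Scheme's expt (h-closed is a name I made up):

(define (h-closed n)
  (cond ((= n 0) 0)
        ((= n 1) 2)
        (else (expt 2 (h-closed (- n 1))))))  ; h(n) = 2^h(n-1)

(h-closed 4) returns 65536, matching (h 4) above.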


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Monday, November 2, 2009

SICP - Notes from Lecture 1B

Procedures and Processes:
Substitution Model
Covers Text Section 1.2

You can download the video lecture or stream it on MIT's OpenCourseWare site. It's also available on YouTube.

This lecture is presented by MIT’s Gerald Jay Sussman.

Part 1

This lecture concentrates on understanding how patterns of procedures and expressions cause particular behaviors and patterns of execution. In particular, we’re going to look at how different patterns of writing procedures can be used to generate iterative and recursive processes. This was touched on in the previous lecture, but will be explored in much more depth here.

The substitution model is illustrated using the following definition for computing the sum of squares:
(define (sos x y)
  (+ (sq x) (sq y)))

(define (sq x)
  (* x x))

Substitution Rule

To evaluate an application:
  • Evaluate the operator to get procedure
  • Evaluate the operands to get arguments
  • Apply the procedure to the arguments
  • Copy the body of the procedure substituting the arguments supplied for the formal parameters of the procedure.
  • Evaluate the resulting new body.

When evaluating the operands, it doesn't really matter what order you do it in.

Key Quotes:
The key to understanding complicated things is to know what not to look at, and what not to compute, and what not to think.

One of the things we have to learn how to do is ignore details.

Using the substitution rule (which is stressed to be an inaccurate model that we will refine later) and ignoring as much irrelevant detail as possible, we can show how the sum of squares is calculated.

(sos 3 4)
(+ (sq 3) (sq 4))
(+ (sq 3) (* 4 4))
(+ (sq 3) 16)
(+ (* 3 3) 16)
(+ 9 16)
25

In the first step we evaluate the sos operator to get its procedure. We substitute the values of the parameters into the definition. We can treat the + operator as primitive, so there’s no need to evaluate it. Next we need to evaluate the two sq operators. It doesn’t matter which order we evaluate them in. Each of these evaluations follows the same rule. We evaluate the operators to get the sq procedures, substituting the values of the arguments. At each step, we evaluate an expression as soon as it is reduced to primitives. Remember from the last lecture that this is called applicative-order evaluation. (This is in contrast to normal-order evaluation, where we would do the substitution of the expressions for the operands inside the body first. "Fully expand and then reduce.")


The IF Expression – A special form

One exception to our evaluation rules is the IF expression.

To evaluate an IF expression
  • Evaluate the predicate expression
  • If it yields TRUE
  • evaluate the consequent expression
  • otherwise
  • evaluate the alternative expression

The key here is that only one of the consequent or alternative expressions will ever be evaluated. This is different from the normal evaluation rule, which is why conditionals are termed a special form.

Peano Arithmetic (part 1)

The following procedure is given as an example of one way that addition could be defined in Scheme.
(define (+ x y)
  (if (= x 0)
      y
      (+ (dec x) (inc y))))
Note: In the lecture, Prof. Sussman uses the 1+ and -1+ operators for increment and decrement. For readability, I've replaced them with inc and dec.

In this example we are counting down x, while counting up y. The real-world analogy given in the lecture is to imagine that you have two piles of stones, one with x number of stones, the other with y stones. With this definition for the + operator, we're just taking stones from the x pile and placing them in the y pile one at a time. When the size of the x pile reaches 0, the y pile has all the stones and we return that as the answer.


Part 2


During the first few minutes of this part of the lecture Professor Sussman takes a few moments to explain the idea of test strips in photography. He explains that a good photographer needs to be able to visualize how a photograph will look before they take a photo. In order to develop this ability, a photographer will take a series of photos under slightly different conditions in order to see what different effects are produced by each small change.

This introduces the concept of simple perturbational analysis in programming. The idea is basically the same as a photographer’s test strips. We make slight changes to a program and observe the change in output from before and after the change. By this method we can learn what kinds of program structure produce what kinds of effects in our processes.

Peano Arithmetic (part 2)

The following alternative procedure for addition is given:
(define (+ x y)
  (if (= x 0)
      y
      (inc (+ (dec x) y))))

This alternative definition for the + operator is the same in all respects but one. In the last line of the procedure, all increment operations are deferred until all of the decrement operations have been completed.

In this example, you still remove the stones from the x pile one at a time. But instead of removing one stone from the x pile and immediately transferring it to the y pile, you hold each x stone in a buffer. Once all the stones are removed from the x pile, you then add them (again, one at a time) into the y pile. Incrementing the y pile is deferred until all decrements of the x pile are complete.

These two processes for addition have very different behaviors.

The complexities for the first procedure are

time = O(x)
space = O(1)

This is called an iterative process. The process will complete in an amount of time proportional to the size of x, and a constant amount of space is used.

The complexities for the second procedure are

time = O(x)
space = O(x)

This kind of process is called a linear recursion (linear because it is proportional to x in both time and space).

This brings up the difference between a recursive procedure and a recursive process. Both of the procedures for Peano Arithmetic that we've looked at have been defined recursively. A recursive procedure is simply one that is defined in terms of itself. A recursive process is one that defers some operation until a later stage of the process.

An iterative process carries all of its state in explicit variables. A recursive process keeps some of the state information as a deferred operation.


Part 3

The next part of the lecture starts with a recursive definition that most everyone will be familiar with, a procedure for generating Fibonacci numbers. You may remember that the Fibonacci sequence is generated by starting with 0 and 1, then the next number in the sequence is the sum of the previous two.

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, …

Mathematically, you can compute the nth Fibonacci number by summing the (n-1)th and (n-2)th Fibonacci numbers. When n is 0 or 1, the nth Fibonacci number is simply n. This leads directly to the following recursive definition for the nth Fibonacci term:
(define (fib n)
  (if (< n 2)
      n
      (+ (fib (- n 1))
         (fib (- n 2)))))
Can you come up with an iterative process to compute the same value? See section 1.2.2 Tree Recursion of the text for the iterative solution.

Note that the fib procedure above calls itself not once, but twice. This produces a different process structure than the recursive structure we’ve seen before. This pattern is called tree recursion, because for every call to fib, it branches into two more calls, which themselves branch into two more calls each, and so on until you reach the leaf nodes of the tree.



The complexities for this procedure are

time = O(2^n)
space = O(n)

The time complexity for the Fibonacci procedure is exponential because for each increment you add to n, you very nearly double the size of the graph that is produced in the execution of the procedure. You can see in the graph above (taken from the text) that the entire subtree for calculating fib(4) would be repeated again if you were to expand the graph to compute fib(6).

The space complexity is only linear because as n increases by one, the height of the tree (which represents the number of deferred operations that need to stay in memory) also only increases by one.

Key Quote:
The way you construct a recursive process is by wishful thinking.

The lecture concludes with a demonstration of the Towers of Hanoi problem. Most of the time in this section is spent explaining the problem and manually demonstrating a solution with a real set of wooden pegs and disks. I won’t try to reproduce that here. It’s worth watching the lecture to see how a recursive solution is arrived at by “wishful thinking.” That is, Prof. Sussman builds a solution for moving n disks by assuming you can move (n – 1) disks, then recursively applying the procedure down to the point where only 1 disk needs to be moved.

The recursive solution is built on the assumption that a procedure to solve the problem already exists, but only for smaller instances of the problem. Sussman then decomposes the problem until he reaches the point where a solution does exist, moving one disk. This is the general solution for solving problems recursively.
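
For reference, the recursive structure described above fits in a few lines of Scheme. This is a sketch of my own (the procedure name and output format are mine, not the lecture's): move n disks from peg FROM to peg TO, using SPARE as the extra peg.

(define (move-tower n from to spare)
  (cond ((= n 0) 'done)
        (else
         (move-tower (- n 1) from spare to)    ; move n-1 disks out of the way
         (display (list 'move from 'to to))    ; move the largest disk
         (newline)
         (move-tower (- n 1) spare to from)))) ; put the n-1 disks back on top

Calling (move-tower 3 'a 'c 'b) prints the seven moves needed for three disks.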

Saturday, October 31, 2009

Happy Halloween

You know you're a giant nerd when you're reminded of major holidays by your Scheme interpreter.

Jack-o-Lambda
I'm working on transcribing my notes from the next SICP lecture, so I'll have a real post up soon. Happy Halloween everyone!

Thursday, October 15, 2009

SICP Exercises 1.6 - 1.8

Exercises from section 1.1.7 Example: Square Roots by Newton's Method of SICP.

Exercise 1.6 gives the following alternative definition to the if special form:
(define (new-if predicate then-clause else-clause)
  (cond (predicate then-clause)
        (else else-clause)))
The new-if procedure seems to work with simple conditional statements on the command line, but if you then define the sqrt-iter procedure in terms of new-if, you notice something peculiar. The sqrt-iter procedure defined in terms of new-if never finishes executing.

To explain why, we need to recall that if is a special form. From section 1.1.6 of the text,
To evaluate an if expression, the interpreter starts by evaluating the "predicate" part of the expression. If the "predicate" evaluates to a true value, the interpreter then evaluates the "consequent" and returns its value. Otherwise it evaluates the "alternative" and returns its value.

In the if special form, after the predicate is evaluated, only one of the consequent or alternative will be evaluated. Never both. The new-if procedure that we've defined doesn't have this property. Our interpreter will use regular old applicative-order evaluation when it encounters the new-if procedure. Since the else-clause passed to new-if in the sqrt-iter procedure is always evaluated, sqrt-iter never stops making recursive calls to itself.
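
You can see the difference directly. In the snippet below (my own illustration, not from the text), the else-clause runs even though the predicate is true, because new-if is an ordinary procedure whose operands are all evaluated before it is applied:

(new-if #t 'yes (begin (display "else-clause evaluated anyway!") 'no))

This prints the message and then returns yes. The real if would never have touched the else-clause.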


Exercise 1.7 points out a flaw in the implementation of the sqrt procedure when you try to find the square root of very small or very large numbers.

The problem is easy to spot for very small numbers. Try evaluating the following:
(sqrt 0.001)
and you should get a wrong answer. Instead of 0.0316227766 as you might expect, our sqrt procedure gives an answer of 0.0412454261. The problem is that as we refine our guess, we stop as soon as the square of our guess is within 0.001 of x. Squaring 0.0412454261 we get 0.00170118517, which we can see is within our tolerance of 0.001.

For very large numbers the problem is a bit more difficult to pin down. If we try to evaluate the square root of a large enough number, we find that the interpreter never returns a value. For example, evaluating the following expression
(sqrt 9999999999998)
will never return. (I'm using PLT Scheme with the "Module" language selected for the exercises in this section. You may have different results.)

What might cause that? The answer lies in the nature of floating-point numbers. As values get larger and larger, their floating-point representation becomes less precise. As we keep recursively refining our guess in the square root procedure, if the value of the guess is large enough we're unable to represent it to within our tolerance of 0.001. This causes an endless sequence of recursive calls, since the interpreter will never reach a point where the guess is good enough.

The solution to this problem, as suggested in the question itself, is to redefine our good-enough? procedure so that it stops when our guess changes by a very small amount.

(define (square x)
  (* x x))

(define (average x y)
  (/ (+ x y)
     2))

(define (good-enough? guess previous-guess)
  (< (abs (- guess previous-guess))
     0.001))

(define (improve guess x)
  (average guess
           (/ x guess)))

(define (sqrt-iter guess previous-guess x)
  (if (good-enough? guess previous-guess)
      guess
      (sqrt-iter (improve guess x)
                 guess
                 x)))

(define (sqrt x)
  (sqrt-iter 1.0 0 x))
Note that the good-enough? procedure no longer takes x as an input parameter, but instead decides whether or not a guess is good enough based on the change from the previous guess. This change also called for a change to the sqrt-iter procedure, which must now keep track of the previous guess. Finally, note that when I first call the sqrt-iter procedure in sqrt, I pass in values of 1.0 and 0 for the initial guess and the previous guess. This ensures that the difference will be greater than our tolerance for at least one recursive call of sqrt-iter.


Exercise 1.8 gives us the following formula for improving a guess when using Newton's method for computing a cube root:

(x/guess^2 + 2*guess) / 3
This is analogous to averaging our guess with x/guess in order to refine our guess in the square root procedure. In order to compute the cube root, we can replace the square root improve procedure with the following definition.
(define (improve guess x)
  (/ (+ (/ x
           (square guess))
        (* 2 guess))
     3))
Putting it all together (starting from the improved version of sqrt that we wrote in the previous exercise) we get:
(define (square x)
  (* x x))

(define (good-enough? guess previous-guess)
  (< (abs (- guess previous-guess))
     0.001))

(define (improve guess x)
  (/ (+ (/ x
           (square guess))
        (* 2 guess))
     3))

(define (cbrt-iter guess previous-guess x)
  (if (good-enough? guess previous-guess)
      guess
      (cbrt-iter (improve guess x)
                 guess
                 x)))

(define (cbrt x)
  (cbrt-iter 1.0 0 x))

We can also go back and test our improved sqrt procedure with the values that gave us problems before, with the following results:
> (sqrt 0.001)
0.03162278245070105
> (sqrt 99999999999998)
9999999.9999999

Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Monday, October 12, 2009

Web developers, please stop stealing focus.

Some sites use a very simple bit of JavaScript to put the focus in a specific form element once the page is done loading. Google does this, as was pointed out by Patrick Altoft in document.f.q.focus(); The Billion Dollar Line of JavaScript. When your browser loads the Google home page, this snippet of code automatically places your keyboard's focus in the main search box.

This makes a lot of sense for Google. There's really only one form element on Google's home page. That's where the focus should go. There's really no harm in placing the focus there, and as Patrick pointed out, it's a good source of revenue for Google. I believe that is what's termed a win-win.

You know where this doesn't make sense? Any page with a password field. When I load Twitter's (for just one example) main page, I'm greeted by a login screen. The problem is that I have plenty of time to type in my username and tab to the password field before the page finishes loading and Twitter's version of the "Billion-dollar line of JavaScript" switches my focus back to the username field. This has caused me to type my password in plain text into the username field on several occasions. I do not care for this.

Granted, this is only a mild irritant, and now that I know Twitter is stealing my focus I can just change my behavior to compensate. The point is that I shouldn't have to. It would be a lot simpler to change the behavior of Twitter's login page than it would be to change the behavior of millions of users.

Possible solutions

The simplest (and I think the best) solution is for everyone to simply stop adjusting page focus using JavaScript. Unless your form has exactly one text field (and I can only think of a few of these), you really shouldn't assume you know where focus should go. Let the user handle it.

Another possibility is to check what field currently has focus before you change it. If the user is already using your page, just let it go. If none of your form elements currently has focus, go ahead and suggest a starting point by focusing on the first form element.

Finally, as a compromise between the two solutions above (and this is my recommended solution), just don't steal focus on any page that contains a password field. If there's a possibility of revealing the user's login credentials to passers-by, then just don't take the risk.

Wednesday, October 7, 2009

SICP Exercises 1.1 - 1.5

Here are the answers to (most of) the exercises presented in section 1.1.6 Conditional Expressions and Predicates of Structure and Interpretation of Computer Programs.

Exercise 1.1 asks us to evaluate a series of simple Scheme expressions. I strongly advise anyone working along to input each expression to a Scheme interpreter to see how they evaluate for yourself.


Exercise 1.2 asks us to translate the following formula to prefix notation:

(5 + 4 + (2 - (3 - (6 + 4/5)))) / (3(6 - 2)(2 - 7))
Checking with a calculator, I find that this expression evaluates to -37/150. The same expression in prefix notation would be
(/ (+ 5 4 (- 2 (- 3 (+ 6 (/ 4 5)))))
   (* 3 (- 6 2) (- 2 7)))
You can check with a Scheme interpreter to verify that this expression also evaluates to -37/150.


Exercise 1.3 asks us to define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers.

It's easy to define this procedure using the built-in Scheme procedures square, max, and min, as I did here. However, we haven't encountered these procedures yet, so I think we should define them as well.
(define (square x)
  (* x x))

(define (max x y)
  (if (> x y) x y))

(define (min x y)
  (if (< x y) x y))

(define (sum-of-highest-squares x y z)
  (+ (square (max x y))
     (square (max (min x y) z))))
The sum-of-highest-squares procedure works by adding the square of the larger of x and y to the square of the larger of z and the remaining value (the smaller of x and y, whichever was left over from the first step).


Exercise 1.4 asks us to describe the behavior of the following procedure.
(define (a-plus-abs-b a b)
  ((if (> b 0) + -) a b))
Before the entire expression can be evaluated, the compound if expression must be evaluated so the interpreter can determine whether the + or - operator should be applied to the two operands a and b. If b is greater than 0, the + operator is applied, adding a positive value to a. Otherwise, the - operator is applied, subtracting a negative value. In either case, the result is the same: the absolute value of b is added to a.


Exercise 1.5 asks us to compare the behavior of the following two procedures when they're interpreted using applicative-order evaluation vs. normal-order evaluation.
(define (p) (p))

(define (test x y)
  (if (= x 0)
      0
      y))
The first procedure simply makes a recursive call to itself. If you evaluate (p) directly in the Scheme interpreter, it will never return because each recursive call makes another recursive call with no base case defined. The test procedure is used to determine the evaluation order of the interpreter by running it with the recursive procedure as its second argument.

If the interpreter uses applicative-order evaluation (which, as we learned in section 1.1.5 of the text, is the case), the very first expression the interpreter sees
(test 0 (p))
has one operator and two operands, all of which will be evaluated. The operator test will evaluate to the body of its procedure, the symbol 0 will evaluate to its value, and the operand (p) will evaluate to a recursive call to itself, which will never stop recursing.

If the interpreter uses normal-order evaluation, the (p) operand will not be evaluated until its value is needed. This is because when using normal-order evaluation the interpreter will substitute operand expressions for parameters. Since the conditional statement in the body is structured such that the second argument never needs to be evaluated, the entire test procedure will evaluate to 0 under normal-order evaluation.


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Sunday, October 4, 2009

SICP - Notes from Lecture 1A

Structure and Interpretation of Computer Programs
Lecture 1 - Overview and Introduction to Lisp
Covers Text Section 1.1

You can download the video lecture or stream it on MIT's OpenCourseWare site. It's also available on YouTube.

This first lecture is presented by MIT's Professor Hal Abelson.

Part 1

Prof. Abelson opens the lecture with the following “definition”

Computer Science – It's not really a science, and not about computers.

He supports the first half of this definition with the idea that computer science might be engineering or art, but that it is more akin to magic than science. A process can be thought of as like a magical spirit that lives in the computer. A pattern of rules called a procedure is like a magical spell, and the language that these spells are written in is Lisp.

These are the only references in this lecture to “computer science as magic.” Since a wizard graces the cover of the course text and the introductory slides, I hope the idea will be fleshed out at some point.

The second half of the definition, that computer science isn't about computers is supported with the following observations:
  • Physics is not really about particle accelerators.
  • Biology is not about microscopes and petri dishes.
  • Geometry (originally from the Greek words for “Earth” and “measure”) is not about surveying instruments.
These are all observations that the different fields of science we study should not be confused with the tools that we use to study them. I think Edsger W. Dijkstra said it best:
Computer Science is no more about computers than astronomy is about telescopes.

Prof. Abelson gives the following examples to illustrate the distinction between declarative knowledge and imperative knowledge (what he terms “how-to” knowledge).

Declarative – The square root of x is the y such that
y^2 = x and y ≥ 0

Imperative – In order to find the square root of x (using Heron of Alexandria's method of successive averaging):
  • Make a guess, G.
  • Improve the guess by averaging G and x/G.
  • Keep improving the guess until it is good enough.

Declarative knowledge is about stating and cataloguing facts. Imperative knowledge is about the methods and processes for finding knowledge (both declarative and imperative). Imperative knowledge is the meta-language for producing new knowledge of both types.

Techniques for controlling the complexity of large systems are what computer science is really about.

The three main techniques for controlling complexity in computer science, and the three main topics for discussion in the course are going to be
  • Black Box Abstraction
  • Conventional Interfaces
  • Meta-Linguistic Abstraction (making new languages)

Black Box Abstraction – From an outside observer's point of view, the internals of how a procedure works should not be important. The example given in the lecture is the square root procedure from earlier. If we write a square root procedure, it should be modular in the sense that someone else can take it and use it as one step in their own procedures, and not need to worry about how it works.

Another reason to use Black Box Abstraction is that your imperative knowledge might be expressed in terms of a more general form of some other piece of imperative knowledge. The example given is a strategy for finding fixed points. Given a general procedure for finding fixed points, you can use it to find square roots by passing it a function as its data.

Black Box Abstraction
  • Primitive Objects
  • Primitive Procedures
  • Primitive Data
Means of Combination
  • Procedure composition
  • Construction of compound data
Means of Abstraction
  • Procedure Definition
  • Simple data abstraction
Capturing Common Patterns
  • Higher-order procedures
  • Data as procedures
A programming language has primitive data and procedures as its basic building blocks. A language needs a means for combining these data and procedures to form more complex data and procedures. A language needs a means for abstracting data and procedures so they can be used as components in more complex procedures. Lisp has a means for defining common patterns in terms of procedures and data. Higher-order procedures are those whose inputs and outputs are themselves procedures.

Conventional Interfaces – agreed-upon ways of plugging things together.
  • Generic Operations
  • Large-scale structure and modularity
  • Object-oriented programming
  • Operations on aggregates

Meta-linguistic Abstraction
  • Talks about how you construct new languages.
  • Looks at the process of interpretation and evaluation of code.


Part 2

In the first part of the lecture, Prof. Abelson revealed that, like chess, the rules of Lisp can be learned in a few minutes, but take years to master. The second part of the lecture talks about Lisp specifically, but more generally about the three main aspects of any programming language.

Primitive Elements – What does the language include in terms of data and procedures?
Means of Combination – How do you put the primitive elements together to build bigger things out of them?
Means of Abstraction – How do you use the combinations of primitive elements that you create as though they were primitive elements themselves?

Primitive Elements of Lisp: Numbers and mathematical operators make up the primitive elements in Lisp. Some examples are

3, 17.4, 5, +, -, *, and /

No real surprises here.

Means of Combination in Lisp:
A combination consists of applying an operator to some operands. Lisp uses prefix notation. The operator is written to the left of the operands. The parentheses make combinations unambiguous. A simple combination adding three numbers would be:
(+ 3 17.4 5)

The operands themselves can also be combinations instead of primitive data values.
(+ 3 (* 2 4) 7)

In the MIT Scheme interpreter environment, you hit C-x C-e (Control-x followed by Control-e) in order to evaluate an expression you've typed in. Evaluating the expression above gives a result of 18. This is the sum of 3, plus the product of 2 and 4, plus 7.

Prefix notation also makes it easy to show the structure of a program using a tree diagram. The following combination
(* (+ 2 (* 4 6))
   (+ 3 5 7))
is represented by the following tree diagram (from section 1.1.3 of the course text, Evaluating Combinations)


Note that the nodes of the tree (including the non-leaf nodes) are represented by the value that the expression for that node evaluates to. If you start at the leaf nodes, the evaluation for the entire tree "percolates up" to the root node. This form of evaluation is a process known as tree accumulation.

Means of Abstraction in Lisp: Lisp uses the define keyword to define both variables (abstract data) and to apply names to procedures.
(define (square x)
  (* x x))

You can then use the new square procedure just like any other operator in Lisp.
(square 10)

An alternative syntax for the definition of square above is:
(define square (lambda (x)
                 (* x x)))
Technically, the first syntax we looked at is the alternative, and this syntax is the more "correct" one. In Lisp, the lambda keyword is used to define a function. If you're giving a function a name, the alternative syntax we looked at first is a shorter way to do the same thing. Later on in the course we'll be looking at situations where we might want to create a function without explicitly giving it a name. We'll come back to the lambda operator then. For now, we'll use the short syntax for defining procedures.

The following function computes the average of x and y.
(define (average x y)
  (/ (+ x y) 2))
The average of x and y is computed by dividing the sum of x and y by 2.

Now that the procedures for square and average are defined, you can use them just as you would any primitive procedure in Lisp. This is because Lisp does not make arbitrary distinctions between primitives of the language and procedures defined by the user.

The last section of the second part of the lecture explains how to do Case Analysis (Conditionals) in Lisp.

Consider the following definition of the absolute value function:

|x| = -x  for x < 0
    =  0  for x = 0
    =  x  for x > 0


The absolute value function could be implemented in Lisp using a conditional expression as follows:
(define (abs x)
  (cond ((< x 0) (- x))
        ((= x 0) 0)
        ((> x 0) x)))
Refer to section 1.1.6 of the text, Conditional Expressions and Predicates for a more complete discussion of conditionals, including the most common logical composition operators (and, or, and not) which are not covered in the lecture.


Part 3

We now know enough Lisp to implement any numerical procedure.

The main lecture concludes with an implementation of Heron of Alexandria's square root procedure seen earlier in the lecture.

To find an approximation to sqrt(x)
  • make a guess G
  • average G and x/G
  • keep improving the guess until it's good enough
  • use 1 as an initial guess

Each of these steps can be defined as a separate procedure, some of them in terms of other procedures we've seen already.

(define (try guess x)
  (if (good-enough? guess x)
      guess
      (try (improve guess x) x)))

(define (sqrt x) (try 1 x))

(define (improve guess x)
  (average guess (/ x guess)))

(define (good-enough? guess x)
  (< (abs (- (square guess) x))
     0.001))

Notice that the try procedure is defined partially in terms of itself. The try procedure defined above checks an initial guess, then calls itself with a refined guess if the initial guess is not good enough. This new guess is then checked to see if it is good enough, refined again if necessary, and try is called again with the newly refined guess. This will go on until the guess is refined to the point where it is close enough. This is called a recursive definition. The ability to make recursive definitions allows you to do infinite computations that go on until a condition is reached (or just go on forever, if you make a mistake).


Summary

The following table shows what we've learned so far, and what we have yet to learn.







                     | Procedures  | Data
Primitive Elements   | + - * / =   | 23 1.738
Means of Combination | ( ) cond if |
Means of Abstraction | define      |

We've learned the primitive elements for both data and procedures. We've learned how to combine functions by composing expressions. We've also learned how to abstract procedures by defining our own. We have yet to learn methods for combining and abstracting data. The next couple of lectures will be talking about making a link between the procedures that we write and the processes that happen in the machine, and how to use Lisp to not only make small computations (like we've done so far) but also about “general conventional methods of doing things.”