Sunday, October 31, 2010

SICP 2.6: Church numerals

From SICP section 2.1.3 What Is Meant by Data?

Exercise 2.6 introduces us to the idea that in a language that can manipulate procedures, we can get by without integers altogether. The implementation of 0 and the operation of adding 1 are given as
(define zero (lambda (f) (lambda (x) x)))

(define (add-1 n)
(lambda (f) (lambda (x) (f ((n f) x)))))

This representation is known as Church numerals, after its inventor, Alonzo Church, the logician who invented the λ calculus.

It's important to note that the definition for zero above does not evaluate to the integer 0, it is just a representation for 0.
> zero
#<procedure:zero>

A Church numeral is a procedure that takes one argument, and that argument is itself another procedure that also takes one argument. The procedure zero represents the integer 0 by returning a procedure that applies its input procedure zero times. The Church numeral one should return a procedure that applies its input procedure once, two should return a procedure that applies its input procedure twice, and so on. This is how Church numerals represent integers, by applying their input procedure the exact number of times that the numerals are supposed to represent.

Our job in this exercise is to define one and two directly (not in terms of zero and add-1). We'll do that by using substitution to evaluate (add-1 zero) to see what the resulting procedural definition should be. We also need to give a direct definition for addition of two Church numerals (not in terms of repeated application of add-1).

Before we begin, we'll need some method for testing Church numerals to make sure they do what we expect them to do. We'll need a simple procedure to pass to each Church numeral as an argument so that we can see it gets evaluated the correct number of times. The inc procedure should do nicely. (You can replace inc with any procedure you like, just make it one that's easy to mentally calculate several repetitions of.)
(define (inc n)
(+ n 1))

We can use inc to show that zero does not call its input parameter, or more precisely, it calls its input procedure zero times.
> ((zero inc) 0)
0
> ((zero inc) 1)
1
> ((zero inc) 2)
2

Now let's define one and two in the way that the book warns us not to, just to make sure they behave as we expect. I'll give the assigned implementations later.
> (define one (add-1 zero))
> (define two (add-1 one))
> ((one inc) 0)
1
> ((one inc) 1)
2
> ((two inc) 0)
2
> ((two inc) 1)
3

So we can see that the procedure returned by one applies inc one time, and the procedure returned by two applies it twice. Now let's use substitution to evaluate (add-1 zero), giving us the direct definition of the Church numeral one.
(add-1 zero)

; retrieve the body of add-1 and substitute zero for its parameter n
(lambda (f) (lambda (x) (f ((zero f) x))))

; retrieve the body of zero and substitute
(lambda (f) (lambda (x) (f ((lambda (x) x) x))))

; simplify
(lambda (f) (lambda (x) (f x)))

So we can define one directly as:
(define one
(lambda (f) (lambda (x) (f x))))

Next we can define two by evaluating (add-1 one) using exactly the same principles.
(add-1 one)

; retrieve the body of add-1 and substitute one for its parameter n
(lambda (f) (lambda (x) (f ((one f) x))))

; retrieve the body of one and substitute
(lambda (f) (lambda (x) (f ((lambda (x) (f x)) x))))

; simplify
(lambda (f) (lambda (x) (f (f x))))

So two is defined directly as:
(define two
(lambda (f) (lambda (x) (f (f x)))))

Just as was explained earlier, the Church numeral one takes a procedure as an argument and returns a procedure that applies the input procedure once. The only difference in the Church numeral two is that it applies its input procedure twice.

Add these definitions in the Scheme interpreter and test them out the same as before.
> ((one inc) 0)
1
> ((one inc) 5)
6
> ((two inc) 3)
5
> ((two inc) 7)
9


Now we need to figure out how to add two Church numerals. We can see from the definition of add-1 that all it does is takes a Church numeral, n, and wraps one additional function call around it.
(define (add-1 n)
(lambda (f) (lambda (x) (f ((n f) x)))))

We can implement Church numeral addition by using the same approach. We'll take two Church numerals m and n. Instead of wrapping n in one extra function call, we'll wrap it in m extra calls.
(define (add-church m n)
(lambda (f) (lambda (x) ((m f) ((n f) x)))))

We can now use add-church to define several more Church numerals for testing.
> (define three (add-church one two))
> (define four (add-church two two))
> (define seven (add-church three four))
> ((three inc) 0)
3
> ((four inc) 0)
4
> ((seven inc) 0)
7


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Monday, October 25, 2010

SICP 2.5: Representing Pairs as a Product

From SICP section 2.1.3 What Is Meant by Data?

Exercise 2.5 asks us to show that we can represent pairs of non-negative integers using only numbers and arithmetic operations if we represent the pair a and b as the integer that is the product 2a3b. We're to give the corresponding definitions of the procedures cons, car, and cdr.

This exercise reminds me of the old adage that just because you can do something doesn't mean that you should. The implementations of cons, car, and cdr that follow are severely limited in that they can only join pairs of integers, not pairs of any type of data as we've seen before.

Regardless, here is what is meant by representing non-negative integers as the product 2a3b. Note that because 2 and 3 are both prime, and due to the fundamental theorem of arithmetic, any integer values that we insert for a and b will result in a unique value for the product 2a3b. Because the product is unique, we're able to get back the original values of a and b by just storing the product and remembering the rule we used to generate it.

The definition of cons is straightforward from the requirements using the built-in Scheme expt procedure.
(define (cons a b)
(* (expt 2 a)
(expt 3 b)))

But once we've combined two integers in this way, how do we separate them again? For example, if we evaluate (cons 5 6) we get a result of 23328. The corresponding implementation of car needs to take the value 23328 and tell us what that the original value of a was 5. We can do that by counting the number of times 23328 is evenly divisible by 2. Since 2 and 3 have no factors in common, once you divide all the 2s out of 23328 you'll be left with a power of 3. Since we're going to be doing the same thing for cdr, we'll define a general procedure that can be reused.

There's no easy way to find out how many times one number evenly divides another other than simply trying to divide by the number in a loop and testing for a remainder. If there was a quick formula that we could apply to get the answer in one step then prime factorization would be much easier than it is. We'll have to use an iterator to do this in as few steps as possible.
; count the number of times n is evenly divisible by d
(define (num-divs n d)
(define (iter x result)
(if (= 0 (remainder x d))
(iter (/ x d) (+ 1 result))
result))
(iter n 0))

I tested this with the example values above to make sure it gives the correct results.
> (num-divs 23328 2)
5
> (num-divs 23328 3)
6

Now we can define both car and cdr in terms of num-divs.
(define (car x)
(num-divs x 2))

(define (cdr x)
(num-divs x 3))

Finally, we can use the axiom of pairs (from Lecture 2B) to show that our new definitions for cons, car, and cdr work together as they should.
> (car (cons 4 7))
4
> (cdr (cons 4 7))
7


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Thursday, October 21, 2010

SICP 2.4: Implementing cons, car, and cdr

From SICP section 2.1.3 What Is Meant by Data?

Exercise 2.4 gives us the following alternative procedual implementation for creating pairs.
(define (cons x y)
(lambda (m) (m x y)))

(define (car z)
(z (lambda (p q) p)))

Using this implementation, we are told to verify that (car (cons x y)) yields x for any objects x and y using the substitution model from section 1.1.5.

Once we've done that, we need to write the corresponding definition of cdr.

Our first step to verifying that this implementation for cons and car work should be to put the code in a Scheme interpreter and see them work.
> (car (cons 2 3))
2
> (car (cons 15 "Hello"))
15
> (car (cons "Hi!" 42))
"Hi!"

Now let's use the substitution model to show that the following always yields x
(car (cons x y))

First we retrieve the body of cons
(lambda (m) (m x y))

We could replace the formal parameters x and y with the arguments 2 and 3 (or any others that we choose) at this point, but I'm going to keep working with x and y. Let's substitute the body of cons.
(car (lambda (m) (m x y)))

Now we can retrieve the body of car and substitute it.
((z (lambda (p q) p)) (lambda (m) (m x y)))

The car procedure takes a procedure z as an argument and applies it to (lambda (p q) p), so there's one more substitution we need to do here before this step is really done. We need to take the second lambda in the expression above and replace z with it.
((lambda (m) (m x y)) (lambda (p q) p))

That might look scary, but it's just a procedure applied to another procedure. Since the procedure on the left takes one formal parameter, we can substitute the procedure on the right for that parameter.
((lambda (p q) p) x y)

Now it should be clear that this expression is a procedure that takes two parameters p and q, and passes two parameters to it. The procedure returns the first of the two parameters passed to it, so this expression simplifies to x.

The definition for cdr that corresponds to the above implementation of car is simply
(define (cdr z)
(z (lambda (p q) q)))

We can test it out the same way we originally tested the code provided in the text,
by running it in the interpreter.
>  (cdr (cons 2 3))
3
> (cdr (cons 15 "Hello"))
"Hello"
> (cdr (cons "Hi!" 42))
42


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Thursday, October 7, 2010

SICP 2.3: Rectangles in a plane

From SICP section 2.1.2 Abstraction Barriers

Exercise 2.3 asks us to build on the results of exercise 2.2 and implement a representation for rectangles in a plane. In addition to a constructor and selectors for a representation of a rectangle, we'll also need to implement procedures for computing the perimeter and area of a given rectangle. Once we've done that we'll need to implement a different representation for rectangles. The challenge is to design a system with suitable abstraction barriers so that the area and perimeter procedures will work with either rectangle representation.

First let's look at a few different ways to represent rectangles. To simplify things, we'll assume that the sides of the rectangle are to be perpendicular to the x- and y-axes of the coordinate plane.

The most obvious representation would be to combine four points (ignoring the fact that four points might not form a rectangle and just accepting any four points for this representation). We could easily implement this representation since we already have a representation for points in a plane.


Next, we could represent a rectangle as a single point along with a width and a height. This would make it very easy to calculate the perimeter and area of a given rectangle.


Another representation could be to store two opposite corners of the rectangle. This would give us just enough information to calculate the width and the height, from which we can calculate the perimeter and area.


You can probably think of several other representations such as storing all four segments that make up the sides of a rectangle, storing only two adjacent sides, or storing the segment that joins two opposite corners forming a diagonal. These would all work, but some require us to store more information than we really need.

The first implementation will be the third one pictured above where a rectangle is represented by two opposite corners. The constructor will simply join two points together. The selctors will return the width and height of the rectangle. The width and height are calculated by taking the absolute value of the difference between the x- and y-coordinates respectively.
; rectangle constructor
; join two opposite corners
(define (make-rect a b) (cons a b))

; rectangle selectors
(define (rect-width r)
(abs (- (x-point (car r)) (x-point (cdr r)))))

(define (rect-height r)
(abs (- (y-point (car r)) (y-point (cdr r)))))

With that done, we can implement procedures for computing the perimeter and area of a rectangle.
(define (rect-perimeter r)
(* 2 (+ (rect-width r) (rect-height r))))

(define (rect-area r)
(* (rect-width r) (rect-height r)))

We can test it out by defining two points, creating a rectangle, and displaying its perimeter and area.
> (define a (make-point 0 0))
> (define b (make-point 2 10))
> (define r (make-rect a b))
> (display (rect-perimeter r))
24
> (display (rect-area r))
20

The box-and-pointer diagram of the rectangle created above shows how the individual elements are stored. The rectangle itself is made up of two points (from the last exercise), each of which are themselves made up of an x and a y value. The rect-width and rect-height selectors are used to access these x and y values to calculate the width and height of the rectangle.


Now we can implement an alternate representation of a rectangle and show that we can use the same procedures for perimeter and area. The easiest will be the second pictured above, where we represent a rectangle by a single point and its width and height.
; rectangle constructor
; combine a point, a width, and a height
(define (make-rect a w h) (cons a (cons w h)))

; rectangle selectors
(define (rect-width r)
(car (cdr r)))

(define (rect-height r)
(cdr (cdr r)))

Using the new constructor and selectors we can test that the old rect-perimeter and rect-area procedures work unchanged.
> (define a (make-point 0 0))
> (define r (make-rect a 2 10))
> (display (rect-perimeter r))
24
> (display (rect-area r))
20

The box-and-pointer diagram of this rectangle reveals that the elements are laid out exactly the same way internally, but the rectangle is different conceptually.



This time the rectangle is made up of a point, a width, and a height. The corresponding rect-width and rect-height procedures access their values directly rather than computing them.


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.