Saturday, October 31, 2009

Happy Halloween

You know you're a giant nerd when you're reminded of major holidays by your Scheme interpreter.

Jack-o-Lambda
I'm working on transcribing my notes from the next SICP lecture, so I'll have a real post up soon. Happy Halloween everyone!

Thursday, October 15, 2009

SICP Exercises 1.6 - 1.8

Exercises from section 1.1.7 Example: Square Roots by Newton's Method of SICP.

Exercise 1.6 gives the following alternative definition to the if special form:
(define (new-if predicate then-clause else-clause)
  (cond (predicate then-clause)
        (else else-clause)))
The new-if procedure seems to work for simple conditional expressions at the command line, but if you then define the sqrt-iter procedure in terms of new-if, you notice something peculiar: it never finishes executing.

To explain why, we need to recall that if is a special form. From section 1.1.6 of the text,
To evaluate an if expression, the interpreter starts by evaluating the "predicate" part of the expression. If the "predicate" evaluates to a true value, the interpreter then evaluates the "consequent" and returns its value. Otherwise it evaluates the "alternative" and returns its value.

In the if special form, after the predicate is evaluated, only one of the consequent or alternative will be evaluated. Never both. The new-if procedure that we've defined doesn't have this property. Our interpreter will use regular old applicative-order evaluation when it encounters the new-if procedure. Since the else-clause passed to new-if in the sqrt-iter procedure is always evaluated, sqrt-iter never stops making recursive calls to itself.
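For reference, here is sqrt-iter rewritten in terms of new-if, as the exercise describes (a sketch using the good-enough? and improve procedures from section 1.1.7 of the text):

; sqrt-iter written with new-if instead of the if special form.
; Because new-if is an ordinary procedure, applicative-order
; evaluation evaluates all three operands before new-if is applied --
; including the recursive call below -- so this version never returns.
(define (sqrt-iter guess x)
  (new-if (good-enough? guess x)
          guess
          (sqrt-iter (improve guess x) x)))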


Exercise 1.7 points out a flaw in the implementation of the sqrt procedure when you try to find the square root of very small or very large numbers.

The problem is easy to spot for very small numbers. Try evaluating the following:
(sqrt 0.001)
and you should get a wrong answer. Instead of 0.0316227766 as you might expect, our sqrt procedure gives an answer of 0.0412454261. The problem is that as we refine our guess, we stop as soon as the square of our guess is within 0.001 of x. Squaring 0.0412454261, we get 0.00170118517, which we can see is within our tolerance of 0.001.
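We can confirm this at the interpreter using the original good-enough? test from section 1.1.7, inlined here with the final guess and x = 0.001:

(< (abs (- (square 0.0412454261) 0.001)) 0.001)
; (square 0.0412454261)        => 0.00170118...
; (abs (- 0.00170118 0.001))   => 0.00070118..., which is < 0.001
; => #t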

For very large numbers the problem is a bit more difficult to pin down. If we try to evaluate the square root of a large enough number, we find that the interpreter never returns a value. For example, evaluating the following expression
(sqrt 99999999999998)
will never return. (I'm using PLT Scheme with the "Module" language selected for the exercises in this section. You may have different results.)

What might cause that? The answer is due to the nature of floating-point numbers. As values get larger and larger, their floating-point representation becomes less precise. As we keep recursively refining our guess in the square root procedure, if the value of the guess is large enough we're unable to represent it to within our tolerance of 0.001. This causes an endless sequence of recursive calls, since the interpreter will never reach a point where the guess is good enough.
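A quick experiment illustrates the problem (assuming standard double-precision floats; exact behavior may vary with your Scheme implementation):

; Near 10^13 the gap between adjacent representable doubles is about
; 0.002, which is already larger than our 0.001 tolerance:
(= 1e13 (+ 1e13 0.0001))   ; => #t -- adding 0.0001 changes nothing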

The solution to this problem, as suggested in the question itself, is to redefine our good-enough? procedure so that it stops when our guess changes by a very small amount.

(define (square x)
  (* x x))

(define (average x y)
  (/ (+ x y)
     2))

(define (good-enough? guess previous-guess)
  (< (abs (- guess previous-guess))
     0.001))

(define (improve guess x)
  (average guess
           (/ x guess)))

(define (sqrt-iter guess previous-guess x)
  (if (good-enough? guess previous-guess)
      guess
      (sqrt-iter (improve guess x)
                 guess
                 x)))

(define (sqrt x)
  (sqrt-iter 1.0 0 x))
Note that the good-enough? procedure no longer takes x as an input parameter; instead, it decides whether a guess is good enough based on how much the guess changed from the previous guess. This change also required a change to the sqrt-iter procedure, which must now keep track of the previous guess. Finally, note that when sqrt first calls sqrt-iter, I pass in values of 1.0 and 0 for the initial guess and the previous guess. This ensures that the difference will be greater than our tolerance for at least one recursive call of sqrt-iter.


Exercise 1.8 gives us the following formula for improving a guess y when using Newton's method for computing a cube root of x:

(x/y² + 2y) / 3
This is analogous to averaging our guess with x/guess in order to refine our guess in the square root procedure. In order to compute the cube root, we can replace the square root improve procedure with the following definition.
(define (improve guess x)
  (/ (+ (/ x
           (square guess))
        (* 2 guess))
     3))
Putting it all together (starting from the improved version of sqrt that we wrote in the previous exercise) we get:
(define (square x)
  (* x x))

(define (good-enough? guess previous-guess)
  (< (abs (- guess previous-guess))
     0.001))

(define (improve guess x)
  (/ (+ (/ x
           (square guess))
        (* 2 guess))
     3))

(define (cbrt-iter guess previous-guess x)
  (if (good-enough? guess previous-guess)
      guess
      (cbrt-iter (improve guess x)
                 guess
                 x)))

(define (cbrt x)
  (cbrt-iter 1.0 0 x))

We can test it out with the values that gave us problems before, with the following results:
> (sqrt 0.001)
0.03162278245070105
> (sqrt 99999999999998)
9999999.9999999
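We can also sanity-check the new cbrt procedure. I haven't reproduced exact output here, since the trailing digits depend on where the 0.001 tolerance stops the iteration, but the result should be very close to 3:

> (cbrt 27)
; expect a value very close to 3.0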

Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Monday, October 12, 2009

Web developers, please stop stealing focus.

Some sites use a very simple bit of JavaScript to put the focus in a specific form element once the page is done loading. Google does this, as was pointed out by Patrick Altoft in document.f.q.focus(); The Billion Dollar Line of JavaScript. When your browser loads the Google home page, this snippet of code automatically places your keyboard's focus in the main search box.

This makes a lot of sense for Google. There's really only one form element on Google's home page. That's where the focus should go. There's really no harm in placing the focus there, and as Patrick pointed out, it's a good source of revenue for Google. I believe that is what's termed a win-win.

You know where this doesn't make sense? Any page with a password field. When I load Twitter's (for just one example) main page, I'm greeted by a login screen. The problem is that I have plenty of time to type in my username and tab to the password field before the page finishes loading and Twitter's version of the "Billion-dollar line of JavaScript" switches my focus back to the username field. This has caused me to type my password in plain text into the username field on several occasions. I do not care for this.

Granted, this is only a mild irritant, and now that I know Twitter is stealing my focus I can just change my behavior to compensate. The point is that I shouldn't have to. It would be a lot simpler to change the behavior of Twitter's login page than it would be to change the behavior of millions of users.

Possible solutions

The simplest (and I think the best) solution is for everyone to simply stop adjusting page focus using JavaScript. Unless your form has exactly one text field (and I can only think of a few of these), you really shouldn't assume you know where focus should go. Let the user handle it.

Another possibility is to check what field currently has focus before you change it. If the user is already using your page, just let it go. If none of your form elements currently has focus, go ahead and suggest a starting point by focusing on the first form element.

Finally, as a compromise between the two solutions above (and this is my recommended solution), just don't steal focus on any page that contains a password field. If there's a possibility of revealing the user's login credentials to passers-by, then just don't take the risk.

Wednesday, October 7, 2009

SICP Exercises 1.1 - 1.5

Here are the answers to (most of) the exercises presented in section 1.1.6 Conditional Expressions and Predicates of Structure and Interpretation of Computer Programs.

Exercise 1.1 asks us to evaluate a series of simple Scheme expressions. I strongly advise anyone working along to input each expression into a Scheme interpreter and see for yourself how it evaluates.


Exercise 1.2 asks us to translate the following formula to prefix notation:

(5 + 4 + (2 - (3 - (6 + 4/5)))) / (3(6 - 2)(2 - 7))
Checking with a calculator, I find that this expression evaluates to -37/150. The same expression in prefix notation would be
(/ (+ 5 4 (- 2 (- 3 (+ 6 (/ 4 5)))))
   (* 3 (- 6 2) (- 2 7)))
You can check with a Scheme interpreter to verify that this expression also evaluates to -37/150.


Exercise 1.3 asks us to define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers.

It's easy to define this procedure using the built-in Scheme procedures square, max, and min, as I did here. However, we haven't encountered these procedures yet, so I think we should define them as well.
(define (square x)
  (* x x))

(define (max x y)
  (if (> x y) x y))

(define (min x y)
  (if (< x y) x y))

(define (sum-of-highest-squares x y z)
  (+ (square (max x y))
     (square (max (min x y) z))))
The sum-of-highest-squares procedure works by adding the square of the larger of x and y to the square of the larger of the two remaining values: z, and whichever of x and y was smaller (which is exactly (min x y), the value left over from the first step).
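A quick check at the interpreter (a hypothetical test of my own, not from the text):

(sum-of-highest-squares 1 2 3)
; (max 1 2) => 2 and (max (min 1 2) 3) => 3,
; so the result is (+ (square 2) (square 3)) => 13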


Exercise 1.4 asks us to describe the behavior of the following procedure.
(define (a-plus-abs-b a b)
  ((if (> b 0) + -) a b))
Before the entire expression can be evaluated, the inner if expression must be evaluated so the interpreter can determine whether the + or - operator should be applied to the two operands a and b. If b is greater than 0, the + operator is applied, adding a positive value to a. Otherwise, the - operator is applied, subtracting a negative value. In either case, the result is the same: the absolute value of b is added to a.
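Tracing one hypothetical call makes this concrete:

(a-plus-abs-b 3 -5)
; (if (> -5 0) + -) evaluates to the - procedure,
; so the combination becomes (- 3 -5) => 8, which is 3 + |-5|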


Exercise 1.5 asks us to compare the behavior of the following two procedures when they're interpreted using applicative-order evaluation vs. normal-order evaluation.
(define (p) (p))

(define (test x y)
  (if (= x 0)
      0
      y))
The first procedure simply makes a recursive call to itself. If you evaluate (p) directly in the Scheme interpreter, it will never return because each recursive call makes another recursive call with no base case defined. The test procedure is used to determine the evaluation order of the interpreter by running it with the recursive procedure as its second argument.

If the interpreter uses applicative-order evaluation (which, as we learned in section 1.1.5 of the text, is the case), the very first expression the interpreter sees
(test 0 (p))
has one operator and two operands, all of which will be evaluated. The operator test will evaluate to the body of its procedure, the literal 0 will evaluate to its value, and the operand (p) will evaluate to a recursive call to itself, which will never stop recursing.

If the interpreter uses normal-order evaluation, the (p) operand will not be evaluated until its value is needed. This is because when using normal-order evaluation the interpreter will substitute operand expressions for parameters. Since the conditional statement in the body is structured such that the second argument never needs to be evaluated, the entire test procedure will evaluate to 0 under normal-order evaluation.
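Sketching the normal-order expansion shows why:

; Normal order substitutes operand expressions, unevaluated:
;   (test 0 (p))
;   (if (= 0 0) 0 (p))
;   0                  ; the alternative (p) is never evaluated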


Related:

For links to all of the SICP lecture notes and exercises that I've done so far, see The SICP Challenge.

Sunday, October 4, 2009

SICP - Notes from Lecture 1A

Structure and Interpretation of Computer Programs
Lecture 1 - Overview and Introduction to Lisp
Covers Text Section 1.1

You can download the video lecture or stream it on MIT's OpenCourseWare site. It's also available on YouTube.

This first lecture is presented by MIT's Professor Hal Abelson.

Part 1

Prof. Abelson opens the lecture with the following “definition”

Computer Science – It's not really a science, and not about computers.

He supports the first half of this definition with the idea that computer science might be engineering or art, but that it is more akin to magic than science. A process can be thought of as a magical spirit that lives in the computer. A pattern of rules called a procedure is like a magical spell, and the language that these spells are written in is Lisp.

These are the only references in this lecture to “computer science as magic.” Since a wizard graces the cover of the course text and the introductory slides, I hope the idea will be fleshed out at some point.

The second half of the definition, that computer science isn't about computers, is supported with the following observations:
  • Physics is not really about particle accelerators.
  • Biology is not about microscopes and petri dishes.
  • Geometry (originally from the Egyptian words for “Earth” and “measure”) is not about surveying instruments.
These are all observations that the different fields of science we study should not be confused with the tools that we use to study them. I think Edsger W. Dijkstra said it best:
Computer Science is no more about computers than astronomy is about telescopes.

Prof. Abelson gives the following examples to illustrate the distinction between declarative knowledge and imperative knowledge (what he terms “how-to” knowledge).

Declarative – The square root of x is the y such that
y² = x and y ≥ 0

Imperative – In order to find the square root of x (using Heron of Alexandria's method of successive averaging):
  • Make a guess, G.
  • Improve the guess by averaging G and x/G.
  • Keep improving the guess until it is good enough.

Declarative knowledge is about stating and cataloguing facts. Imperative knowledge is about the methods and processes for finding knowledge (both declarative and imperative). Imperative knowledge is the meta-language for producing new knowledge of both types.

Techniques for controlling the complexity of large systems are what computer science is really about.

The three main techniques for controlling complexity in computer science, and the three main topics for discussion in the course are going to be
  • Black Box Abstraction
  • Conventional Interfaces
  • Meta-Linguistic Abstraction (making new languages)

Black Box Abstraction – From an outside observer's point of view, the internals of how a procedure works should not be important. The example given in the lecture is the square root procedure from earlier. If we write a square root procedure, it should be modular in the sense that someone else can take it and use it as one step in their own procedures, and not need to worry about how it works.

Another reason to use Black Box Abstraction is that your imperative knowledge might be expressed in terms of a more general form of some other piece of imperative knowledge. The example given is a strategy for finding fixed points. Given a general procedure for finding fixed points, you can use it to find square roots by passing it a function as its data.
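The lecture only previews this idea, but a minimal sketch (my own illustration, reusing the lecture's 0.001 tolerance; this is not the lecture's code) shows the shape of it: fixed-point takes the procedure f as data, and specializing it gives square roots.

; A general fixed-point finder: the procedure f is passed in as data
; and applied repeatedly until the value (nearly) stops changing.
(define (fixed-point f guess)
  (define (close-enough? a b)
    (< (abs (- a b)) 0.001))
  (let ((next (f guess)))
    (if (close-enough? guess next)
        next
        (fixed-point f next))))

; Square root as the fixed point of y -> average(y, x/y):
(define (average a b) (/ (+ a b) 2))
(define (sqrt x)
  (fixed-point (lambda (y) (average y (/ x y))) 1.0))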

Black Box Abstraction
  • Primitive Objects
  • Primitive Procedures
  • Primitive Data
Means of Combination
  • Procedure composition
  • Construction of compound data
Means of Abstraction
  • Procedure Definition
  • Simple data abstraction
Capturing Common Patterns
  • Higher-order procedures
  • Data as procedures
A programming language has primitive data and procedures as its basic building blocks. A language needs a means for combining these data and procedures to form more complex data and procedures. A language needs a means for abstracting data and procedures so they can be used as components in more complex procedures. Lisp has a means for defining common patterns in terms of procedures and data. Higher-order procedures are those whose inputs and outputs are themselves procedures.

Conventional Interfaces – agreed-upon ways of plugging things together.
  • Generic Operations
  • Large-scale structure and modularity
  • Object-oriented programming
  • Operations on aggregates

Meta-linguistic Abstraction
  • Talks about how you construct new languages.
  • Looks at the process of interpretation and evaluation of code.


Part 2

In the first part of the lecture, Prof. Abelson noted that, like the rules of chess, the rules of Lisp can be learned in a few minutes but take years to master. The second part of the lecture talks about Lisp specifically, but more generally about the three main aspects of any programming language.

Primitive Elements – What does the language include in terms of data and procedures?
Means of Combination – How do you put the primitive elements together to build bigger things out of them?
Means of Abstraction – How do you use the combinations of primitive elements that you create as though they were primitive elements themselves?

Primitive Elements of Lisp: Numbers and mathematical operators make up the primitive elements in Lisp. Some examples are

3, 17.4, 5, +, -, *, and /

No real surprises here.

Means of Combination in Lisp:
A combination consists of applying an operator to some operands. Lisp uses prefix notation. The operator is written to the left of the operands. The parentheses make combinations unambiguous. A simple combination adding three numbers would be:
(+ 3 17.4 5)

The operands themselves can also be combinations instead of primitive data values.
(+ 3 (* 2 4) 7)

In the MIT Scheme interpreter environment, you press C-x C-e (Control-x followed by Control-e) to evaluate an expression you've typed in. Evaluating the expression above gives a result of 18. This is the sum of 3, plus the product of 2 and 4, plus 7.

Prefix notation also makes it easy to show the structure of a program using a tree diagram. The following combination
(* (+ 2 (* 4 6))
   (+ 3 5 7))
is represented by a tree diagram (from section 1.1.3 of the course text, Evaluating Combinations).

[Tree diagram: the root node branches into the operator * and the two sub-combinations, each of which branches in turn into its own operator and operands.]
Note that the nodes of the tree (including the non-leaf nodes) are represented by the value that the expression for that node evaluates to. If you start at the leaf nodes, the evaluation for the entire tree "percolates up" to the root node. This form of evaluation is a process known as tree accumulation.
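Working through this example bottom-up:

(* (+ 2 (* 4 6))    ; (* 4 6) => 24, so (+ 2 24) => 26
   (+ 3 5 7))       ; => 15
; the value at the root node is (* 26 15) => 390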

Means of Abstraction in Lisp: Lisp uses the define keyword to define both variables (abstract data) and to apply names to procedures.
(define (square x)
  (* x x))

You can then use the new square procedure just like any other operator in Lisp.
(square 10)

An alternative syntax for the definition of square above is:
(define square
  (lambda (x)
    (* x x)))
Technically, the first syntax we looked at is the alternative, and this syntax is the more "correct" one. In Lisp, the lambda keyword is used to define a function. If you're giving a function a name, the alternative syntax we looked at first is a shorter way to do the same thing. Later on in the course we'll be looking at situations where we might want to create a function without explicitly giving it a name. We'll come back to the lambda operator then. For now, we'll use the short syntax for defining procedures.
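One consequence worth noting now: a lambda can be applied directly, without ever being given a name.

; An anonymous squaring procedure, applied immediately:
((lambda (x) (* x x)) 10)   ; => 100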

The following function computes the average of x and y.
(define (average x y)
  (/ (+ x y) 2))
The average of x and y is computed by dividing the sum of x and y by 2.

Now that the procedures for square and average are defined, you can use them just as you would any primitive procedure in Lisp. This is because Lisp does not make arbitrary distinctions between primitives of the language and procedures defined by the user.
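For example, the two new procedures compose just like built-in operators:

(square (average 3 7))   ; (average 3 7) => 5, (square 5) => 25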

The last section of the second part of the lecture explains how to do Case Analysis (Conditionals) in Lisp.

Consider the following definition of the absolute value function:

|x| = -x   for x < 0
       0   for x = 0
       x   for x > 0


The absolute value function could be implemented in Lisp using a conditional expression as follows:
(define (abs x)
  (cond ((< x 0) (- x))
        ((= x 0) 0)
        ((> x 0) x)))
Refer to section 1.1.6 of the text, Conditional Expressions and Predicates, for a more complete discussion of conditionals, including the most common logical composition operators (and, or, and not), which are not covered in the lecture.
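That same section of the text also shows that the function can be written more compactly with if, since only one case really needs to be distinguished:

(define (abs x)
  (if (< x 0)
      (- x)
      x))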


Part 3

We now know enough Lisp to implement any numerical procedure.

The main lecture concludes with an implementation of Heron of Alexandria's square root procedure seen earlier in the lecture.

To find an approximation to sqrt(x)
  • make a guess G
  • average G and x/G
  • keep improving the guess until it's good enough
  • use 1 as an initial guess

Each of these steps can be defined as a separate procedure, some of them in terms of other procedures we've seen already.

(define (try guess x)
  (if (good-enough? guess x)
      guess
      (try (improve guess x) x)))

(define (sqrt x) (try 1 x))

(define (improve guess x)
  (average guess (/ x guess)))

(define (good-enough? guess x)
  (< (abs (- (square guess) x))
     0.001))

Notice that the try procedure is defined partially in terms of itself. It checks an initial guess, then calls itself with a refined guess if that guess is not good enough. The new guess is checked in turn, refined again if necessary, and so on until a guess is close enough. This is called a recursive definition. The ability to make recursive definitions allows you to express computations that go on until a condition is reached (or just go on forever, if you make a mistake).
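For example, the chain of calls for (sqrt 2) looks roughly like this (digits truncated):

; (try 1 2)           not good enough; improve => (average 1 (/ 2 1))     = 1.5
; (try 1.5 2)         not good enough; improve => (average 1.5 (/ 2 1.5)) = 1.4166...
; (try 1.4166... 2)   not good enough; improve =>                           1.41421...
; (try 1.41421... 2)  good enough, so 1.41421... is returned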


Summary

The following table shows what we've learned so far, and what we have yet to learn.







ProceduresData
Primitive
Elements
+ - * / =23 1.738
Means of
Combination
( )
cond if

Means of
Abstraction
define

We've learned the primitive elements for both data and procedures. We've learned how to combine functions by composing expressions. We've also learned how to abstract procedures by defining our own. We have yet to learn methods for combining and abstracting data. The next couple of lectures will make a link between the procedures that we write and the processes that happen in the machine, and will show how to use Lisp not only to make small computations (as we've done so far) but also to capture “general conventional methods of doing things.”

The SICP Challenge

Late last year I wrote a post titled Books Programmers Don't Really Read. There was quite a bit of debate about the books I included on the list, and even some about a few I left off. The overwhelming consensus seemed to be that the one glaring omission was a little book called Structure and Interpretation of Computer Programs (SICP). I said at the time that I would add it to my queue of books to read, and now it's time to make good on that promise.

To make things even more interesting (and to keep myself motivated), I've decided to follow the MIT OpenCourseWare program that uses the book and blog about my progress.

The resources I plan to use include the following:
I'll keep this list updated with any additional resources that I find useful along the way. Suggestions are always welcome.

My personal goals for the program are:
  1. To learn the Scheme programming language (a dialect of Lisp).
  2. To read the entire SICP book.
  3. To watch all of the MIT video lectures and post my transcribed notes here on my blog.
  4. To do the exercises in each chapter of the book, and the projects listed in the MIT course syllabus. Solutions to exercises and projects (with explanations of both, of course) will also be posted here.
My next post will be my notes from the first lecture, with solutions to exercises to follow. I hope this series sparks some interest in other programmers who want to read SICP, learn Scheme, and maybe even take the course along with me.

Lecture Notes
Lecture 1A - Overview and Introduction to Lisp
Lecture 1B - Procedures and Processes: Substitution Model
Lecture 2A - Higher-order Procedures
Lecture 2B - Compound Data
Lecture 3A - Henderson Escher Example


Exercises
1.1 - 1.5
1.6 - 1.8
1.9 - 1.10
1.11, 1.12, 1.13
1.14, 1.15
1.16, 1.17, 1.18, 1.19
1.20
1.21, 1.22, 1.23, 1.24, 1.25, 1.26, 1.27, 1.28
1.29, 1.30, 1.31, 1.32, 1.33
1.34
1.35, 1.36, 1.37, 1.38, 1.39
1.40, 1.41, 1.42, 1.43, 1.44, 1.45, 1.46

2.1
2.2, 2.3
2.4, 2.5, 2.6
2.7, 2.8, 2.9, 2.10, 2.11, 2.12, 2.13, 2.14, 2.15, 2.16
2.17, 2.18, 2.19, 2.20, 2.21, 2.22, 2.23
2.24, 2.25, 2.26, 2.27, 2.28, 2.29, 2.30, 2.31, 2.32
2.33, 2.34, 2.35, 2.36, 2.37, 2.38, 2.39, 2.40, 2.41, 2.42, 2.43
2.44, 2.45, 2.46, 2.47, 2.48, 2.49, 2.50, 2.51, 2.52
2.53, 2.54, 2.55, 2.56, 2.57, 2.58, 2.59, 2.60, 2.61, 2.62, 2.63 - 2.65, 2.66


Other Resources:
I'm not the first blogger to attack SICP, and I probably won't be the last. If you're looking for help with a problem that I haven't gotten to yet (or if you just want to check my answers), here's a list of a few other places to find SICP solutions:
  1. Ken Dyck's SICP Solutions
  2. Community Scheme Wiki - SICP Solutions
  3. Eli Bendersky's website