The 3n+1 problem
The Racket Blog 2013-03-15
1 Introduction
I’m starting to go through Programming Challenges: The Programming Contest Training Manual, by Steven S. Skiena and Miguel Revilla. I thought it would be fun to show how to approach the problems using the Racket programming language. Rather than use a small, toy educational subset of the language, I’ll take off the kid gloves, and use whatever’s available in #lang racket.2 The problem
The 3n+1 problem is as follows:
consider a positive number n. The cycle length of n is the number of
times we repeat the following, until we reach n=1:
- If n is odd, then n ⇐ 3n+1
- If n is even, then n ⇐ n/2
2.1 The plan
Before we do any real coding, let’s figure out a plan of attack and how to test that plan.- We need a way of computing cycle-length.
- We need to run cycle-length over a range of values and pick out the biggest result.
; cycle-length: positive-integer -> positive-integer ; Computes the cycle length of n, according to ; the 3n+1 rules. > (define (cycle-length n) 42)This is certainly not right, but it’s a start. And it’s something we can test!
3 Test cases
We want
(cycle-length 1) ==> 1
; Load up rackunit: > (require rackunit) ; Let's express that test: > (check-equal? (cycle-length 1) 1) -------------------- FAILURE name: check-equal? location: (eval 4 0 4 1) expression: (check-equal? (cycle-length 1) 1) actual: 42 expected: 1 Check failure -------------------- ; A few more tests, according to the problem statement above: > (check-equal? (cycle-length 2) 2) -------------------- FAILURE name: check-equal? location: (eval 5 0 5 1) expression: (check-equal? (cycle-length 2) 2) actual: 42 expected: 2 Check failure -------------------- > (check-equal? (cycle-length 4) 3) -------------------- FAILURE name: check-equal? location: (eval 6 0 6 1) expression: (check-equal? (cycle-length 4) 3) actual: 42 expected: 3 Check failure -------------------- > (check-equal? (cycle-length 5) 6) -------------------- FAILURE name: check-equal? location: (eval 7 0 7 1) expression: (check-equal? (cycle-length 5) 6) actual: 42 expected: 6 Check failure -------------------- > (check-equal? (cycle-length 22) 16) -------------------- FAILURE name: check-equal? location: (eval 8 0 8 1) expression: (check-equal? (cycle-length 22) 16) actual: 42 expected: 16 Check failure --------------------All of our test cases fail. Hurrah!
4 A solution
Ok, now that we coded up the tests, let’s write a solution. We can write out a definition for cycle-length almost straight out of the problem statement:> (define (cycle-length n) (cond [(= n 1) 1] [(odd? n) (add1 (cycle-length (add1 (* 3 n))))] [(even? n) (add1 (cycle-length (/ n 2)))])) ; Let us try it out on a few values: > (cycle-length 1) 1 > (cycle-length 2) 2 > (cycle-length 22) 16If we run this through our test suite, we should be fairly confident that cycle-length is probably doing the right thing. (... modulo crazy inputs into the function such as 0. If we want to guard against such inputs, we can use the features in racket/contract.)
5 Optimizing cycle-length
How fast is the performance for cycle-length? Let’s try timing it for a few values, using the time utility. We’ll run cycle-length for a range of numbers, and see how long it takes.> (time (for ([i (in-range 1 100000)]) (cycle-length i))) cpu time: 890 real time: 889 gc time: 0
5.1 Introducing an accumulator
There are a few things we might do to improve the performance of this. Having the (add1 ...) in the definition, waiting until the recursion finishes up, seems ok, but I’m curious to see whether writing the definition with an explicit accumulator will help us.> (define (cycle-length n) (cycle-length/acc n 1)) ; Helper function: > (define (cycle-length/acc n acc) (cond [(= n 1) acc] [(odd? n) (cycle-length/acc (add1 (* 3 n)) (add1 acc))] [(even? n) (cycle-length/acc (/ n 2) (add1 acc))]))
With this reformulation, how does this do now?
This does help. Although we do get an improvement, let’s drop this
version for now and go back to our previous definition since it’s
simpler—and because the next potential optimization will work better
on it! 5.2 Adding memoization
Another thing that comes to mind is this: our first good version of cycle-length works recursively. More to the point: repeated use of cycle-length can reuse prior results that we computed earlier. Maybe memoization will help. Let’s try it out: we’ll keep a small table of results, and consult that to see if we’ve already encountered the solution before.; We'll maintain a table of known results. > (define table (make-hash)) > (define (cycle-length n) (cond ; Consult the table: [(hash-has-key? table n) (hash-ref table n)] [else ; If we can't find it, compute it... (define answer (cond [(= n 1) 1] [(odd? n) (add1 (cycle-length (add1 (* 3 n))))] [(even? n) (add1 (cycle-length (/ n 2)))])) ; ... and then put it into the table. (hash-set! table n answer) ; Don't forget to return the value back! answer]))Does the overhead of setting up this table pay for itself? Let’s see:
> (time (for ([i (in-range 1 100000)]) (cycle-length i))) cpu time: 217 real time: 217 gc time: 44Hey, not bad at all! That’s significantly better. We should make sure, of course, that all our test cases are running on this ok.
> (check-equal? (cycle-length 1) 1) > (check-equal? (cycle-length 2) 2) > (check-equal? (cycle-length 4) 3) > (check-equal? (cycle-length 5) 6) > (check-equal? (cycle-length 22) 16)All’s quiet on the cycle-length front. The tests are all passing.
5.3 Advanced: abstracting memoization to a helper macro
It turns out that the kind of memoization we’ve done here can be lifted out, so that we can easily perform it at will. That is, what we’re doing is something like the following:Given a definition that we’d like to memoize:
- create a table for exclusive use by the definition, and
- slightly tweak the definition’s body so it consults the table before going through computation.
In terms of Racket, we can say that like this:
; A little helper to centralize the memoization logic ; into a single rewrite rule: > (define-syntax-rule (define/memo (name id) body ...) (begin (define table (make-hash)) (define (name id) (cond [(hash-has-key? table id) (hash-ref table id)] [else (define answer (begin body ...)) (hash-set! table id answer) answer]))))
> (define/memo (cycle-length n) (cond [(= n 1) 1] [(odd? n) (add1 (cycle-length (add1 (* 3 n))))] [(even? n) (add1 (cycle-length (/ n 2)))]))which is nice because it’s easy to read.
6 Cycling back to a loop
Now that we have a fairly robust cycle-length function, we can do the rest of the problem. Given a range of numbers, we want to go through them, compute the cycle lengths, and pick out the biggest one. We can try to write this directly with a for/list to create a list of all the cycle-lengths, and apply the max across that list. Let’s write this in code:> (define (max-cycle-length-range i j) (apply max (for/list ([n (in-range i (add1 j))]) ; (add1 j) for inclusion ... (cycle-length i))))
Let’s write a few test cases to make sure that this is computing the right
thing:
; From the "Sample Output" section of ; http://acm.uva.es/p/v1/100.html > (check-equal? (max-cycle-length-range 1 10) 20) -------------------- FAILURE name: check-equal? location: (eval 31 0 31 1) expression: (check-equal? (max-cycle-length-range 1 10) 20) actual: 1 expected: 20 Check failure --------------------
> (define (max-cycle-length-range i j) (apply max (for/list ([n (in-range i (add1 j))]) (cycle-length n))))Thank goodness for test cases. Ok, let’s try that again.
> (check-equal? (max-cycle-length-range 1 10) 20) > (check-equal? (max-cycle-length-range 100 200) 125) > (check-equal? (max-cycle-length-range 201 210) 89) > (check-equal? (max-cycle-length-range 900 1000) 174)All passing? Much better!
6.1 Advanced: maximizing a loop
It would be nice if we could directly express taking the maximum across a for loop. We’re performing the maximum computation by first constructing a list of all the cycle lengths, then applying max over the whole list. Can we avoid that auxiliary list construction, and just compute the max as we’re running through the numbers? In fact, there are several variations of for loops in Racket, so maybe one of those variations will work for us. For example, we could use for/fold, which gives us enough expressive power to take the maximum during iteration.> (for/fold ([current-max -inf.0]) ([n '(3 1 4 1 5 9 2 6)]) (if (> n current-max) n current-max)) 9There are other versions of for loops, such as the one for taking sums (for/sum). But as of this writing, there doesn’t seem to be be a for/max form that lets us take the maximum directly. The question arises: how difficult is it to build for/max? It turns out that it’s not too bad, though it requires a little more macrology: we’ll use for/fold/derived to express our own for/max loop in terms of folding:
> (define-syntax (for/max stx) (syntax-case stx () [(_ clauses . defs+exprs) (with-syntax ([original stx]) #'(for/fold/derived original ([current-max -inf.0]) clauses (define maybe-new-max (let () . defs+exprs)) (if (> maybe-new-max current-max) maybe-new-max current-max)))]))Essentially, as we’re looping through numbers, we maintain a current-max, and update that max accordingly as we walk across the iteration. The rest of the code in for/max delegates the rest of the gruntwork to for/fold (technically, for/fold/derived). We must test this, of course:
; Edge case: if we take the maximum of no numbers, ; let's see -inf.0. > (check-equal? (for/max ([i '()]) i) -inf.0) > (check-equal? (for/max ([i '(3 1 4 1 5 9 2 6)]) i) 9) > (check-equal? (for/max [(i (in-range 1 23))] i) 22) > (check-equal? (for/max ([n '(3.14159 2.71828 1.61803)] [s '(-1 1 1)]) (* n s)) 2.71828) ; ... and of course... > (check-equal? (for/max [(i (in-range 900 (add1 1000)))] (cycle-length i)) 174)Looks good. With this, let’s express max-cycle-length-range in terms of for/max now:
> (define (max-cycle-length-range i j) (for/max ([n (in-range i (add1 j))]) (cycle-length n)))
7 Making a module
Now that we have most of the solution worked out, let’s make a module that encapsulates what we’ve done. Let’s lift up the definitions that we used to make the solution nice and pretty, and place them into "helpers.rkt":
"helpers.rkt"
#lang racket (provide for/max define/memo) (define-syntax (for/max stx) (syntax-case stx () [(_ clauses . defs+exprs) (with-syntax ([original stx]) #'(for/fold/derived original ([current-max -inf.0]) clauses (define maybe-new-max (let () . defs+exprs)) (if (> maybe-new-max current-max) maybe-new-max current-max)))])) (define-syntax-rule (define/memo (name id) body ...) (begin (define table (make-hash)) (define (name id) (cond [(hash-has-key? table id) (hash-ref table id)] [else (define answer (begin body ...)) (hash-set! table id answer) answer])))) (module+ test (require rackunit) (check-equal? (for/max ([i '()]) i) -inf.0) (check-equal? (for/max ([i '(3 1 4 1 5 9 2 6)]) i) 9) (check-equal? (for/max [(i (in-range 1 23))] i) 22) (check-equal? (for/max ([n '(3.14159 2.71828 1.61803)] [s '(-1 1 1)]) (* n s)) 2.71828))
Who knows? We might reuse "helpers.rkt" sometime.
(You may note that the bottom of "helpers.rkt" contains a
test
submodule which collects the unit tests that we’ve written. We can
run a module’s test suite by using
raco test.)
With our "helpers.rkt" in in hand, let’s put our solution in
"three-n-plus-one.rkt":
"three-n-plus-one.rkt"
#lang racket (require "helpers.rkt") (define/memo (cycle-length n) (cond [(= n 1) 1] [(odd? n) (add1 (cycle-length (add1 (* 3 n))))] [(even? n) (add1 (cycle-length (/ n 2)))])) (define (max-cycle-length-range i j) (for/max ([n (in-range i (add1 j))]) (cycle-length n))) (module+ test (require rackunit) (check-equal? (cycle-length 1) 1) (check-equal? (cycle-length 2) 2) (check-equal? (cycle-length 4) 3) (check-equal? (cycle-length 5) 6) (check-equal? (cycle-length 22) 16) (check-equal? (max-cycle-length-range 1 10) 20) (check-equal? (max-cycle-length-range 100 200) 125) (check-equal? (max-cycle-length-range 201 210) 89) (check-equal? (max-cycle-length-range 900 1000) 174) (check-equal? (for/max [(i (in-range 900 (add1 1000)))] (cycle-length i)) 174))
8 Integrating with I/O and a main
Finally, all this unit testing is fine and dandy, but we don’t actually read input from standard input. Let’s fix that, and modify "three-n-plus-one.rkt" so it can be run as the main entry point.
We can read individual lines as strings by iterating across
current-input-port with in-lines:
Once we have a line in hand, we can parse out the individual chunks
with read. read doesn’t normally read from strings
directly, so we first translate each string into a port-like value
using open-input-string.
Last of all, let’s add the following to the bottom of
"three-n-plus-one.rkt": (module+ main (for ([line (in-lines (current-input-port))]) (define line-port (open-input-string line)) (define i (read line-port)) (define j (read line-port)) (when (and (number? i) (number? j)) (printf "~a ~a ~a\n" i j (max-cycle-length-range i j)))))This defines a main submodule. When we run "three-n-plus-one.rkt" directly from the command line, it will run main:
$ cat sample-data.txt 1 10 100 200 201 210 900 1000 $ racket three-n-plus-one.rkt < sample-data.txt 1 10 20 100 200 125 201 210 89 900 1000 174