Sunday 4 March 2012

Ruby: Exceptions and Continuations

Years ago, I stumbled across a paper describing user interface continuations. At the time, the concept of continuations seemed like nothing short of wizardry: the program would be at one place, processing instructions, then would suddenly be somewhere else to collect a piece of data, and then back at the original place, with that collected data available to the computation that was happening originally.


A nice theoretical exercise, but that could never be useful, right?
Well, since then, I've learned Common Lisp, and then its sibling, Scheme. Continuations aren't a part of Common Lisp, but they're readily available in Scheme and, looking back, CL's awesome condition system starts to look like a regular exception framework with continuations thrown in for some fun.


So far, this is all sounding somewhat academic: papers and arcane languages don't have anything to do with what programmers do on a day-to-day basis, right?


Ruby is a modern scripting language, and supports continuations out of the box: they're available from the Kernel module as the callcc method, although from Ruby 1.9 onwards you have to require 'continuation' first (the method name has been lifted from Scheme, where it's called call/cc, or call-with-current-continuation, if you like typing).


This method takes a one-argument block. The argument is supplied by the system and represents the current continuation: a snapshot of where the program is at the moment the continuation is created. So what can you do with that? Well, continuations can be call'ed, and when one is, control jumps back to the callcc call, which returns whatever value the continuation was call'ed with, regardless of where in the program that call was made. The continuation is a regular object that can be stored in data structures, passed around, etc., but when it's invoked, program flow resumes from the site of the callcc.
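
To make that concrete, here's a minimal sketch of callcc returning more than once. The method name replay_callcc and the state hash are just names I've made up for illustration; the only library calls are callcc and Continuation#call. The hash is used because it lives on the heap, so whatever is stored in it is still there after the jump back.

```ruby
require 'continuation'  # needed on Ruby 1.9 and later

# Hypothetical demo: callcc "returns" three times from a single call site.
def replay_callcc
  state = { cont: nil, seen: [] }  # heap object, so its contents survive re-entry

  n = callcc do |cc|
    state[:cont] = cc  # stash the continuation for later
    1                  # first time through, callcc returns the block's value
  end

  state[:seen] << n
  # Calling the continuation jumps back to the callcc above,
  # which now returns n + 1 instead of the block's value.
  state[:cont].call(n + 1) if n < 3
  state[:seen]
end

p replay_callcc  # prints [1, 2, 3]
```

The first pass gets 1 from the block; the next two passes get the values handed to call. Recent Rubies may print a warning that callcc is obsolete, but the behaviour is the same.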


This needs an example.


I mentioned Common Lisp's condition system earlier. It's analogous to the exception mechanism in languages like Java and Python, with one notable difference: when a condition is signalled, the stack is not unwound to an enclosing exception handler. Instead, the stack is searched for a handler, which then gets to look at the condition and, if it determines that there is remedial action that can be taken, can provide information to the exception site that can tell the code there how to proceed. These are called restarts.
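
For contrast, here's a small sketch of what the ordinary unwinding behaviour costs you in plain Ruby. The names process_all and unwind_demo are made up for this illustration. By the time the rescue clause runs, the loop's stack frame is gone, so the handler has no way to tell the loop to carry on with the remaining entries.

```ruby
# Hypothetical worker: stops dead at the first entry it dislikes.
def process_all(entries, done)
  entries.each do |entry|
    raise ArgumentError, "malformed entry: #{entry}" if entry.odd?
    done << entry
  end
end

def unwind_demo
  done = []
  begin
    process_all([2, 4, 5, 6], done)
  rescue ArgumentError
    # The stack has already been unwound to here: we can log or re-raise,
    # but we cannot resume the loop inside process_all.
  end
  done
end

p unwind_demo  # prints [2, 4]; the 6 is never looked at
```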


Where would this be useful?
A simple example from Peter Seibel's Practical Common Lisp is a log file parser. Imagine you're writing this, and you've sensibly layered the different functions: from the abstract 'parse a log file with this filename', through 'process all log entries in the file', into 'extract a single log entry', and then 'analyse a single log entry'.


But what if something goes wrong at the analysis level? What do you do with a malformed entry? If you're working with functions, you just have to pass something into the function that tells it how to handle that. If it's an object, set some property on the object for this situation. But what if it is, as in this case, several layers down from the application's interface? Well, you can pass some property dictionary or other miscellaneous contextual information into either the intervening functions or objects.


This kind of action indicates a break in reasoning: you're setting or passing properties on something that really shouldn't need to care about their existence. This kind of clutter makes maintenance programming difficult, as objects and functions are littered with things that they don't use themselves, but instead are made aware of for the sole purpose of handing over to something else. This has adverse effects upon reusability, as it's now assumed that these objects are part of a particular call chain.
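
Here's a rough sketch of that clutter, using the log parser layering from above. All of the names (parse_log, process_entries, analyse_entry, on_malformed) are hypothetical, and the "analysis" is just an integer conversion standing in for real work. Notice that the two upper functions take on_malformed only so that they can hand it down.

```ruby
# Top layer: knows the policy, but has to push it through everything below.
def parse_log(lines, on_malformed)
  process_entries(lines, on_malformed)
end

# Middle layer: has no use for on_malformed itself; it only forwards it.
def process_entries(lines, on_malformed)
  lines.map { |line| analyse_entry(line, on_malformed) }
end

# Bottom layer: the only place the policy is actually consulted.
def analyse_entry(line, on_malformed)
  Integer(line)
rescue ArgumentError
  on_malformed.call(line)
end

p parse_log(%w[1 x 3], ->(_line) { 0 })  # prints [1, 0, 3]
```

Every new layer in between would need the same extra parameter, which is exactly the coupling described above.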


Looking at it, the only layers that need to know about the problem are the bottom one, where the problem occurs, and the top one, where the business logic lives.


With continuations, we can make this happen. Here's an example.


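  require 'continuation'  # callcc lives here on Ruby 1.9 and later

  # Carries the continuation captured at the failure site, plus the offending value.
  class Condition < Exception
    attr_accessor :continuation, :payload
    def initialize(continuation, payload)
      self.continuation = continuation
      self.payload = payload
    end
    # Resume at the callcc site, making callcc return the given value.
    def continue(value)
      @continuation.call(value)
    end
  end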
  
  def topLevel
    begin
      intermediateLayer1
    rescue Condition => c
      c.continue(0 - c.payload)
    end
  end
  
  def intermediateLayer1
    intermediateLayer2
  end
  
  def intermediateLayer2
    intermediateLayer3
  end
  
  def intermediateLayer3
    fragileLayer
  end
  
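  def fragileLayer
    (1..5).each { |i|
      # Capture "here" first; a later cc.call(v) makes this callcc return v.
      i = callcc { |cc|
        begin
          processEntry(i)
        rescue StandardError
          # Can't decide what to do at this level: hand cc and the bad value upwards.
          raise Condition.new(cc,i)
        end
      }
      puts i
    }
  end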
  
  def processEntry(entry)
    (entry % 2 == 1) ? (raise "Can't deal with odd numbers!") : entry
  end
  
  topLevel()


The purpose of this program is quite simple: a top-level caller gets some work done (in this case, printing the numbers from 1 to 5) by asking a lower layer. The intermediate layers exist to demonstrate that there's no direct linkage between the raiser of the exception and its handler.


In this example, the bottom layer refuses to work with odd numbers, and raises an exception when given one. This is caught, but the decision as to what to do next is not appropriate for that low level: the business logic needs to make that decision, but it's several layers up in the stack.


A continuation was captured with callcc just before the entry was processed, so the rescue clause can wrap it in a new Condition and raise that. There's nothing special about these objects: they just encapsulate the continuation and the data that caused the error.


Normally, an exception propagating up the stack causes the intermediate stack frames to become inaccessible and therefore eligible for garbage collection. However, the continuation captured in the exception that's just been raised refers to the stack frame in which it was created, so the stack remains live, even while control flow is being unwound through it.


Now, the wizardry: the top level handler has access to the continuation and the problematic value, so it can decide what to do next. It can re-raise the exception, or it can provide a new value to the original source of the exception to be used in its place. Continuations can be call'ed, and they take a value to treat as the return value of the callcc call. Lower-level processing can continue as though it hadn't been interrupted; the intermediate layers are not unwound or invoked again.


The output of the above is just:

-1
2
-3
4
-5


Now, there's no need to follow this precise pattern. The value that's returned could instead be a Symbol that indicates which of a range of choices should be executed. It could be a Proc, which the receiver is expected to call.
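
As a sketch of the Symbol variant, here's a cut-down version in which the handler picks a named restart instead of supplying a replacement value. The Condition class mirrors the one above; analyse_all, run, and the restart names :use_zero and :skip are made up for this illustration, and Integer() again stands in for real analysis.

```ruby
require 'continuation'  # needed on Ruby 1.9 and later

# Same idea as the Condition class above: a continuation plus the bad value.
class Condition < Exception
  attr_reader :continuation, :payload

  def initialize(continuation, payload)
    super("condition signalled for #{payload.inspect}")
    @continuation = continuation
    @payload = payload
  end

  def continue(value)
    @continuation.call(value)
  end
end

# Low level: knows which restarts exist, but not which one to pick.
def analyse_all(entries)
  results = []
  entries.each do |entry|
    choice = callcc do |cc|
      begin
        results << Integer(entry)
        :done
      rescue ArgumentError
        raise Condition.new(cc, entry)
      end
    end
    # After a restart, choice holds whatever Symbol the handler passed back.
    results << 0 if choice == :use_zero
    # :skip and :done need no further action.
  end
  results
end

# Top level: picks the restart by name and resumes the low-level loop.
def run(entries, restart)
  analyse_all(entries)
rescue Condition => c
  c.continue(restart)
end

p run(%w[1 x 3], :use_zero)  # prints [1, 0, 3]
p run(%w[1 x 3], :skip)      # prints [1, 3]
```

The low level owns the list of possible recoveries and the top level owns the choice, which is roughly the split that CL's restarts give you.
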
Pretty neat, huh?
