How to write a Lisp evaluator in Haskell
I promised my students last semster that I’d show how to write (some of) a Lisp in Haskell, in order to demonstrate the usefulness of algebraic data types for prototyping structured applications – and also for fun, obviously. Source code for everything in this post can be found here.
This short tutorial is targeted at Haskell beginners, and doesn’t even use a single monad, unlike more popular and full-featured tutorials like write yourself a scheme and write you a haskell. I achieve brevity and simplicity in the following way:
-
Only implementing the “EPL” part of the REPL, because parsing is Boring.
-
Not implementing macros.
-
Cutting lots of corners.
Ready? Let’s begin.
Abstract syntax
We’re going to assume somebody already implemented the Lisp’s Reader for us; it has the following type:
reader :: [Char] -> Program
i.e. the Reader converts a character stream into a Haskell data structure which we can manipulate. So what’s a Lisp program? Why, nothing more than a sequence of S-Expressions.
type Program = [SExp]
data SExp = Literal Integer -- e.g. 5, 42
| Symbol String -- e.g. x, first, lambda, -
| SExp [SExp] -- e.g. (+ (* 3 2) 4), (define fact (lambda n (fact (- n 1))))
Note that we are using Strings in our compiler to denote Lisp symbols; the Lisp we are writing has no string primitives of its own, although there’s no reason why it can’t.
It’s handy to implement an instance of Show
for SExp
(omitted) for
debugging.
Values
Now let’s define datatypes for values, i.e. the thing the S-Expressions evaluate into. This is also where we would define a type for macros… if we had any.
data Value = Number Integer
| List [SExp]
| Lambda (Value -> Value)
-- omitted implementation of `Show` --
Our syntax for lambda functions will look like (lambda x (+ x 2))
, and
only takes one argument for simplicity. To make functions with multiple
arguments one could just write functions that just return functions, e.g.
(lambda x (lambda y (+ x y)))
. We’ll introduce currying by default, so
(f x y z)
is equivalent to
(((f x) y) z)
We also need to encapsulate the symbol table in a type, which associates
Value
s to their identifiers. Presumably the symbol table already has some
definitions in it before the program is evaluated, e.g. with definitions of
+
and cons
.
type SymbolTable = [(String, Value)]
Evaluation
Evalutating an S-Expression (in this language at least) always produces a value and sometimes also updates the symbol table (e.g. when we define a new function). Hence we have the type:
-- Yes, a State monad is applicable here, but it's not really necessary
-- for this exposition.
eval :: SExp -> SymbolTable -> (Value, SymbolTable)
Now all we need to do is define evaluation on a case-by-case basis for each
kind of SExp
. The first two are trivial to define:
-- If the expression is just an integer, it has no effect on the symbol table
-- but we should print out that integer.
eval (Literal n) table = (Number n, table)
eval (Symbol s) table =
case (lookup s table) of
Just value -> (value, table)
Nothing -> error ("The value " ++ s ++ " is not defined! Aborting.")
Certain keywords are neither functions nor macros, but built right in to the
language. Specifically, we’ll add define
, quote
, and lambda
. I leave
let
and defmacro
as exercises.
In the interest of simplicity, our language will require that define
be a
top-level expression (though we won’t throw an error if it’s not). Hence
something like (+ (define x 3) 4)
will not update the symbol table.
eval (SExp (Symbol "define" : Symbol s : expr : [])) table =
let value = fst $ eval expr table
in
-- Return the new value and update the symbol table, overwriting the old
-- definition in the stupidest way possible.
(value, (s, value) : table)
-- Note that in real Lisps, the expression '(1 2 3) is a *reader shorthand*
-- that expands to (quote 1 2 3).
eval (SExp (Symbol "quote" : exprs)) table = (List exprs, table)
Now, lambda
is a bit less straightforward.
To evaluate a term like (lambda x (+ x 2))
we need to construct
a Haskell function of type Value -> Value
which will associate the given
value with the symbol x
. Luckily, eval
does just that.
eval (SExp (Symbol "lambda" : Symbol s : expr : [])) table =
let fn = \value -> fst $ eval expr ((s, value) : table)
in
(Lambda fn, table)
And finally, the general case, where the first value evaluates to some function and we do curried function application.
-- This is the general case,
eval (SExp (car : cdr)) table =
let (fn, _) = eval car table
args = map fst $ map (\expr -> eval expr table) cdr
in
(applyRecursive fn args, table)
-- This will get us Curried function application.
applyRecursive :: Value -> [Value] -> Value
applyRecursive f args = foldr next f args where
next (Lambda fn) value = fn value
next not_a_function _ = error ("Cannot apply non-function " ++ (show not_a_function))
(R)EPL
All that’s left is to show how to evaluate a sequence of S-Expressions.
repl :: Program -> [Value]
repl prog = fst $ foldr eval_and_print (customPrelude, []) prog where
eval_and_print :: SExp -> ([Value], SymbolTable) -> ([Value], SymbolTable)
eval_and_print expr (values, table) =
let (newval, newtable) = eval expr table
in
(newval : values, newtable)
It’s not terribly fun to write some primitive functions like +
and cons
in
the meta-language, but certainly possible. We’d put load them all into the
initial symbol table customPrelude :: SymbolTable
mentioned in repl
.
main = putStrLn $ repl
[SExp [Symbol "+", Literal 5, Literal 2]]
-- Prints "[7]"