We're still under construction!

And you just found out where. We're still working on this page. How did you get in here, anyway?

Writing a Lambda Calculus interpreter in Haskell

We write a rather small lambda calculus expression parser, evaluator, and repl in Haskell.

Lambda Calculus real quick

Lambda Calculus is a universal model of computation proposed by Alonzo Church in the 1930's, and it is an incredibly small one at that. It consists of simple expressions constructed out of 3 things:

variables
lambda abstractions
function applications

What are these? In friendly terms, a variable is just something that we don't know yet, and so we've given it a name instead. A lambda abstraction tells us how to replace a variable with it's value when we learn it. And a function application is the act of learning what a lambda abstraction's variable is.

How do we use this, say, to add two plus five? Well, it's easy! add five two, where add is a function taking two arguments.

Whereas Turing machines invoke clunky mechanisms, Church's calculus is a language, with a flexibility and flow not unlike that of the spoken language, and how each spoken word produces a sentence fragment that is awaiting the next word of the rest of the sentence.

This article isn't going to explain lambda calculus in depth, as it rather serves as a nice starting point for such explorations later. If you've been doing Haskell, you've been doing lambda calculus the whole time, anyway.

With that out of the way, let's get started! First off, we'll create a new cabal project file. You may prefer to use stack if you like. We're going to pull in a few libraries - containers for the usual suspects, parsec for parsing, and repline for interactive input - and set some extensions (nothing fancy, just FlexibleContexts, NoImplicitPrelude, and OverloadedStrings)

cabal project file - lc-simple.cabal

cabal-version:      3.0
name:               lc-simple
version:            0.0.1
license:            NONE
author:             Leo
maintainer:         leo@apotheca.io
build-type:         Simple
extra-source-files: README.md
library
    exposed-modules:
        Lambda.Simple
        Lambda.Simple.Eval
        Lambda.Simple.Expr
        Lambda.Simple.Parse
        Lambda.Simple.Repl
    default-extensions:
        FlexibleContexts
        NoImplicitPrelude
        OverloadedStrings
    build-depends:
        base >= 4 && < 5,
        containers,
        parsec,
        repline
    hs-source-dirs:   src
    default-language: Haskell2010

Lambda.Simple.Expr

Let us define our expression data type. As you can see, an expression is made up of variables, lambda abstractions, and function applications.

The most important part is this part:

data Expr
    = Var Name
    | Lam Name Expr
    | App Expr Expr

Our definition here is a lot less threatening than our earlier descriptions, don't you think? As you can see, an Expr is either a Var which is just a Name, a Lam which is a Name and another Expr, or a App which is two Exprs.

In addition to our data type, we've added a few convenience functions for encoding our expressions to strings, and for pretty printing them - it helps to be able to do something with what we're building.

haskell - src/Lambda/Simple/Expr.hs

module Lambda.Simple.Expr where

import Prelude

type Name = String

-- The expression data type

data Expr
    = Var Name
    | Lam Name Expr
    | App Expr Expr
    deriving (Show, Eq)

-- Encoding to string

encode :: Expr -> String
encode = go where

    go (Var n) = n
    go (Lam n b@(Lam _ _)) = "\\ " ++ n ++ tail (go b)
    go (Lam n b) = "\\ " ++ n ++ " . " ++ go b
    go (App e@(App _ _) f) = go e ++ " " ++ wrap f
    go (App e f) = wrap e ++ " " ++ wrap f

    wrap x@(Var _) = go x
    wrap x@(Lam _ _) = "(" ++ go x ++ ")"
    wrap x@(App _ _) = "(" ++ go x ++ ")"

-- Pretty printing

putExpr :: Expr -> IO ()
putExpr = putStr . encode

putExprLn :: Expr -> IO ()
putExprLn = putStrLn . encode

Lambda.Simple.Parse

So how do we go about actually using this data type? We could construct it manually, of course - we could define the identity function id = Lam "x" (Var "x"). But that is rather tedious. It is better to build a parser!

Having a parser serves a few purposes: first, it lets us use a better format for our lambda calculus code (\ x y . x is easier to read and write than Lam "x" (Lam "y" (Var "x"))) and secondly, it lets us write lambda calculus outside of Haskell, which comes in handy when we're building our interpreter later.

We will use the parsec library that we pulled in earlier. parsec is a parser-combinator library, which lets you use smaller parsers to build up a larger parser, and can result in parsers with similar shapes to the data type they are parsing. Since our Expr type is so small, it isn't hard to knock together a parser.

First, we build a parser for parsing Names, which is just captures an alphanumeric string. Then, a Var parser just wraps a constructor around a Name. Lam is a tad more complex, as it looks for one or more Names between a \ and a . followed by an Expr.

The parser for App is the harder bit to unravel. Naively, we could just say that an App is an Expr followed by a space and another Expr, and then use that to build our final parser with something like expr = var | lam | app, which looks rather like our data type. However, if we do it that way, our parser will loop infinitely, because to parse an App, we start by parsing an Expr, which could be an App that needs more parsing... Yikes! Rather than let this happen, we cut the loops, and handle App separately from Var and Lam. We do this by using chainl1 which lets us say that we have a series of Expr with App between them.

As you can see, our expression parser itself fits in just 14 lines - the rest is imports, and a few convenience functions for running our parser. Despite the small size, it parses parenthesis and spaces properly, and even handles multiple lambda arguments (so we can say \ x y z . ... instead of \ x . \ y . \ z . ...). Not bad at all.

With the parser out of the way, we should be able to run parse "\\ x y . x" and get Lam "x" (Lam "y" (Var "x")). (Note the double-slash - because it is a quoted haskell String that uses backslash to escape, we need to use double backslash with the first escaping the second.)

Neat! Now we can build expressions; that means it's time to evaluate them.

haskell - src/Lambda/Simple/Parse.hs

module Lambda.Simple.Parse where

import Prelude

import Data.Either
import Data.Functor

import Control.Monad

import Lambda.Simple.Expr

import qualified Text.Parsec as Parsec
import Text.Parsec hiding (parse, runParser)

runParser str = Parsec.runParser (expr <* eof) () "" str

parse str = case runParser str of
    (Right r)   -> r
    (Left e)    -> error $ show e

expr :: (Stream s m Char) => ParsecT s u m Expr
expr = chainl1 (nonApp <* spaces) (pure App) <?> "expr" where
    name = do
        head <- letter
        tail <- many (letter <|> digit <|> char '\'')
        (return (head:tail)) <?> "name"
    var = (Var <$> name) <?> "var"
    lam = do
        ns <- char '\\' >> spaces >> sepEndBy1 name spaces
        e <- char '.' >> spaces >> expr
        (return (foldl1 (.) (fmap Lam ns) e)) <?> "lam"
    nonApp = (parens expr <|> lam <|> var) <?> "non-app expr"
    parens = between
        (char '(' >> spaces)
        (char ')' >> spaces)

Lambda.Simple.Eval

So far, we've had to do any evaluation manually, just as we had to do the building of expressions manually before we constructed our parser. We know that the evaluation of (\ x y . x) first second yields first, and that the evaluation of (\ x y . y) first second yields second. We're just going to implement this logic to do this automatically.

To start, we've added a few NameSupply definitions. We use this when we perform alpha conversion, which is what we do if we have a naming collision because we accidentally used the same variable name for two different things.

Then we have a bunch of stuff about FreeVars. A variable is either bound - that is, named in a lambda abstraction - or it is free, meaning it references. A lambda expression can only be evaluated if it contains no free vars, or equivalently, if we can provide the matching variables. So in the expression \ x y . x z, both x and y are bound (even though only x is used), while z is free. We can turn a free variable into a bound one by wrapping the full expression in a lambda abstraction with a matching name. This is where we might have naming collisions, because the free-variable-value's expression might contain variable names that are already in use in the full expression. All this FreeVars stuff helps us detect and ameliorate this.

Once we get to the eval function, it is rather small. It just recurses, beta reducing and calling itself until it stops. So what is beta reduction?

Beta reduction is where, if we have a function application with a lambda abstraction as the first of its two expressions, we can go ahead and give the lambda abstraction the second expression as its argument, substituting the expression for the matching variable in the expression. We use alpha conversion in the case of the earlier mentioned naming conflict. And so we have functions subst alpha and beta.

That's it! Really! We can test it out. A simple test of eval . parse $ "(\ x y . x) first second" should return Var first

If we take the church numerals 2 = \ s z . s (s z) and 3 = \ s z . s (s (s z)), and supply them as arguments to the function add = \ m n s z . m s (n s z), we get (\ m n s z . m s (n s z)) (\ s z . s (s z)) (\ s z . s (s (s z))) which when evaluated repeatedly to normal form yields \ s z . s (s (s (s (s z)))) which when provided one and zero as arguments, results in what is obviously five: one (one (one (one (one zero))))

Note: We'll take care of that normal form stuff in the future revision of this article. For now we'll allow this limitation in the name of a shorter article.

All together, ignoring whitespace, comments, and type hints, the evaluator is just shy of 50 lines of code, with the core of it being the 17 lines for eval subst alpha and beta.

haskell - src/Lambda/Simple/Eval.hs

module Lambda.Simple.Eval where

import Prelude
import Data.Bool
import Data.Either

import Data.Set (Set)
import qualified Data.Set as Set

import Control.Monad

import Lambda.Simple.Expr

-- Name supplies

type NameSupply = [Name]

-- a0,b0,...,z0,a1,b1,...
digitEndian :: NameSupply
digitEndian = zipWith (\i a -> a:(show $ div i 26)) [0..] . concat . repeat $ ['a'..'z']

-- a,b,...,z,aa,ab,...,zz,aaa,...
bigEndian :: NameSupply
bigEndian = [1..] >>= flip replicateM ['a'..'z']

-- a,b,...,z,aa,ba,...,za,ab,bb,...
littleEndian :: NameSupply
littleEndian = [ v:vs | vs <- [] : littleEndian, v <- ['a'..'z'] ]

-- Free Vars

type FreeVars = Set Name

freeIn :: Name -> Expr -> Bool
freeIn n (Var m)                  = n == m
freeIn n (Lam m b)    | n == m    = False
freeIn n (Lam m b)    | n /= m    = freeIn n b
freeIn n (App x y)                = freeIn n x || freeIn n y

-- NOTE: Observe how this mirrors freeIn
freeVars :: Expr -> FreeVars
freeVars (Var n)    = Set.singleton n
freeVars (Lam n b)  = Set.filter (/= n) (freeVars b)
freeVars (App e f)  = Set.union (freeVars e) (freeVars f)

extendFree :: Name -> FreeVars -> FreeVars
extendFree = Set.insert 

freeVarList :: [Name] -> FreeVars
freeVarList = Set.fromList

isFree :: Name -> FreeVars -> Bool
isFree = Set.member

isFreeIn :: FreeVars -> Name -> Bool
isFreeIn = flip isFree

freeName :: NameSupply -> FreeVars -> Name
freeName supply vars = head $ filter (not . isFreeIn vars) supply

-- Eval

eval :: Expr -> Expr
eval (Lam n e) = Lam n (eval e)
eval (App f a) = applyWhen (isLam f) eval $ beta (eval f) (eval a)
eval e = e

-- Substitute (n)ame with (e)xpr in (b)ody [of Lam]
--  subst n E B ~> B[n := E]
subst :: Name -> Expr -> Expr -> Expr
subst n e (Var m)   | n == m    = e
subst n e (Lam m b) | n /= m    = Lam m (subst n e b)
subst n e (App f a)             = App (subst n e f) (subst n e a)
subst _ _ b                     = b

-- Alpha conversion / reduction
alpha :: Set Name -> Expr -> Expr
alpha vars (Lam n b) | isFreeIn vars n = Lam m b'' where
    m   = freeName digitEndian vars
    b'  = subst n (Var m) b
    b'' = (alpha (extendFree m vars) b')
alpha vars (Lam n b) = Lam n (alpha vars b)
alpha _ e = e

-- Beta reduction
beta :: Expr -> Expr -> Expr
beta (App f a) b = App (beta f a) b
beta (Lam n b) e = subst n e (alpha (freeVars e) b)
beta f a = App f a

-- Helpers

isLam :: Expr -> Bool
isLam (Lam _ _) = True
isLam _         = False

applyWhen :: Bool -> (a -> a) -> a -> a
applyWhen True f x = f x
applyWhen _    _ x = x

Lambda.Simple.Repl

Up to this point, we've been building our expression evaluator while using cabal repl / ghci to handle input and output. This is great for development, but to run our interpreter outside of cabal repl / ghci, we'll have to supply our own repl. Thankfully, this is not that difficult. We're going to first build our own standard READ-EVAL-PRINT-LOOP, just to show how, and then we'll quickly replace it with one using the repline package, which provides a nice shell experience with almost no work.

haskell - src/Lambda/Simple/Repl.hs

module Lambda.Simple.Repl where

import Prelude hiding (read, print)
import System.IO hiding (read, print)

import Lambda.Simple.Eval
import Lambda.Simple.Expr
import Lambda.Simple.Parse

import Control.Monad.IO.Class

import System.Console.Repline hiding (ReplOpts(..))

-- The simple repl

repl :: IO ()
repl = do
    putStr "> "
    hFlush stdout
    input <- getLine
    case input of
        (':':cmd) -> do
            case cmd of
                "q" -> return ()
                _   -> do
                    putStrLn $ "Unrecognized command: " ++ cmd
                    hFlush stdout
                    repl
        _ -> do
            let e = parse input
                p = eval e
            putExprLn p
            repl

-- The fancy repline interpreter

type Repl a = HaskelineT IO a

banner :: MultiLine -> Repl String
banner SingleLine = return "> "
banner MultiLine  = return "| "

command :: String -> Repl ()
-- command x = liftIO $ putStrLn $ "Command was: " ++ x
command input = liftIO $ putExprLn p where
    e = parse input
    p = eval e

options :: [(String, String -> Repl ())] 
options =
    [   ("r", const $ liftIO $ putStrLn "Reloading...")
    ,   ("q", const $ abort)
    ,   ("?", const $ liftIO $ putStrLn
            "lci - lambda calculus interpreter\n\
            \Usage\n\
            \   Enter lambda calculus terms to evaluate them.\n\
            \   Example:\n\
            \       > (\\ x y . x) first second\n\
            \       first\n\
            \Option:\n\
            \   :r - reload\n\
            \   :q - quit\n\
            \   :? - display this help\n\
            \Done."
        )
    ]

sigil :: Maybe Char
sigil = Just ':'

multiLineSigil :: Maybe String
multiLineSigil = Nothing

completer :: CompleterStyle IO
completer = Word $ \ _ -> return []

initializer :: Repl ()
initializer = do
    liftIO $ putStrLn "Starting lci..."
    liftIO $ putStrLn "Enter ':?' for help. Press Ctrl-D to quit."

finalizer :: Repl ExitDecision
finalizer = do
    liftIO $ putStrLn "Leaving lci."
    return Exit

interpreter :: IO ()
interpreter = evalRepl
    banner
    command
    options
    sigil
    multiLineSigil
    completer
    initializer
    finalizer

And that's it!

Stay tuned! We'll expand more on the lambda calculus interpreter in future articles

TODO: Add repo / source download links.

- Leo D., Undated, revised / edited / published Aug 2022