How to Write a Lisp Interpreter in Python: A Guide to Lispy
How to Write a Lisp Interpreter in Python: A Guide to Lispy
Implementing a Lisp interpreter provides a fundamental understanding of how computers and compilers work. Peter Norvig's Lispy is a concise implementation of a Scheme dialect written in Python 3, designed to demonstrate the "Maxwell's Equations of Software"—the core principles of language interpretation in their simplest form.
The Architecture of a Language Interpreter
A language interpreter consists of two primary components: parsing and execution.
1. Parsing
Parsing converts a sequence of characters (the program) into an internal representation, typically an Abstract Syntax Tree (AST). In Lispy, this process is split into two phases:
- Lexical Analysis (Tokenization): The input string is broken into tokens (parentheses, symbols, and numbers). Lispy achieves this by adding spaces around parentheses and using Python's
str.split. - Syntactic Analysis: The tokens are assembled into a nested list structure. The
read_from_tokensfunction recursively processes tokens; if it encounters an open parenthesis, it builds a list of sub-expressions until it finds the matching closing parenthesis.
2. Execution (Evaluation)
Execution processes the AST according to the semantic rules of the language. In Lispy, this is handled by the eval function, which determines the value of an expression based on its type (atom or list) and its specific form (e.g., a conditional or a procedure call).
Lispy Calculator: The Simplified Subset
To illustrate the basics, Lispy begins as a "calculator" supporting five basic syntactic forms:
| Expression | Syntax | Semantics |
|---|---|---|
| Variable Reference | symbol |
Returns the value of the variable from the environment. |
| Constant Literal | number |
Evaluates to the number itself. |
| Conditional | (if test conseq alt) |
Evaluates test; returns conseq if true, otherwise alt. |
| Definition | (define symbol exp) |
Binds the value of exp to symbol in the environment. |
| Procedure Call | (proc arg...) |
Evaluates proc and args, then applies the procedure to the values. |
Managing State with Environments
An environment is a mapping from variable names to values. Lispy uses a global environment containing standard procedures (e.g., sqrt, max, +, *).
To support local variables and nested scopes, Lispy defines Env as a subclass of Python's dict. Each environment can have an outer environment, creating a chain of scopes. When looking up a variable, the interpreter uses a find method to search the innermost environment first, moving outward to the global environment if the variable is not found. This mechanism implements lexical scoping.
Extending to Full Lispy
Full Lispy adds three critical special forms to move from a simple calculator to a nearly complete Scheme subset:
- Quotation (
quote): Returns the expression literally without evaluating it. - Assignment (
set!): Updates the value of a previously defined symbol in the environment. - Procedures (
lambda): Creates a new procedure object. AProcedureobject stores the parameter names, the body expression, and the environment in which it was defined.
When a lambda procedure is called, a new environment is created. This environment binds the passed arguments to the procedure's parameters and sets the procedure's definition environment as its outer scope.
Performance and Completeness
Lispy is designed for educational clarity rather than production use. Its characteristics include:
- Size: Extremely compact, consisting of approximately 117 non-comment lines of code.
- Speed: Capable of computing
(fact 100)in roughly 0.003 seconds. - Limitations: It lacks several standard Scheme features, including tail-call optimization,
call/cc, strings, characters, and comprehensive error recovery.
Community Insights and Perspectives
Industry practitioners and educators view the implementation of a Lisp as a rite of passage for programmers.
"I can't recommend highly enough to implement a simple lisp (or a forth). Illuminating experience and it will also help you see (among many other things) the parentheses in a different light."
Some developers have noted that while Lispy is an excellent starting point, it serves primarily as a bootstrap or educational tool. Others have pointed out the isomorphism between Lisp's tree-like structure and the phrase structure grammars used by linguists to annotate sentence structures.