First, we need to decide on a syntax. Let’s start with Rust’s syntax:
fn frobnicate(foo: String, bar: String) -> Vec<Bar> {
...
}
Eldiro doesn’t have types (yet), so we need to remove those:1
fn frobnicate(foo, bar) {
...
}
Parentheses and commas are annoying, so let’s get rid of those:
fn frobnicate foo bar {
...
}
Often we have functions whose body is just a single statement. Here’s a contrived example:
fn add x y {
x + y
}
It can be convenient to not force function bodies to be blocks in these cases. Using the syntax from C#:
fn add x y => x + y
Since blocks are expressions (which are statements), we can also do this:
fn add x y => {
x + y
}
To reduce the number of ways to do one thing, we’ll remove the ‘block-only’ syntax, instead forcing the use of the =>
everywhere.
Why not use closures for everything?
In the past I would have preferred for all function definitions to be binding definitions whose value is a closure. Here’s an example in Rust:
// Function-specific syntax
fn add(x: i32, y: i32) -> i32 {
x + y
}
// More general closure syntax
let add = |x: i32, y: i32| -> i32 { x + y };
A comment from u/witty___name clarified that this approach is problematic, though. (I’ll be paraphrasing their explanation here.)
Eldiro doesn’t have any way to catch undefined functions until runtime at the moment, which means that this approach works fine, even with recursion. However, if Eldiro gains a way to check for undefined functions without running the program, then using closures for function definitions exclusively would lead to problems. Consider this closure definition in Rust:
let overflow_stack = || overflow_stack();
This doesn’t compile, since bindings aren’t in scope on their left-hand side. If they were, it would allow things like
let x = x;
which is clearly nonsensical.
Because of this, I’ve chosen for function definitions to have a dedicated syntax in Eldiro.
Parsing
Let’s get started on the parser. Function definitions aren’t an expression – they’re a statement, so let’s go to stmt.rs
and add a variant to that enum:
use crate::binding_def::BindingDef;
use crate::env::Env;
use crate::expr::Expr;
use crate::func_def::FuncDef;
use crate::val::Val;
// snip
#[derive(Debug, PartialEq)]
pub(crate) enum Stmt {
BindingDef(BindingDef),
FuncDef(FuncDef),
Expr(Expr),
}
To get our project compiling as quickly as possible, let’s add a case for Stmt::FuncDef
to Stmt::eval
:
impl Stmt {
// snip
pub(crate) fn eval(&self, env: &mut Env) -> Result<Val, String> {
match self {
Self::BindingDef(binding_def) => {
binding_def.eval(env)?;
Ok(Val::Unit)
}
Self::FuncDef(_) => todo!(),
Self::Expr(expr) => expr.eval(env),
}
}
}
Let’s add a test, too:
#[cfg(test)]
mod tests {
use super::*;
use crate::expr::{BindingUsage, Number, Op};
// snip
#[test]
fn parse_func_def() {
assert_eq!(
Stmt::new("fn identity x => x"),
Ok((
"",
Stmt::FuncDef(FuncDef {
name: "identity".to_string(),
params: vec!["x".to_string()],
body: Stmt::Expr(Expr::BindingUsage(BindingUsage {
name: "x".to_string(),
})),
}),
)),
);
}
// snip
}
BindingUsage
isn’t re-exported from crate::expr
; let’s fix that. We’ll also re-export Block
for consistency:
// expr.rs
mod binding_usage;
mod block;
pub(crate) use binding_usage::BindingUsage;
pub(crate) use block::Block;
use crate::env::Env;
use crate::utils;
use crate::val::Val;
The field name
on BindingUsage
is pub(super)
, when it should be pub(crate)
so we can use it for our tests:
// binding_usage.rs
#[derive(Debug, PartialEq)]
pub(crate) struct BindingUsage {
pub(crate) name: String,
}
We now need to create crate::func_def::FuncDef
. Let’s start by declaring and creating the module:
// lib.rs
mod binding_def;
mod env;
mod expr;
mod func_def; // new!
mod stmt;
mod utils;
mod val;
Open src/func_def.rs
and declare FuncDef
:
use crate::stmt::Stmt;
#[derive(Debug, PartialEq)]
pub(crate) struct FuncDef {
pub(crate) name: String,
pub(crate) params: Vec<String>,
pub(crate) body: Stmt,
}
Let’s try compiling Eldiro:
$ cargo c
error[E0072]: recursive type `func_def::FuncDef` has infinite size
--> crates/eldiro/src/func_def.rs:4:1
|
4 | pub(crate) struct FuncDef {
| ^^^^^^^^^^^^^^^^^^^^^^^^^ recursive type has infinite size
...
7 | pub(crate) body: Stmt,
| ---- recursive without indirection
|
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `func_def::FuncDef` representable
|
7 | pub(crate) body: Box<Stmt>,
| ^^^^ ^
error[E0072]: recursive type `stmt::Stmt` has infinite size
--> crates/eldiro/src/stmt.rs:8:1
|
8 | pub(crate) enum Stmt {
| ^^^^^^^^^^^^^^^^^^^^ recursive type has infinite size
9 | BindingDef(BindingDef),
10 | FuncDef(FuncDef),
| ------- recursive without indirection
|
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `stmt::Stmt` representable
|
10 | FuncDef(Box<FuncDef>),
| ^^^^ ^
error[E0391]: cycle detected when computing drop-check constraints for `stmt::Stmt`
--> crates/eldiro/src/stmt.rs:8:1
|
8 | pub(crate) enum Stmt {
| ^^^^^^^^^^^^^^^^^^^^
|
note: ...which requires computing drop-check constraints for `func_def::FuncDef`...
--> crates/eldiro/src/func_def.rs:4:1
|
4 | pub(crate) struct FuncDef {
| ^^^^^^^^^^^^^^^^^^^^^^^^^
= note: ...which again requires computing drop-check constraints for `stmt::Stmt`, completing the cycle
note: cycle used when computing drop-check constraints for `Parse`
--> crates/eldiro/src/lib.rs:13:1
|
13 | pub struct Parse(stmt::Stmt);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error: aborting due to 3 previous errors
Whoa! We have a bunch of errors because we’ve accidentally created a recursive type – a type that contains itself. Rust doesn’t like this because, when it’s calculating the size of Stmt
, it falls into the following loop:
- To calculate
Stmt
’s size, I need to find the size ofStmt
’s largest variant. - To calculate
Stmt::BindingDef
’s size, I need to find the size of all its fields. - Calculates the size of all of
BindingDef
’s fields. - To calculate
Stmt::FuncDef
’s size, I need to find the size of all its fields. - To calculate the size of the
body
field ofFuncDef
, I need to find the size ofStmt
. - Go to 1.
In other words – we have a recursive type because Stmt
could contain a FuncDef
depending on which variant it is, and FuncDef
in turn contains a Stmt
through the body
field. The Rust compiler very kindly suggests how to remove the recursion; namely, add indirection somewhere. I’ll opt for Box
ing the body
of FuncDef
:
use crate::stmt::Stmt;
#[derive(Debug, PartialEq)]
pub(crate) struct FuncDef {
pub(crate) name: String,
pub(crate) params: Vec<String>,
pub(crate) body: Box<Stmt>,
}
A Box<Stmt>
is a pointer to some memory on the heap that contains a Stmt
. Rust’s process for calculating the size of Stmt
now goes like this:
- To calculate
Stmt
’s size, I need to find the size ofStmt
’s largest variant. - To calculate
Stmt::BindingDef
’s size, I need to find the size of all its fields. - Calculates the size of all of
BindingDef
’s fields. - To calculate
Stmt::FuncDef
’s size, I need to find the size of all its fields. - To calculate the size of the
name
field ofFuncDef
, I need to find the size ofString
(it’s 24 bytes). - To calculate the size of the
params
field ofFuncDef
, I need to find the size ofVec
(it’s 24 bytes). - To calculate the size of the
body
field ofFuncDef
, I need to find the size ofBox
(depends on your architecture, e.g. on a 64-bit system it’s 8 bytes). - To calculate
Stmt::Expr
’s size, I need to find the size ofExpr
. - …
No more recursion, so Rust doesn’t get stuck when trying to calculate Stmt
’s size.
We now need to update that test we wrote earlier:
// stmt.rs
#[cfg(test)]
mod tests {
// snip
#[test]
fn parse_func_def() {
assert_eq!(
Stmt::new("fn identity x => x"),
Ok((
"",
Stmt::FuncDef(FuncDef {
name: "identity".to_string(),
params: vec!["x".to_string()],
body: Box::new(Stmt::Expr(Expr::BindingUsage(BindingUsage {
name: "x".to_string(),
}))),
}),
)),
);
}
// snip
}
Let’s write another test for parsing a function definition, but this time in func_def.rs
:
#[cfg(test)]
mod tests {
use super::*;
use crate::expr::{Block, Expr};
#[test]
fn parse_func_def_with_no_params_and_empty_body() {
assert_eq!(
FuncDef::new("fn nothing => {}"),
Ok((
"",
FuncDef {
name: "nothing".to_string(),
params: Vec::new(),
body: Box::new(Stmt::Expr(Expr::Block(Block { stmts: Vec::new() }))),
},
)),
);
}
}
We need to make the stmts
field of Block
pub(crate)
:
// block.rs
#[derive(Debug, PartialEq)]
pub(crate) struct Block {
pub(crate) stmts: Vec<Stmt>,
}
Let’s implement FuncDef::new
:
// func_def.rs
impl FuncDef {
pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
let s = utils::tag("fn", s)?;
let (s, _) = utils::extract_whitespace1(s)?;
let (s, name) = utils::extract_ident(s)?;
let (s, _) = utils::extract_whitespace(s);
let s = utils::tag("=>", s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, body) = Stmt::new(s)?;
Ok((
s,
Self {
name: name.to_string(),
params: Vec::new(),
body: Box::new(body),
},
))
}
}
We should run the tests to see how we went:
$ cargo t -q
running 48 tests
..........................................F.....
failures:
---- stmt::tests::parse_func_def stdout ----
thread 'stmt::tests::parse_func_def' panicked at 'assertion failed: `(left == right)`
left: `Ok((" identity x => x", Expr(BindingUsage(BindingUsage { name: "fn" }))))`,
right: `Ok(("", FuncDef(FuncDef { name: "identity", params: ["x"], body: Expr(BindingUsage(BindingUsage { name: "x" })) })))`', crates/eldiro/src/stmt.rs:54:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
stmt::tests::parse_func_def
test result: FAILED. 47 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
Looks like the parse_func_def_with_no_params_and_empty_body
test passed, but the test we wrote in stmt.rs
is failing. To make the test pass, we need to handle parameters:
impl FuncDef {
pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
let s = utils::tag("fn", s)?;
let (s, _) = utils::extract_whitespace1(s)?;
let (s, name) = utils::extract_ident(s)?;
let (s, _) = utils::extract_whitespace(s);
let mut s = s;
let mut params = Vec::new();
while let Ok((new_s, param)) = utils::extract_ident(s) {
s = new_s;
params.push(param.to_string());
let (new_s, _) = utils::extract_whitespace(s);
s = new_s;
}
let s = utils::tag("=>", s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, body) = Stmt::new(s)?;
Ok((
s,
Self {
name: name.to_string(),
params,
body: Box::new(body),
},
))
}
}
We also use this approach of ‘keep extracting something (in this case identifiers) and stuff them into a Vec
until you can’t continue’ in Block::new
. Let’s see if it works:
$ cargo t -q
running 48 tests
..........................................F.....
failures:
---- stmt::tests::parse_func_def stdout ----
thread 'stmt::tests::parse_func_def' panicked at 'assertion failed: `(left == right)`
left: `Ok((" identity x => x", Expr(BindingUsage(BindingUsage { name: "fn" }))))`,
right: `Ok(("", FuncDef(FuncDef { name: "identity", params: ["x"], body: Expr(BindingUsage(BindingUsage { name: "x" })) })))`', crates/eldiro/src/stmt.rs:55:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
stmt::tests::parse_func_def
test result: FAILED. 47 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
Hmm, I thought we’d fixed it. thinking emoji …
Ah, I figured it out! The test that’s failing is using Stmt::new
, which we haven’t updated to try parsing a function definition. Here’s the current implementation:
// stmt.rs
impl Stmt {
pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
BindingDef::new(s)
.map(|(s, binding_def)| (s, Self::BindingDef(binding_def)))
.or_else(|_| Expr::new(s).map(|(s, expr)| (s, Self::Expr(expr))))
}
// snip
}
We need to add just one line to make the test pass:
impl Stmt {
pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
BindingDef::new(s)
.map(|(s, binding_def)| (s, Self::BindingDef(binding_def)))
.or_else(|_| FuncDef::new(s).map(|(s, func_def)| (s, Self::FuncDef(func_def))))
.or_else(|_| Expr::new(s).map(|(s, expr)| (s, Self::Expr(expr))))
}
// snip
}
Remember that technique we used in both FuncDef::new
and Block::new
to parse multiple things? Let’s extract that logic into a helper function in crate::utils
to remove the duplication and make the code easier to read:
pub(crate) fn sequence<T>(
parser: impl Fn(&str) -> Result<(&str, T), String>,
mut s: &str,
) -> Result<(&str, Vec<T>), String> {
let mut items = Vec::new();
while let Ok((new_s, item)) = parser(s) {
s = new_s;
items.push(item);
let (new_s, _) = extract_whitespace(s);
s = new_s;
}
Ok((s, items))
}
The function is generic over the type of the thing being parsed. It isn’t generic over the type of error the parser returns, though, because our project only has one type of parser error: String
. Take a look at this article to see more of the rationale behind this decision.
We can make use of our shiny new sequence
function in Block::new
:
// block.rs
impl Block {
pub(super) fn new(s: &str) -> Result<(&str, Self), String> {
let s = utils::tag("{", s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, stmts) = utils::sequence(Stmt::new, s)?;
let (s, _) = utils::extract_whitespace(s);
let s = utils::tag("}", s)?;
Ok((s, Block { stmts }))
}
// snip
}
and FuncDef::new
:
// func_def.rs
impl FuncDef {
pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
let s = utils::tag("fn", s)?;
let (s, _) = utils::extract_whitespace1(s)?;
let (s, name) = utils::extract_ident(s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, params) = utils::sequence(utils::extract_ident, s)?;
let s = utils::tag("=>", s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, body) = Stmt::new(s)?;
Ok((
s,
Self {
name: name.to_string(),
params,
body: Box::new(body),
},
))
}
}
Wait! That doesn’t compile, because params
is now of type Vec<&str>
instead of Vec<String>
. params
is a Vec<&str>
because the type that utils::extract_ident
returns (ignoring the usual leftover input &str
) is a &str
, which is picked up and propagated by the <T>
in utils::sequence
. We can fix this by passing utils::sequence
a wrapper closure around utils::extract_ident
that turns the identifier it extracts into a String
:
impl FuncDef {
pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
let s = utils::tag("fn", s)?;
let (s, _) = utils::extract_whitespace1(s)?;
let (s, name) = utils::extract_ident(s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, params) = utils::sequence(
|s| utils::extract_ident(s).map(|(s, ident)| (s, ident.to_string())),
s,
)?;
let s = utils::tag("=>", s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, body) = Stmt::new(s)?;
Ok((
s,
Self {
name: name.to_string(),
params,
body: Box::new(body),
},
))
}
}
And now this works perfectly:
$ cargo t -q
running 48 tests
................................................
test result: ok. 48 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Let’s write another test:
#[cfg(test)]
mod tests {
// snip
#[test]
fn parse_func_def_with_one_param_and_empty_body() {
assert_eq!(
FuncDef::new("fn greet name => {}"),
Ok((
"",
FuncDef {
name: "greet".to_string(),
params: vec!["name".to_string()],
body: Box::new(Stmt::Expr(Expr::Block(Block { stmts: Vec::new() }))),
},
)),
);
}
}
It passes:
$ cargo t -q
running 49 tests
.................................................
test result: ok. 49 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Let’s write a test with multiple parameters and a non-empty body:
#[cfg(test)]
mod tests {
use super::*;
use crate::expr::{BindingUsage, Block, Expr, Op};
// snip
#[test]
fn parse_func_def_with_multiple_params() {
assert_eq!(
FuncDef::new("fn add x y => x + y"),
Ok((
"",
FuncDef {
name: "add".to_string(),
params: vec!["x".to_string(), "y".to_string()],
body: Box::new(Stmt::Expr(Expr::Operation {
lhs: Expr::BindingUsage(BindingUsage {
name: "x".to_string()
}),
rhs: Expr::BindingUsage(BindingUsage {
name: "y".to_string()
}),
op: Op::Add
}))
}
))
);
}
}
Aaaand it doesn’t compile. It seems that the fields lhs
and rhs
of Expr::Operation
can only be crate::expr::Number
s. Although this does adhere to our initial design decision to disallow nested expressions, it prevents us from doing things as basic as defining an add
function! We need to fix this.
Allowing operations on expressions
It might be helpful to run git stash
– if you’re using Git – to temporarily hide the modifications we’ve made to existing files in our quest to add support for function definitions. Note that this will leave func_def.rs
intact.
Let’s remind ourselves of Expr
’s definition:
#[derive(Debug, PartialEq)]
pub(crate) enum Expr {
Number(Number),
Operation { lhs: Number, rhs: Number, op: Op },
BindingUsage(BindingUsage),
Block(Block),
}
Your first instinct might be to change that to
#[derive(Debug, PartialEq)]
pub(crate) enum Expr {
Number(Number),
Operation { lhs: Self, rhs: Self, op: Op },
BindingUsage(BindingUsage),
Block(Block),
}
Remember, though, that that would lead to the recursive type error we got earlier. We need to add Box
es:
#[derive(Debug, PartialEq)]
pub(crate) enum Expr {
Number(Number),
Operation {
lhs: Box<Self>,
rhs: Box<Self>,
op: Op,
},
BindingUsage(BindingUsage),
Block(Block),
}
Let’s start by fixing all the tests:
// binding_def.rs
#[cfg(test)]
mod tests {
// snip
#[test]
fn parse_binding_def() {
assert_eq!(
BindingDef::new("let a = 10 / 2"),
Ok((
"",
BindingDef {
name: "a".to_string(),
val: Expr::Operation {
lhs: Box::new(Expr::Number(Number(10))),
rhs: Box::new(Expr::Number(Number(2))),
op: Op::Div,
},
},
)),
);
}
// snip
}
// block.rs
#[cfg(test)]
mod tests {
// snip
#[test]
fn eval_block_with_multiple_exprs() {
assert_eq!(
Block {
stmts: vec![
Stmt::Expr(Expr::Number(Number(100))),
Stmt::Expr(Expr::Number(Number(30))),
Stmt::Expr(Expr::Operation {
lhs: Box::new(Expr::Number(Number(10))),
rhs: Box::new(Expr::Number(Number(7))),
op: Op::Sub,
}),
],
}
.eval(&Env::default()),
Ok(Val::Number(3)),
);
}
// snip
}
// expr.rs
#[cfg(test)]
mod tests {
// snip
#[test]
fn parse_one_plus_two() {
assert_eq!(
Expr::new("1+2"),
Ok((
"",
Expr::Operation {
lhs: Box::new(Expr::Number(Number(1))),
rhs: Box::new(Expr::Number(Number(2))),
op: Op::Add,
},
)),
);
}
#[test]
fn parse_expr_with_whitespace() {
assert_eq!(
Expr::new("2 * 2"),
Ok((
"",
Expr::Operation {
lhs: Box::new(Expr::Number(Number(2))),
rhs: Box::new(Expr::Number(Number(2))),
op: Op::Mul,
},
)),
);
}
// snip
#[test]
fn eval_add() {
assert_eq!(
Expr::Operation {
lhs: Box::new(Expr::Number(Number(10))),
rhs: Box::new(Expr::Number(Number(10))),
op: Op::Add,
}
.eval(&Env::default()),
Ok(Val::Number(20)),
);
}
#[test]
fn eval_sub() {
assert_eq!(
Expr::Operation {
lhs: Box::new(Expr::Number(Number(1))),
rhs: Box::new(Expr::Number(Number(5))),
op: Op::Sub,
}
.eval(&Env::default()),
Ok(Val::Number(-4)),
);
}
#[test]
fn eval_mul() {
assert_eq!(
Expr::Operation {
lhs: Box::new(Expr::Number(Number(5))),
rhs: Box::new(Expr::Number(Number(6))),
op: Op::Mul,
}
.eval(&Env::default()),
Ok(Val::Number(30)),
);
}
#[test]
fn eval_div() {
assert_eq!(
Expr::Operation {
lhs: Box::new(Expr::Number(Number(200))),
rhs: Box::new(Expr::Number(Number(20))),
op: Op::Div,
}
.eval(&Env::default()),
Ok(Val::Number(10)),
);
}
// snip
}
// func_def.rs
#[cfg(test)]
mod tests {
// snip
#[test]
#[test]
fn parse_func_def_with_multiple_params() {
assert_eq!(
FuncDef::new("fn add x y => x + y"),
Ok((
"",
FuncDef {
name: "add".to_string(),
params: vec!["x".to_string(), "y".to_string()],
body: Box::new(Stmt::Expr(Expr::Operation {
lhs: Box::new(Expr::BindingUsage(BindingUsage {
name: "x".to_string(),
})),
rhs: Box::new(Expr::BindingUsage(BindingUsage {
name: "y".to_string(),
})),
op: Op::Add,
})),
},
)),
);
}
}
// stmt.rs
#[cfg(test)]
mod tests {
// snip
#[test]
fn parse_expr() {
assert_eq!(
Stmt::new("1+1"),
Ok((
"",
Stmt::Expr(Expr::Operation {
lhs: Box::new(Expr::Number(Number(1))),
rhs: Box::new(Expr::Number(Number(1))),
op: Op::Add,
}),
)),
);
}
// snip
}
Now that we’ve updated all those tests, it’s time to tackle the four outstanding compilation errors. Two of these are from parsing an Expr::Operation
, and the other two are from evaluating it.
Let’s fix the parsing first. Here’s how an Expr::Operation
is currently parsed:
// expr.rs
impl Expr {
// snip
fn new_operation(s: &str) -> Result<(&str, Self), String> {
let (s, lhs) = Number::new(s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, op) = Op::new(s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, rhs) = Number::new(s)?;
Ok((s, Self::Operation { lhs, rhs, op }))
}
// snip
}
As you can see, lhs
and rhs
are of type Number
, not of type Box<Expr>
. We can add the Box
part:
impl Expr {
// snip
fn new_operation(s: &str) -> Result<(&str, Self), String> {
let (s, lhs) = Number::new(s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, op) = Op::new(s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, rhs) = Number::new(s)?;
Ok((
s,
Self::Operation {
lhs: Box::new(lhs),
rhs: Box::new(rhs),
op,
},
))
}
// snip
}
Let’s try making lhs
and rhs
Expr
s by calling Expr::new
:
impl Expr {
// snip
fn new_operation(s: &str) -> Result<(&str, Self), String> {
let (s, lhs) = Self::new(s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, op) = Op::new(s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, rhs) = Self::new(s)?;
Ok((
s,
Self::Operation {
lhs: Box::new(lhs),
rhs: Box::new(rhs),
op,
},
))
}
// snip
}
This looks reasonable enough. Let’s run the tests:
$ cargo t -q
error[E0308]: mismatched types
--> crates/eldiro/src/expr.rs:89:21
|
89 | let Number(lhs) = lhs;
| ^^^^^^^^^^^ --- this expression has type `&std::boxed::Box<expr::Expr>`
| |
| expected struct `std::boxed::Box`, found struct `expr::Number`
|
= note: expected struct `std::boxed::Box<expr::Expr>`
found struct `expr::Number`
error[E0308]: mismatched types
--> crates/eldiro/src/expr.rs:90:21
|
90 | let Number(rhs) = rhs;
| ^^^^^^^^^^^ --- this expression has type `&std::boxed::Box<expr::Expr>`
| |
| expected struct `std::boxed::Box`, found struct `expr::Number`
|
= note: expected struct `std::boxed::Box<expr::Expr>`
found struct `expr::Number`
error: aborting due to 2 previous errors
Oh, yeah, we still have that compilation error. Let’s add a todo!()
and comment out the rest so we can get on with running our tests:
impl Expr {
// snip
pub(crate) fn eval(&self, env: &Env) -> Result<Val, String> {
match self {
Self::Number(Number(n)) => Ok(Val::Number(*n)),
Self::Operation { lhs, rhs, op } => {
todo!();
// let Number(lhs) = lhs;
// let Number(rhs) = rhs;
// let result = match op {
// Op::Add => lhs + rhs,
// Op::Sub => lhs - rhs,
// Op::Mul => lhs * rhs,
// Op::Div => lhs / rhs,
// };
// Ok(Val::Number(result))
}
Self::BindingUsage(binding_usage) => binding_usage.eval(env),
Self::Block(block) => block.eval(env),
}
}
}
$ cargo t -q
running 46 tests
.........F.F.FF.F.
thread 'binding_def::tests::parse_binding_def' has overflowed its stack
fatal runtime error: stack overflow
Oh, no. Oh, no no no. What’s happened here is that Expr::new_operation
has called Expr::new
, which calls Expr::new_operation
, and so forth. The solution to this is to create an intermediary function that tries parsing all the possibilities apart from Expr::Operation
. We use this function to parse lhs
and rhs
, and use it and Expr::new_operation
in Expr::new
. It’s difficult to explain with words; take look at the code:
impl Expr {
pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
Self::new_operation(s).or_else(|_| Self::new_non_operation(s))
}
fn new_non_operation(s: &str) -> Result<(&str, Self), String> {
Self::new_number(s)
.or_else(|_| {
BindingUsage::new(s)
.map(|(s, binding_usage)| (s, Self::BindingUsage(binding_usage)))
})
.or_else(|_| Block::new(s).map(|(s, block)| (s, Self::Block(block))))
}
// snip
fn new_operation(s: &str) -> Result<(&str, Self), String> {
let (s, lhs) = Self::new_non_operation(s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, op) = Op::new(s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, rhs) = Self::new_non_operation(s)?;
Ok((
s,
Self::Operation {
lhs: Box::new(lhs),
rhs: Box::new(rhs),
op,
},
))
}
// snip
}
$ cargo t -q
running 46 tests
.............F.F.F.F..F.......................
failures:
---- expr::block::tests::eval_block_with_multiple_exprs stdout ----
thread 'expr::block::tests::eval_block_with_multiple_exprs' panicked at 'not yet implemented', crates/eldiro/src/expr.rs:92:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---- expr::tests::eval_add stdout ----
thread 'expr::tests::eval_add' panicked at 'not yet implemented', crates/eldiro/src/expr.rs:92:17
---- expr::tests::eval_div stdout ----
thread 'expr::tests::eval_div' panicked at 'not yet implemented', crates/eldiro/src/expr.rs:92:17
---- expr::tests::eval_mul stdout ----
thread 'expr::tests::eval_mul' panicked at 'not yet implemented', crates/eldiro/src/expr.rs:92:17
---- expr::tests::eval_sub stdout ----
thread 'expr::tests::eval_sub' panicked at 'not yet implemented', crates/eldiro/src/expr.rs:92:17
failures:
expr::block::tests::eval_block_with_multiple_exprs
expr::tests::eval_add
expr::tests::eval_div
expr::tests::eval_mul
expr::tests::eval_sub
test result: FAILED. 41 passed; 5 failed; 0 ignored; 0 measured; 0 filtered out
Ah, those failures are from the todo!()
in Expr::eval
. Regardless, all the parsers are passing, which means that we’ve managed to fix both of our issues at once: we no longer have a stack overflow, and we have also prevented nested operations (since lhs
and rhs
call Expr::new_non_operation
, there’s no chance that they could be Expr::Operation
s themselves).
Here’s the current state of Expr::eval
:
impl Expr {
// snip
pub(crate) fn eval(&self, env: &Env) -> Result<Val, String> {
match self {
Self::Number(Number(n)) => Ok(Val::Number(*n)),
Self::Operation { lhs, rhs, op } => {
todo!();
// let Number(lhs) = lhs;
// let Number(rhs) = rhs;
// let result = match op {
// Op::Add => lhs + rhs,
// Op::Sub => lhs - rhs,
// Op::Mul => lhs * rhs,
// Op::Div => lhs / rhs,
// };
// Ok(Val::Number(result))
}
Self::BindingUsage(binding_usage) => binding_usage.eval(env),
Self::Block(block) => block.eval(env),
}
}
}
To evaluate an Expr::Operation
, we must first evaluate the lhs
and the rhs
:
impl Expr {
// snip
pub(crate) fn eval(&self, env: &Env) -> Result<Val, String> {
match self {
Self::Number(Number(n)) => Ok(Val::Number(*n)),
Self::Operation { lhs, rhs, op } => {
let lhs = lhs.eval(env)?;
let rhs = rhs.eval(env)?;
let result = match op {
Op::Add => lhs + rhs,
Op::Sub => lhs - rhs,
Op::Mul => lhs * rhs,
Op::Div => lhs / rhs,
};
Ok(Val::Number(result))
}
Self::BindingUsage(binding_usage) => binding_usage.eval(env),
Self::Block(block) => block.eval(env),
}
}
}
We’ve got errors now because you can’t add, subtract, multiply or divide Val
s. What we want is for lhs
and rhs
to be integers. Before, when we knew that lhs
and rhs
were Numbers
, we could simply use pattern matching to extract the contained integer. Now, though, they could be any type that Val
can contain. Although at the moment this means that lhs
and rhs
could both either be Number
s or Unit
s, we’ll add strings and other data types in the future. To solve this issue, we need to check that lhs
and rhs
are indeed Val::Number
s, and if they aren’t return an error.
impl Expr {
// snip
pub(crate) fn eval(&self, env: &Env) -> Result<Val, String> {
match self {
Self::Number(Number(n)) => Ok(Val::Number(*n)),
Self::Operation { lhs, rhs, op } => {
let lhs = lhs.eval(env)?;
let rhs = rhs.eval(env)?;
let (lhs, rhs) = match (lhs,rhs) {
(Val::Number(lhs), Val::Number(rhs)) => (lhs, rhs),
_ => return Err("cannot evaluate operation whose left-hand side and right-hand side are not both numbers".to_string()),
};
let result = match op {
Op::Add => lhs + rhs,
Op::Sub => lhs - rhs,
Op::Mul => lhs * rhs,
Op::Div => lhs / rhs,
};
Ok(Val::Number(result))
}
Self::BindingUsage(binding_usage) => binding_usage.eval(env),
Self::Block(block) => block.eval(env),
}
}
}
Let’s write a test to verify that the error message kicks in at the right time:
#[cfg(test)]
mod tests {
// snip
#[test]
fn eval_non_number_operation() {
assert_eq!(
Expr::Operation {
lhs: Box::new(Expr::Number(Number(10))),
rhs: Box::new(Expr::Block(Block { stmts: Vec::new() })),
op: Op::Add,
}
.eval(&Env::default()),
Err("cannot evaluate operation whose left-hand side and right-hand side are not both numbers".to_string()),
);
}
// snip
}
$ cargo t -q
running 47 tests
...............................................
test result: ok. 47 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Run git stash pop
to get back the changes we made for function definitions. Let’s run the tests again to see if the new feature of Expr::Operation
is being used successfully:
$ cargo t -q
running 51 tests
...................................................
test result: ok. 51 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Beautiful! With that diversion complete, we can get back to testing the parsing of function definitions.
Back to parsing
So, what tests do we still need for the parsing? Well, we have one with no parameters and an empty body. We also have one with one parameter and an empty body, as well as one with multiple parameters and a non-empty body. Although we haven’t tested every possible combination, we’ve tested enough scenarios for me to be confident that the parser is working correctly.
Evaluation
Here’s the definition of Env
:
#[derive(Debug, PartialEq, Default)]
pub struct Env<'parent> {
bindings: HashMap<String, Val>,
parent: Option<&'parent Self>,
}
One way to store function definitions in Env
is to add another field, funcs
, and have its type be
HashMap<
String,
/* everything inside of FuncDef apart from name */,
>
Alternatively, you could have only one field for both bindings and functions. The type of this field is HashMap<String, V>
, where V is an enum that holds either a binding’s value, or information about a function. This way, you can’t ever have both a function and a binding with the same name – whichever one is declared last takes priority. This method also prevents duplication of the ‘recursively search parents’ logic we wrote originally for bindings.2
A nice place to start is removing the bindings
field and replacing it with a field called … well, it can contain bindings and functions … what do they both have in common? … names?
#[derive(Debug, PartialEq, Default)]
pub struct Env<'parent> {
named: HashMap<String, NamedInfo>,
parent: Option<&'parent Self>,
}
Let’s declare NamedInfo
:
// still in env.rs
#[derive(Debug, PartialEq)]
enum NamedInfo {
Binding(Val),
Func { params: Vec<String>, body: Stmt },
}
We need to update all of Env
’s methods – at the same time, we can also add methods for functions:
impl<'parent> Env<'parent> {
pub(crate) fn create_child(&'parent self) -> Self {
Self {
named: HashMap::new(),
parent: Some(self),
}
}
pub(crate) fn store_binding(&mut self, name: String, val: Val) {
self.named.insert(name, NamedInfo::Binding(val));
}
pub(crate) fn store_func(&mut self, name: String, params: Vec<String>, body: Stmt) {
self.named.insert(name, NamedInfo::Func { params, body });
}
pub(crate) fn get_binding(&self, name: &str) -> Result<Val, String> {
self.get_named_info(name)
.and_then(|named_info| match named_info {
NamedInfo::Binding(val) => Some(val),
_ => None,
})
.ok_or_else(|| format!("binding with name ‘{}’ does not exist", name))
}
pub(crate) fn get_func(&self, name: &str) -> Result<(Vec<String>, Stmt), String> {
self.get_named_info(name)
.and_then(|named_info| match named_info {
NamedInfo::Func { params, body } => Some((params, body)),
_ => None,
})
.ok_or_else(|| format!("function with name ‘{}’ does not exist", name))
}
fn get_named_info(&self, name: &str) -> Option<NamedInfo> {
self.named
.get(name)
.cloned()
.or_else(|| self.parent.and_then(|parent| parent.get_named_info(name)))
}
}
That code is a touch repetitive. Let‘s add some methods to NamedInfo
to make converting it into its variants easier:
impl NamedInfo {
fn into_binding(self) -> Option<Val> {
if let Self::Binding(val) = self {
Some(val)
} else {
None
}
}
fn into_func(self) -> Option<(Vec<String>, Stmt)> {
if let Self::Func { params, body } = self {
Some((params, body))
} else {
None
}
}
}
Let’s make use of that in Env::get_binding
and Env::get_func
:
impl<'parent> Env<'parent> {
// snip
pub(crate) fn get_binding(&self, name: &str) -> Result<Val, String> {
self.get_named_info(name)
.and_then(NamedInfo::into_binding)
.ok_or_else(|| format!("binding with name ‘{}’ does not exist", name))
}
pub(crate) fn get_func(&self, name: &str) -> Result<(Vec<String>, Stmt), String> {
self.get_named_info(name)
.and_then(NamedInfo::into_func)
.ok_or_else(|| format!("function with name ‘{}’ does not exist", name))
}
// snip
}
The first of the two compilation errors we have left is from us not being allowed to call .cloned()
in Env::get_named_info
, since NamedInfo
doesn’t implement Clone
. Let’s fix that:
#[derive(Debug, Clone, PartialEq)]
enum NamedInfo {
Binding(Val),
Func { params: Vec<String>, body: Stmt },
}
This doesn’t compile, since Stmt
doesn’t implement Clone
. I’ll leave you to go through all the ‘blah doesn’t implement Clone
’ errors, since adding #[derive(Clone)]
to almost all the types we’ve written so far is too laborious to write out here. As I did in the last article, here’s a link to the diff of the commit that adds this on GitHub in case you get lost.
We only have one error left; ‘no method named get_binding_value
’. Let’s fix it:
// binding_usage.rs
impl BindingUsage {
// snip
pub(super) fn eval(&self, env: &Env) -> Result<Val, String> {
env.get_binding(&self.name)
}
}
Now that all those shenanigans with Env
are done, we can finally implement evaluation of function definitions. Let‘s start with a test in stmt.rs
:
#[cfg(test)]
mod tests {
// snip
#[test]
fn eval_func_def() {
assert_eq!(
Stmt::FuncDef(FuncDef {
name: "always_return_one".to_string(),
params: Vec::new(),
body: Box::new(Stmt::Expr(Expr::Number(Number(1)))),
})
.eval(&mut Env::default()),
Ok(Val::Unit),
);
}
// snip
}
To stay consistent with crate::stmt::tests::eval_binding_def
, the test doesn’t examine the effect it’s had on the Env
– instead, it just checks that evaluating a function definition gives you a Unit
.
$ cargo t -q
running 52 tests
.............................................F......
failures:
---- stmt::tests::eval_func_def stdout ----
thread 'stmt::tests::eval_func_def' panicked at 'not yet implemented', crates/eldiro/src/stmt.rs:28:33
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
stmt::tests::eval_func_def
test result: FAILED. 51 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
Let’s remove the todo!()
from Stmt::eval
, and fix this:
impl Stmt {
// snip
pub(crate) fn eval(&self, env: &mut Env) -> Result<Val, String> {
match self {
Self::BindingDef(binding_def) => {
binding_def.eval(env)?;
Ok(Val::Unit)
}
Self::FuncDef(func_def) => {
func_def.eval(env)?;
Ok(Val::Unit)
}
Self::Expr(expr) => expr.eval(env),
}
}
}
Let’s create an empty FuncDef::eval
so that we can run our tests:
// func_def.rs
use crate::env::Env;
use crate::stmt::Stmt;
use crate::utils;
// snip
impl FuncDef {
// snip
pub(crate) fn eval(&self, env: &mut Env) -> Result<(), String> {
Ok(())
}
}
$ cargo t -q
running 52 tests
....................................................
test result: ok. 52 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
We don’t need to write a test for evaluating function definitions, since the code to do so is trivial:3
impl FuncDef {
// snip
pub(crate) fn eval(&self, env: &mut Env) -> Result<(), String> {
env.store_func(self.name.clone(), self.params.clone(), *self.body.clone());
Ok(())
}
}
See you all next time, when we’ll add function calls to Eldiro.
The syntax highlighting has gone away because this isn’t Rust anymore, and my website doesn’t have Eldiro syntax highlighting. ↩︎
I spent almost an hour trying to refactor this logic out so that it can be reused in an attempt to use the first approach without copy-pasting. My first try at this used a macro, but the macro took up more code than copy-pasting the code. The same thing happened when I tried using a generic function instead. ↩︎
Famous last words. ↩︎