As we did last time, we need to choose a syntax, this time for function calls.
// Rust syntax
f(a, b, c);
no_params();
Again, as I said in the previous part, parentheses and commas are annoying. Hence, Eldiro will use a different syntax in the spirit of ML-family languages:
f a b c
no_params
Much cleaner! This doesn’t syntactically distinguish between calling a function with no parameters and using a binding, though – we don’t know during parsing whether no_params
is a binding usage, or a function call with no parameters. We can solve this by looking at all the functions and bindings in scope during interpretation of a binding usage – if there’s a function with the name of the binding usage, we evaluate that with no parameters. If there isn’t, then we fall back to our usual strategy.
Parsing
We need a new module:
// expr.rs
mod binding_usage;
mod block;
mod func_call;
Let’s write our first test:
// src/expr/func_call.rs
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn parse_func_call_with_no_params() {
assert_eq!(
FuncCall::new("greet_user"),
Ok((
"",
FuncCall {
callee: "greet_user".to_string(),
params: Vec::new(),
},
)),
);
}
}
We need to define FuncCall
:
use super::Expr;
#[derive(Debug, Clone, PartialEq)]
pub(crate) struct FuncCall {
pub(crate) callee: String,
pub(crate) params: Vec<Expr>,
}
We also have to define FuncCall::new
. Let’s do the simplest thing that could make the test pass, short of hardcoding the output:
use crate::utils;
// snip
impl FuncCall {
pub(super) fn new(s: &str) -> Result<(&str, Self), String> {
let (s, callee) = utils::extract_ident(s)?;
Ok((
s,
Self {
callee: callee.to_string(),
params: Vec::new(),
},
))
}
}
$ cargo t -q
warning: associated function is never used: `get_func`
--> crates/eldiro/src/env.rs:33:19
|
33 | pub(crate) fn get_func(&self, name: &str) -> Result<(Vec<String>, Stmt), String> {
| ^^^^^^^^
|
= note: `#[warn(dead_code)]` on by default
warning: associated function is never used: `into_func`
--> crates/eldiro/src/env.rs:62:8
|
62 | fn into_func(self) -> Option<(Vec<String>, Stmt)> {
| ^^^^^^^^^
warning: associated function is never used: `new`
--> crates/eldiro/src/expr/func_call.rs:11:19
|
11 | pub(super) fn new(s: &str) -> Result<(&str, Self), String> {
| ^^^
warning: associated function is never used: `get_func`
--> crates/eldiro/src/env.rs:33:19
|
33 | pub(crate) fn get_func(&self, name: &str) -> Result<(Vec<String>, Stmt), String> {
| ^^^^^^^^
|
= note: `#[warn(dead_code)]` on by default
warning: associated function is never used: `into_func`
--> crates/eldiro/src/env.rs:62:8
|
62 | fn into_func(self) -> Option<(Vec<String>, Stmt)> {
| ^^^^^^^^^
warning: 3 warnings emitted
warning: 2 warnings emitted
running 53 tests
....................F................................
failures:
---- expr::func_call::tests::parse_func_call_with_no_params stdout ----
thread 'expr::func_call::tests::parse_func_call_with_no_params' panicked at 'assertion failed: `(left == right)`
left: `Ok(("_user", FuncCall { callee: "greet", params: [] }))`,
right: `Ok(("", FuncCall { callee: "greet_user", params: [] }))`', crates/eldiro/src/expr/func_call.rs:30:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
expr::func_call::tests::parse_func_call_with_no_params
test result: FAILED. 52 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
Whoa, that’s a lot of warnings! That must be because we haven’t used FuncCall
anywhere. A function call is a kind of expression, so we should show that in the code:
// expr.rs
mod binding_usage;
mod block;
mod func_call;
pub(crate) use binding_usage::BindingUsage;
pub(crate) use block::Block;
pub(crate) use func_call::FuncCall;
// snip
#[derive(Debug, Clone, PartialEq)]
pub(crate) enum Expr {
Number(Number),
Operation {
lhs: Box<Self>,
rhs: Box<Self>,
op: Op,
},
FuncCall(FuncCall),
BindingUsage(BindingUsage),
Block(Block),
}
impl Expr {
// snip
fn new_non_operation(s: &str) -> Result<(&str, Self), String> {
Self::new_number(s)
.or_else(|_| FuncCall::new(s).map(|(s, func_call)| (s, Self::FuncCall(func_call))))
.or_else(|_| {
BindingUsage::new(s)
.map(|(s, binding_usage)| (s, Self::BindingUsage(binding_usage)))
})
.or_else(|_| Block::new(s).map(|(s, block)| (s, Self::Block(block))))
}
// snip
}
Let’s write a test to make sure that this works:
#[cfg(test)]
mod tests {
// snip
#[test]
fn parse_func_call() {
assert_eq!(
Expr::new("add 1 2"),
Ok((
"",
Expr::FuncCall(FuncCall {
callee: "add".to_string(),
params: vec![Expr::Number(Number(1)), Expr::Number(Number(2))],
}),
)),
);
}
// snip
}
Our project isn’t compiling, which means we can’t run our tests. We have to add a case to Expr::eval
for Expr::FuncCall
. For now, we can use the todo!()
macro so we can move on:
impl Expr {
// snip
pub(crate) fn eval(&self, env: &Env) -> Result<Val, String> {
match self {
Self::Number(Number(n)) => Ok(Val::Number(*n)),
Self::Operation { lhs, rhs, op } => {
// snip
}
Self::FuncCall(_) => todo!(), // here
Self::BindingUsage(binding_usage) => binding_usage.eval(env),
Self::Block(block) => block.eval(env),
}
}
}
$ cargo t -q
warning: associated function is never used: `get_func`
--> crates/eldiro/src/env.rs:33:19
|
33 | pub(crate) fn get_func(&self, name: &str) -> Result<(Vec<String>, Stmt), String> {
| ^^^^^^^^
|
= note: `#[warn(dead_code)]` on by default
warning: associated function is never used: `into_func`
--> crates/eldiro/src/env.rs:62:8
|
62 | fn into_func(self) -> Option<(Vec<String>, Stmt)> {
| ^^^^^^^^^
warning: associated function is never used: `get_func`
--> crates/eldiro/src/env.rs:33:19
|
33 | pub(crate) fn get_func(&self, name: &str) -> Result<(Vec<String>, Stmt), String> {
| ^^^^^^^^
|
= note: `#[warn(dead_code)]` on by default
warning: associated function is never used: `into_func`
--> crates/eldiro/src/env.rs:62:8
|
62 | fn into_func(self) -> Option<(Vec<String>, Stmt)> {
| ^^^^^^^^^
warning: 2 warnings emitted
warning: 2 warnings emitted
running 54 tests
......................FFF...F.....F.......F...........
failures:
---- expr::func_call::tests::parse_func_call_with_no_params stdout ----
thread 'expr::func_call::tests::parse_func_call_with_no_params' panicked at 'assertion failed: `(left == right)`
left: `Ok(("_user", FuncCall { callee: "greet", params: [] }))`,
right: `Ok(("", FuncCall { callee: "greet_user", params: [] }))`', crates/eldiro/src/expr/func_call.rs:30:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---- expr::block::tests::parse_block_with_multiple_stmts stdout ----
thread 'expr::block::tests::parse_block_with_multiple_stmts' panicked at 'assertion failed: `(left == right)`
left: `Ok(("", Block { stmts: [BindingDef(BindingDef { name: "a", val: Number(Number(10)) }), BindingDef(BindingDef { name: "b", val: FuncCall(FuncCall { callee: "a", params: [] }) }), Expr(FuncCall(FuncCall { callee: "b", params: [] }))] }))`,
right: `Ok(("", Block { stmts: [BindingDef(BindingDef { name: "a", val: Number(Number(10)) }), BindingDef(BindingDef { name: "b", val: BindingUsage(BindingUsage { name: "a" }) }), Expr(BindingUsage(BindingUsage { name: "b" }))] }))`', crates/eldiro/src/expr/block.rs:72:9
---- expr::tests::parse_binding_usage stdout ----
thread 'expr::tests::parse_binding_usage' panicked at 'assertion failed: `(left == right)`
left: `Ok(("", FuncCall(FuncCall { callee: "bar", params: [] })))`,
right: `Ok(("", BindingUsage(BindingUsage { name: "bar" })))`', crates/eldiro/src/expr.rs:201:9
---- expr::tests::parse_func_call stdout ----
thread 'expr::tests::parse_func_call' panicked at 'assertion failed: `(left == right)`
left: `Ok((" 1 2", FuncCall(FuncCall { callee: "add", params: [] })))`,
right: `Ok(("", FuncCall(FuncCall { callee: "add", params: [Number(Number(1)), Number(Number(2))] })))`', crates/eldiro/src/expr.rs:187:9
---- func_def::tests::parse_func_def_with_multiple_params stdout ----
thread 'func_def::tests::parse_func_def_with_multiple_params' panicked at 'assertion failed: `(left == right)`
left: `Ok(("", FuncDef { name: "add", params: ["x", "y"], body: Expr(Operation { lhs: FuncCall(FuncCall { callee: "x", params: [] }), rhs: FuncCall(FuncCall { callee: "y", params: [] }), op: Add }) }))`,
right: `Ok(("", FuncDef { name: "add", params: ["x", "y"], body: Expr(Operation { lhs: BindingUsage(BindingUsage { name: "x" }), rhs: BindingUsage(BindingUsage { name: "y" }), op: Add }) }))`', crates/eldiro/src/func_def.rs:83:9
---- stmt::tests::parse_func_def stdout ----
thread 'stmt::tests::parse_func_def' panicked at 'assertion failed: `(left == right)`
left: `Ok(("", FuncDef(FuncDef { name: "identity", params: ["x"], body: Expr(FuncCall(FuncCall { callee: "x", params: [] })) })))`,
right: `Ok(("", FuncDef(FuncDef { name: "identity", params: ["x"], body: Expr(BindingUsage(BindingUsage { name: "x" })) })))`', crates/eldiro/src/stmt.rs:58:9
failures:
expr::block::tests::parse_block_with_multiple_stmts
expr::func_call::tests::parse_func_call_with_no_params
expr::tests::parse_binding_usage
expr::tests::parse_func_call
func_def::tests::parse_func_def_with_multiple_params
stmt::tests::parse_func_def
test result: FAILED. 48 passed; 6 failed; 0 ignored; 0 measured; 0 filtered out
We have even more problems than before! What’s happened here is that all the binding usages in our tests are now being parsed as function calls with no parameters. Remember how I said earlier that a binding usage and a parameter-less function call are indistinguishable during parsing? That’s what we’re running into here.
We have two choices: abolish BindingUsage
in favour of special-casing the evaluation of function calls with one parameter, or we could alternatively only parse function calls with one or more parameters. The first results in less code but is more work to change, while the second takes more code but is an easy change. Since we’re making this project just to learn at the moment, I’ll go for the easy option: only accepting function calls with one or more parameters.
We need to update our test first:
// func_call.rs
#[cfg(test)]
mod tests {
use super::super::Number;
use super::*;
#[test]
fn parse_func_call_with_one_parameter() {
assert_eq!(
FuncCall::new("factorial 10"),
Ok((
"",
FuncCall {
callee: "factorial".to_string(),
params: vec![Expr::Number(Number(10))],
},
)),
);
}
}
We should adjust FuncCall::new
so that it extracts one parameter:
impl FuncCall {
pub(super) fn new(s: &str) -> Result<(&str, Self), String> {
let (s, callee) = utils::extract_ident(s)?;
let (s, _) = utils::extract_whitespace1(s)?;
let (s, param) = Expr::new(s)?;
Ok((
s,
Self {
callee: callee.to_string(),
params: vec![param],
},
))
}
}
$ cargo t -q
running 54 tests
.......................F.....F........................
failures:
---- expr::block::tests::parse_block_with_multiple_stmts stdout ----
thread 'expr::block::tests::parse_block_with_multiple_stmts' panicked at 'assertion failed: `(left == right)`
left: `Ok(("", Block { stmts: [BindingDef(BindingDef { name: "a", val: Number(Number(10)) }), BindingDef(BindingDef { name: "b", val: FuncCall(FuncCall { callee: "a", params: [BindingUsage(BindingUsage { name: "b" })] }) })] }))`,
right: `Ok(("", Block { stmts: [BindingDef(BindingDef { name: "a", val: Number(Number(10)) }), BindingDef(BindingDef { name: "b", val: BindingUsage(BindingUsage { name: "a" }) }), Expr(BindingUsage(BindingUsage { name: "b" }))] }))`', crates/eldiro/src/expr/block.rs:72:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---- expr::tests::parse_func_call stdout ----
thread 'expr::tests::parse_func_call' panicked at 'assertion failed: `(left == right)`
left: `Ok((" 2", FuncCall(FuncCall { callee: "add", params: [Number(Number(1))] })))`,
right: `Ok(("", FuncCall(FuncCall { callee: "add", params: [Number(Number(1)), Number(Number(2))] })))`', crates/eldiro/src/expr.rs:187:9
failures:
expr::block::tests::parse_block_with_multiple_stmts
expr::tests::parse_func_call
test result: FAILED. 52 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out
We’ll worry about the first failure later; the second one is from crate::expr::tests::parse_func_call
. This test fails because FuncCall::new
doesn’t accept function calls with more than one parameter. Let’s use utils::sequence
to remedy this:
impl FuncCall {
pub(super) fn new(s: &str) -> Result<(&str, Self), String> {
let (s, callee) = utils::extract_ident(s)?;
let (s, _) = utils::extract_whitespace1(s)?;
let (s, params) = utils::sequence(Expr::new, s)?;
Ok((
s,
Self {
callee: callee.to_string(),
params,
},
))
}
}
$ cargo t -q
running 54 tests
.......................F............F.................
failures:
---- expr::block::tests::parse_block_with_multiple_stmts stdout ----
thread 'expr::block::tests::parse_block_with_multiple_stmts' panicked at 'assertion failed: `(left == right)`
left: `Ok(("", Block { stmts: [BindingDef(BindingDef { name: "a", val: Number(Number(10)) }), BindingDef(BindingDef { name: "b", val: FuncCall(FuncCall { callee: "a", params: [FuncCall(FuncCall { callee: "b", params: [] })] }) })] }))`,
right: `Ok(("", Block { stmts: [BindingDef(BindingDef { name: "a", val: Number(Number(10)) }), BindingDef(BindingDef { name: "b", val: BindingUsage(BindingUsage { name: "a" }) }), Expr(BindingUsage(BindingUsage { name: "b" }))] }))`', crates/eldiro/src/expr/block.rs:72:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---- func_def::tests::parse_func_def_with_multiple_params stdout ----
thread 'func_def::tests::parse_func_def_with_multiple_params' panicked at 'assertion failed: `(left == right)`
left: `Ok(("", FuncDef { name: "add", params: ["x", "y"], body: Expr(Operation { lhs: FuncCall(FuncCall { callee: "x", params: [] }), rhs: BindingUsage(BindingUsage { name: "y" }), op: Add }) }))`,
right: `Ok(("", FuncDef { name: "add", params: ["x", "y"], body: Expr(Operation { lhs: BindingUsage(BindingUsage { name: "x" }), rhs: BindingUsage(BindingUsage { name: "y" }), op: Add }) }))`', crates/eldiro/src/func_def.rs:83:9
failures:
expr::block::tests::parse_block_with_multiple_stmts
func_def::tests::parse_func_def_with_multiple_params
test result: FAILED. 52 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out
That first test from before is still failing, and we did indeed fix crate::expr::tests::parse_func_call
. We managed to break function definition parsing, though. Here’s the test in question:
// func_def.rs
#[cfg(test)]
mod tests {
// snip
#[test]
fn parse_func_def_with_multiple_params() {
assert_eq!(
FuncDef::new("fn add x y => x + y"),
Ok((
"",
FuncDef {
name: "add".to_string(),
params: vec!["x".to_string(), "y".to_string()],
body: Box::new(Stmt::Expr(Expr::Operation {
lhs: Box::new(Expr::BindingUsage(BindingUsage {
name: "x".to_string(),
})),
rhs: Box::new(Expr::BindingUsage(BindingUsage {
name: "y".to_string(),
})),
op: Op::Add,
})),
},
)),
);
}
}
The x
in x + y
is being parsed as a function call with no parameters, rather than as a binding usage. Hmm, it must be because utils::sequence
accepts sequences of lengths zero. A nice way of solving this is to create a wrapper around sequence
that checks if the output is length zero, and, if it is, returns an error. First, though, we should make use of this hypothetical sequence1
in FuncCall::new
so we know immediately if our implementation is correct:
// func_call.rs
impl FuncCall {
pub(super) fn new(s: &str) -> Result<(&str, Self), String> {
let (s, callee) = utils::extract_ident(s)?;
let (s, _) = utils::extract_whitespace1(s)?;
let (s, params) = utils::sequence1(Expr::new, s)?;
Ok((
s,
Self {
callee: callee.to_string(),
params,
},
))
}
}
We can now define sequence1
:
// utils.rs
pub(crate) fn sequence1<T>(
parser: impl Fn(&str) -> Result<(&str, T), String>,
s: &str,
) -> Result<(&str, Vec<T>), String> {
let (s, sequence) = sequence(parser, s)?;
if sequence.is_empty() {
Err("expected a sequence with more than one item".to_string())
} else {
Ok((s, sequence))
}
}
$ cargo t -q
running 54 tests
........................F.............................
failures:
---- expr::block::tests::parse_block_with_multiple_stmts stdout ----
thread 'expr::block::tests::parse_block_with_multiple_stmts' panicked at 'assertion failed: `(left == right)`
left: `Ok(("", Block { stmts: [BindingDef(BindingDef { name: "a", val: Number(Number(10)) }), BindingDef(BindingDef { name: "b", val: FuncCall(FuncCall { callee: "a", params: [BindingUsage(BindingUsage { name: "b" })] }) })] }))`,
right: `Ok(("", Block { stmts: [BindingDef(BindingDef { name: "a", val: Number(Number(10)) }), BindingDef(BindingDef { name: "b", val: BindingUsage(BindingUsage { name: "a" }) }), Expr(BindingUsage(BindingUsage { name: "b" }))] }))`', crates/eldiro/src/expr/block.rs:72:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
expr::block::tests::parse_block_with_multiple_stmts
test result: FAILED. 53 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
Here’s parse_block_with_multiple_stmts
:
// block.rs
#[cfg(test)]
mod tests {
// snip
#[test]
fn parse_block_with_multiple_stmts() {
assert_eq!(
Block::new(
"{
let a = 10
let b = a
b
}",
),
Ok((
"",
Block {
stmts: vec![
Stmt::BindingDef(BindingDef {
name: "a".to_string(),
val: Expr::Number(Number(10)),
}),
Stmt::BindingDef(BindingDef {
name: "b".to_string(),
val: Expr::BindingUsage(BindingUsage {
name: "a".to_string(),
}),
}),
Stmt::Expr(Expr::BindingUsage(BindingUsage {
name: "b".to_string(),
})),
],
},
)),
);
}
// snip
}
The
a
b
in
let b = a
b
is being parsed as a function call, rather than two separate entities (the first of these is the right-hand side of the binding definition, and the other is the final return value of the block). This is happening because utils::sequence
(and by extension utils::sequence1
) use utils::extract_whitespace
internally, which consumes both spaces and newlines. Let’s create a separate utils::extract_non_newline_whitespace
, and also allow for a custom separator in utils::sequence
and utils::sequence1
. That function name is a bit ridiculous though, and it will likely only be used once. Because of this I think it makes more sense to just pass in the function as a one-time closure to utils::sequence1
in FuncCall::new
.
Let’s make utils::sequence
and utils::sequence1
accept a custom separator first:
// utils.rs
pub(crate) fn sequence<T>(
parser: impl Fn(&str) -> Result<(&str, T), String>,
separator_parser: impl Fn(&str) -> (&str, &str),
mut s: &str,
) -> Result<(&str, Vec<T>), String> {
let mut items = Vec::new();
while let Ok((new_s, item)) = parser(s) {
s = new_s;
items.push(item);
let (new_s, _) = separator_parser(s);
s = new_s;
}
Ok((s, items))
}
pub(crate) fn sequence1<T>(
parser: impl Fn(&str) -> Result<(&str, T), String>,
separator_parser: impl Fn(&str) -> (&str, &str),
s: &str,
) -> Result<(&str, Vec<T>), String> {
let (s, sequence) = sequence(parser, separator_parser, s)?;
if sequence.is_empty() {
Err("expected a sequence with more than one item".to_string())
} else {
Ok((s, sequence))
}
}
We now need to update all the usages of utils::sequence
and utils::sequence1
to pass in the separators they use:
// block.rs
impl Block {
pub(super) fn new(s: &str) -> Result<(&str, Self), String> {
let s = utils::tag("{", s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, stmts) = utils::sequence(Stmt::new, utils::extract_whitespace, s)?;
let (s, _) = utils::extract_whitespace(s);
let s = utils::tag("}", s)?;
Ok((s, Block { stmts }))
}
// snip
}
// func_def.rs
impl FuncDef {
pub(crate) fn new(s: &str) -> Result<(&str, Self), String> {
let s = utils::tag("fn", s)?;
let (s, _) = utils::extract_whitespace1(s)?;
let (s, name) = utils::extract_ident(s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, params) = utils::sequence(
|s| utils::extract_ident(s).map(|(s, ident)| (s, ident.to_string())),
utils::extract_whitespace,
s,
)?;
let s = utils::tag("=>", s)?;
let (s, _) = utils::extract_whitespace(s);
let (s, body) = Stmt::new(s)?;
Ok((
s,
Self {
name: name.to_string(),
params,
body: Box::new(body),
},
))
}
// snip
}
And now finally FuncCall::new
, where we pass in a closure that accepts only spaces, not newlines. Let’s also adjust the extraction of whitespace after parsing out the callee so that it stays consistent:
// func_call.rs
impl FuncCall {
pub(super) fn new(s: &str) -> Result<(&str, Self), String> {
let (s, callee) = utils::extract_ident(s)?;
let (s, _) = utils::take_while(|c| c == ' ', s);
let (s, params) = utils::sequence1(Expr::new, |s| utils::take_while(|c| c == ' ', s), s)?;
Ok((
s,
Self {
callee: callee.to_string(),
params,
},
))
}
}
We’re using the same take_while
function we used for extract_whitespace
. The only thing left is to make take_while
pub(crate)
so that the crate::expr::func_call
module can use it:
// utils.rs
pub(crate) fn take_while(accept: impl Fn(char) -> bool, s: &str) -> (&str, &str) {
let extracted_end = s
.char_indices()
.find_map(|(idx, c)| if accept(c) { None } else { Some(idx) })
.unwrap_or_else(|| s.len());
let extracted = &s[..extracted_end];
let remainder = &s[extracted_end..];
(remainder, extracted)
}
$ cargo t -q
running 54 tests
......................................................
test result: ok. 54 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Evaluation
Let’s start with a test:
// expr.rs
#[cfg(test)]
mod tests {
// snip
#[test]
fn eval_func_call() {
let mut env = Env::default();
env.store_func(
"add".to_string(),
vec!["x".to_string(), "y".to_string()],
Stmt::Expr(Expr::Operation {
lhs: Box::new(Expr::BindingUsage(BindingUsage {
name: "x".to_string(),
})),
rhs: Box::new(Expr::BindingUsage(BindingUsage {
name: "y".to_string(),
})),
op: Op::Add,
}),
);
assert_eq!(
Expr::FuncCall(FuncCall {
callee: "add".to_string(),
params: vec![Expr::Number(Number(2)), Expr::Number(Number(2))],
})
.eval(&env),
Ok(Val::Number(4)),
);
}
// snip
}
We declare a function with the name add
that takes two parameters, x
and y
, and has a body consisting of the addition of x
and y
. We then evaluate a call to add
with the parameters 2
and 2
, expecting that the result will be 4
. Running this test panics due to the todo!()
in Expr::eval
that we added earlier. Let’s remove that, instead delegating to an as-yet unimplemented method on FuncCall
:
impl Expr {
// snip
pub(crate) fn eval(&self, env: &Env) -> Result<Val, String> {
match self {
Self::Number(Number(n)) => Ok(Val::Number(*n)),
Self::Operation { lhs, rhs, op } => {
// snip
}
Self::FuncCall(func_call) => func_call.eval(env),
Self::BindingUsage(binding_usage) => binding_usage.eval(env),
Self::Block(block) => block.eval(env),
}
}
}
How do you even evaluate a function call? The approach we’ll use is to create a new child environment from the environment being passed in, and add bindings for each of the parameters of the function call. After doing that we can evaluate the body of the function being called, knowing that the parameters used in the body are in the environment:
// func_call.rs
impl FuncCall {
// snip
pub(super) fn eval(&self, env: &Env) -> Result<Val, String> {
let mut child_env = env.create_child();
let (param_names, body) = env.get_func(&self.callee).unwrap();
for (param_name, param_expr) in param_names.into_iter().zip(&self.params) {
let param_val = param_expr.eval(&child_env)?;
child_env.store_binding(param_name, param_val);
}
body.eval(&mut child_env)
}
}
$ cargo t -q
running 55 tests
.......................................................
test result: ok. 55 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Great! Let’s add another test to make sure we haven’t made any mistakes:
// still in func_call.rs
#[cfg(test)]
mod tests {
use super::super::{BindingUsage, Number};
use super::*;
use crate::stmt::Stmt;
// snip
#[test]
fn eval_func_call() {
let mut env = Env::default();
env.store_func(
"id".to_string(),
vec!["x".to_string()],
Stmt::Expr(Expr::BindingUsage(BindingUsage {
name: "x".to_string(),
})),
);
assert_eq!(
FuncCall {
callee: "id".to_string(),
params: vec![Expr::Number(Number(10))],
}
.eval(&env),
Ok(Val::Number(10)),
);
}
}
$ cargo t -q
running 56 tests
........................................................
test result: ok. 56 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
There’s something missing, though: error handling. There are three possible errors that can occur only when evaluating a function call:
- non-existent function
- too few parameters
- too many parameters
First, we’ll add a test for the case where the function being called doesn’t exist:
#[cfg(test)]
mod tests {
// snip
#[test]
fn eval_non_existent_func_call() {
let env = Env::default();
assert_eq!(
FuncCall {
callee: "i_dont_exist".to_string(),
params: vec![Expr::Number(Number(1))],
}
.eval(&env),
Err("function with name ‘i_dont_exist’ does not exist".to_string()),
);
}
}
To make this pass, we need to convert the .unwrap()
in FuncCall::eval
to a ?
so it forwards on the error message Env
helpfully creates for us:
impl FuncCall {
// snip
pub(super) fn eval(&self, env: &Env) -> Result<Val, String> {
let mut child_env = env.create_child();
let (param_names, body) = env.get_func(&self.callee)?;
for (param_name, param_expr) in param_names.into_iter().zip(&self.params) {
let param_val = param_expr.eval(&child_env)?;
child_env.store_binding(param_name, param_val);
}
body.eval(&mut child_env)
}
}
Next up is matching the number of parameters in the function call with the function definition. Let’s write a test for each possibility:
#[cfg(test)]
mod tests {
use super::super::{BindingUsage, Number, Op};
use super::*;
use crate::stmt::Stmt;
// snip
#[test]
fn eval_func_call_with_too_few_parameters() {
let mut env = Env::default();
env.store_func(
"mul".to_string(),
vec!["a".to_string(), "b".to_string()],
Stmt::Expr(Expr::Operation {
lhs: Box::new(Expr::BindingUsage(BindingUsage {
name: "a".to_string(),
})),
rhs: Box::new(Expr::BindingUsage(BindingUsage {
name: "b".to_string(),
})),
op: Op::Mul,
}),
);
assert_eq!(
FuncCall {
callee: "mul".to_string(),
params: vec![Expr::Number(Number(100))],
}
.eval(&env),
Err("expected 2 parameters, got 1".to_string()),
);
}
#[test]
fn eval_func_call_with_too_many_parameters() {
let mut env = Env::default();
env.store_func(
"square".to_string(),
vec!["n".to_string()],
Stmt::Expr(Expr::Operation {
lhs: Box::new(Expr::BindingUsage(BindingUsage {
name: "n".to_string(),
})),
rhs: Box::new(Expr::BindingUsage(BindingUsage {
name: "n".to_string(),
})),
op: Op::Mul,
}),
);
assert_eq!(
FuncCall {
callee: "square".to_string(),
params: vec![Expr::Number(Number(5)), Expr::Number(Number(42))],
}
.eval(&env),
Err("expected 1 parameters, got 2".to_string()),
);
}
}
The code may not be pretty, but I guess it gets the job done. Here’s how we can make those two tests pass:
impl FuncCall {
// snip
pub(super) fn eval(&self, env: &Env) -> Result<Val, String> {
let mut child_env = env.create_child();
let (param_names, body) = env.get_func(&self.callee)?;
let num_expected_params = param_names.len();
let num_actual_params = self.params.len();
if num_expected_params != num_actual_params {
return Err(format!(
"expected {} parameters, got {}",
num_expected_params, num_actual_params,
));
}
for (param_name, param_expr) in param_names.into_iter().zip(&self.params) {
let param_val = param_expr.eval(&child_env)?;
child_env.store_binding(param_name, param_val);
}
body.eval(&mut child_env)
}
}
$ cargo t -q
running 59 tests
...........................................................
test result: ok. 59 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
We still haven’t implemented function calls with no parameters, though. Eldiro doesn’t currently have side effects of any kind, so we can’t test whether a parameter-less function call has taken place. Regardless, let’s implement the feature:
// binding_usage.rs
impl BindingUsage {
// snip
pub(super) fn eval(&self, env: &Env) -> Result<Val, String> {
env.get_binding(&self.name).or_else(|error_msg| {
if env.get_func(&self.name).is_ok() {
FuncCall {
callee: self.name.clone(),
params: Vec::new(),
}
.eval(env)
} else {
Err(error_msg)
}
})
}
}
What this is doing is first trying to obtain the binding usage just as we would do normally; however, if it cannot be found, the error message is saved, and, if the environment contains a function with the same name as the binding usage, the corresponding function call is constructed and evaluated. If this function doesn’t exist, then the error message from the binding usage is used. This prevents using a non-existent binding and getting an error message about a non-existent function instead.
Let’s run our test suite to see if we’ve broken anything in the process:
$ cargo t -q
running 59 tests
...........................................................
test result: ok. 59 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
And with that, this part is done. Good job on getting this far. If you like, you can stop here, and keep adding features to your implementation of Eldiro. If you want to continue following this series, read on.
Problems with the current implementation
The implementation of Eldiro we have been writing together has a number of problems that have either arisen due to
- my unwillingness to refactor throughout the series (why waste readers’ time doing menial refactoring when you could be explaining interesting topics?)
- a lack of planning, or
- the ‘worse’ way being easier to understand and teach
Here’s a list of everything ‘wrong’ with it at the moment:
- Parsing something requires going through all the options until one that parses successfully is found. This is slower than looking at the input and determining which option the input is representing (this is called predictive parsing), since the same part of the input is examined multiple times.
crate::expr::Number
andcrate::expr::BindingDef
should be ‘inlined’ intoExpr
’s definition rather than in their own structures, since their implementations are so short that making them separate results in too much boilerplate for the separation to be worth the trouble.Expr::new_number
could similarly be inlined intoExpr::new_non_operation
, which also makes it more consistent with how all the other non-Operation
variants ofExpr
are parsed.- Error messages don’t have location information.
- Parsing stops at the first error, rather than recovering and attempting to parse as much out of the input as possible.
- Operations can’t be nested (for example,
6 + 2 * 3
doesn’t parse). - Error messages are strings, rather than proper structured data.
The basic architecture that Eldiro is currently using is flawed – by working with the underlying text directly, it forces the language to re-examine the input again and again during parsing. This also makes it more difficult to recover from errors. Furthermore, if we ever want to implement static analysis (analysing code without running it; for example, a squiggly underline for an undefined binding in an editor), we need to be able to represent incomplete code. Take BindingDef
as an example. In its current form, both a name and a value have to be present:
#[derive(Debug, Clone, PartialEq)]
pub(crate) struct BindingDef {
pub(crate) name: String,
pub(crate) val: Expr,
}
Let’s say we’re in the middle of typing let a =
in the editor – how is our hypothetical static analysis tooling meant to represent that code internally? It isn’t possible with our current architecture. And no, wrapping all the fields of our data structures in Option
s isn’t feasible, either. Moreover, to display squigglies (and to have remotely helpful error messages) we need to store location information, which is tedious. Imagine having to add a field span
to every single data structure!
The solution to some of these problems is to use Rowan, a library that lets us represent incomplete code. In fact, it represents all text losslessly, which makes it well-suited to tooling that runs in a user’s editor. rust-analyzer uses Rowan internally, which is based on Roslyn and lib/Syntax from Swift.
Using Rowan mandates that we use a lexer, which is a program that takes a string as an input and slices it up into little pieces – here’s an example relevant to Eldiro:
fn add x y => x + y
let result = add 5 5
might be lexed, or tokenised, as
FnKw ("fn")
Ident ("add")
Ident ("x")
Ident ("y")
FatArrow ("=>")
Ident ("x")
Plus ("+")
Ident ("y")
Eol ("\n")
LetKw ("let")
Ident ("result")
Equals ("=")
Ident ("add")
Number ("5")
Number ("5")
Note how, unlike the structured output of a parser, the output of the lexer is flat. Note also how all whitespace (apart from the newline) has been removed from the output. This is the traditional choice, as most languages don’t have to worry about whitespace. In our case, though, we want to represent the input fully so editor features that work closely with the text (such as automatic refactoring and ‘expand selection’) is easier. As such, our lexer will include all whitespace.
The next part of this series will be the start of a rewrite: we’ll begin by wiping the project clean,1 and go on to write a lexer and an error-resilient parser for fully nested mathematical expressions, complete with good error messages. No more rewrites or sloppy code. We’re doing this for real.
I know full rewrites are inadvisable, but from what little we’ve written (by my count Eldiro contains 1,219 source lines at this point) there is barely anything salvageable. Maybe
Val
, and the basic structure of the REPL? That’s easy enough to rewrite later, though. ↩︎