return to main site

Part Four: Backtracking

In this part of the series we’ll start with the goal of allowing Exprs to be numbers. In case you’ve forgotten, at the moment Exprs have to be mathematical operations, making simple things like let x = 5 impossible.

Let’s start by hopping over to Expr’s definition in expr.rs, and changing it to be an enum that can hold either a mathematical operation, or a number:

#[derive(Debug, PartialEq)]
pub enum Expr {
    Number(Number),
    Operation { lhs: Number, rhs: Number, op: Op },
}

We’ve got errors appearing all over the place; let’s fix parsing first. Now, when we’re parsing an Expr, we have multiple possibilities – we need to figure out whether we are parsing a number or an operation.

Here’s the current (broken) code for parsing an Expr:

impl Expr {
    pub fn new(s: &str) -> (&str, Self) {
        let (s, lhs) = Number::new(s);
        let (s, _) = utils::extract_whitespace(s);

        let (s, op) = Op::new(s);
        let (s, _) = utils::extract_whitespace(s);

        let (s, rhs) = Number::new(s);

        (s, Self { lhs, rhs, op })
    }

    // snip
}

We first parse a number, then whitespace, then an operator, then whitespace, and finally another number. If the input to the parser is just a number, the point at which it’ll fail is when it tries to parse the operator, because s by then will be an empty string. Note that it won’t fail to parse the whitespace after the number no matter what because utils::extract_whitespace allows an absence of whitespace.

Let’s take a look at Op::new to see how it’ll fail:

impl Op {
    pub fn new(s: &str) -> (&str, Self) {
        let (s, op) = utils::extract_op(s);

        let op = match op {
            "+" => Self::Add,
            "-" => Self::Sub,
            "*" => Self::Mul,
            "/" => Self::Div,
            _ => unreachable!(),
        };

        (s, op)
    }
}

I wonder what happens if utils::extract_op is given an empty string (as s will be if we’re trying to parse a number with Expr::new)?

// utils.rs

pub(crate) fn extract_op(s: &str) -> (&str, &str) {
    match &s[0..1] {
        "+" | "-" | "*" | "/" => {}
        _ => panic!("bad operator"),
    }

    (&s[1..], &s[0..1])
}

Ahah! Given an empty string, utils::extract_op will try to index one byte into the (empty) input – as seen in &s[0..1] – which is out of bounds, causing a panic. As you might recall from The Book, panic is for unrecoverable errors. Out of bounds indexing is indeed unrecoverable, but trying to parse an operation when you actually have a number? That’s recoverable.

The parser should be able to recover from the error and realise that, if the operator is not present, we should try parsing a number before giving up. What we need is some way to try a parser, and, if it fails, backtrack and try the next possibility. Three features of Rust and its standard library make this trivial to implement:

We’ll start by changing all our parser tests to expect values wrapped in Ok. While we’re doing this, we might as well replace all references to Expr in our tests with Expr::Operation to account for the changes we made to Expr earlier. Here’s binding_def.rs’s new test suite:

#[cfg(test)]
mod tests {
    use super::*;
    use crate::expr::{Number, Op};

    #[test]
    fn parse_binding_def() {
        assert_eq!(
            BindingDef::new("let a = 10 / 2"),
            Ok((
                "",
                BindingDef {
                    name: "a".to_string(),
                    val: Expr::Operation {
                        lhs: Number(10),
                        rhs: Number(2),
                        op: Op::Div,
                    },
                },
            )),
        );
    }
}

Here are expr.rs’s tests:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parse_number() {
        assert_eq!(Number::new("123"), Ok(("", Number(123))));
    }

    #[test]
    fn parse_add_op() {
        assert_eq!(Op::new("+"), Ok(("", Op::Add)));
    }

    #[test]
    fn parse_sub_op() {
        assert_eq!(Op::new("-"), Ok(("", Op::Sub)));
    }

    #[test]
    fn parse_mul_op() {
        assert_eq!(Op::new("*"), Ok(("", Op::Mul)));
    }

    #[test]
    fn parse_div_op() {
        assert_eq!(Op::new("/"), Ok(("", Op::Div)));
    }

    #[test]
    fn parse_one_plus_two() {
        assert_eq!(
            Expr::new("1+2"),
            Ok((
                "",
                Expr::Operation {
                    lhs: Number(1),
                    rhs: Number(2),
                    op: Op::Add,
                },
            )),
        );
    }

    #[test]
    fn parse_expr_with_whitespace() {
        assert_eq!(
            Expr::new("2 * 2"),
            Ok((
                "",
                Expr::Operation {
                    lhs: Number(2),
                    rhs: Number(2),
                    op: Op::Mul,
                },
            )),
        );
    }

    #[test]
    fn eval_add() {
        assert_eq!(
            Expr::Operation {
                lhs: Number(10),
                rhs: Number(10),
                op: Op::Add,
            }
            .eval(),
            Val::Number(20),
        );
    }

    #[test]
    fn eval_sub() {
        assert_eq!(
            Expr::Operation {
                lhs: Number(1),
                rhs: Number(5),
                op: Op::Sub,
            }
            .eval(),
            Val::Number(-4),
        );
    }

    #[test]
    fn eval_mul() {
        assert_eq!(
            Expr::Operation {
                lhs: Number(5),
                rhs: Number(6),
                op: Op::Mul,
            }
            .eval(),
            Val::Number(30),
        );
    }

    #[test]
    fn eval_div() {
        assert_eq!(
            Expr::Operation {
                lhs: Number(200),
                rhs: Number(20),
                op: Op::Div,
            }
            .eval(),
            Val::Number(10),
        );
    }
}

All our extractors from utils.rs cannot fail , so we don’t need to update their tests. (This isn’t true, but just play along for the moment. I’m trying to make this follow my actual workflow when writing code, mistakes and all!) tag, though, can fail, so we need to update its test:

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn tag_word() {
        assert_eq!(tag("let", "let a"), Ok(" a"));
    }
}

Each and every single one of those test changes is another compilation error to add onto the pile – we better start fixing them!

Arbitrarily, I have chosen to start at tag, since I still have utils.rs open from a moment ago. Here’s tag’s source to remind you:

pub(crate) fn tag<'a, 'b>(starting_text: &'a str, s: &'b str) -> &'b str {
    if s.starts_with(starting_text) {
        &s[starting_text.len()..]
    } else {
        panic!("expected {}", starting_text);
    }
}

This fix is easy! Rather than panicking, we’ll return Result’s Err variant:

pub(crate) fn tag<'a, 'b>(starting_text: &'a str, s: &'b str) -> &'b str {
    if s.starts_with(starting_text) {
        &s[starting_text.len()..]
    } else {
        Err(format!("expected {}", starting_text))
    }
}

We still need to wrap the happy path in Ok and change the return type:

pub(crate) fn tag<'a, 'b>(starting_text: &'a str, s: &'b str) -> Result<&'b str, String> {
    if s.starts_with(starting_text) {
        Ok(&s[starting_text.len()..])
    } else {
        Err(format!("expected {}", starting_text))
    }
}

Nice! Let’s move on to Expr::new. Again, here’s the current definition:

impl Expr {
    pub fn new(s: &str) -> (&str, Self) {
        let (s, lhs) = Number::new(s);
        let (s, _) = utils::extract_whitespace(s);

        let (s, op) = Op::new(s);
        let (s, _) = utils::extract_whitespace(s);

        let (s, rhs) = Number::new(s);

        (s, Self { lhs, rhs, op })
    }

    // snip
}

Update the return type and wrap the final expression in Ok:

impl Expr {
    pub fn new(s: &str) -> Result<(&str, Self), String> {
        let (s, lhs) = Number::new(s);
        let (s, _) = utils::extract_whitespace(s);

        let (s, op) = Op::new(s);
        let (s, _) = utils::extract_whitespace(s);

        let (s, rhs) = Number::new(s);

        Ok((s, Self { lhs, rhs, op }))
    }

    // snip
}

Notice that (apart from the compile errors introduced by changing Expr from only representing operations to also representing numbers) this compiles, without us having to use ? anywhere. This means that this function doesn’t return an Err anywhere. That’s definitely incorrect. After all, isn’t parsing (for example) an Op meant to return an error if the input does not begin with an operator?

We need to keep moving, though, so let’s go onwards to Op. Here’s the current version:

impl Op {
    pub fn new(s: &str) -> (&str, Self) {
        let (s, op) = utils::extract_op(s);

        let op = match op {
            "+" => Self::Add,
            "-" => Self::Sub,
            "*" => Self::Mul,
            "/" => Self::Div,
            _ => unreachable!(),
        };

        (s, op)
    }
}

We could do the same thing to Op::new as what we just did to Expr::new, but that seemed wrong. Let’s think about how we want Op::new to behave: if it is given an input that doesn’t start with an operator, it should return Err; otherwise, it should return an Op wrapped in Ok.

Where in this function is the case where it has encountered a non-operator? It’s not on the line with unreachable!, as that case is impossible (utils::extract_op only returns "+". "-", "*" or "/"). Hang on a second. What did I say?

utils::extract_op only returns "+". "-", "*" or "/"

That means that utils::extract_op must handle the error case! And indeed, if we look at utils::extract_op, we see the offending panic!:

pub(crate) fn extract_op(s: &str) -> (&str, &str) {
    match &s[0..1] {
        "+" | "-" | "*" | "/" => {}
        _ => panic!("bad operator"),
    }

    (&s[1..], &s[0..1])
}

Now, we could modify this in the same way we did utils::tag, but that’s unnecessary: utils::tag can perform the same task that extract_op does. We can use that or_else method from earlier to try tagging "+", and if that fails, try the next alternative, and so on. Let’s get rid of utils::extract_op and all its tests, in favour of rewriting Op::new to use utils::tag instead:

impl Op {
    pub fn new(s: &str) -> Result<(&str, Self), String> {
        utils::tag("+", s)
            .map(|s| (s, Self::Add))
            .or_else(|_| utils::tag("-", s).map(|s| (s, Self::Sub)))
            .or_else(|_| utils::tag("*", s).map(|s| (s, Self::Mul)))
            .or_else(|_| utils::tag("/", s).map(|s| (s, Self::Div)))
    }
}

Cool! Next up is Number::new:

impl Number {
    pub fn new(s: &str) -> (&str, Self) {
        let (s, number) = utils::extract_digits(s);
        (s, Self(number.parse().unwrap()))
    }
}

This is different to Op::new, in that there is no error case.2 If we look at utils::extract_digits, we see that it uses our take_while convenience function:

pub(crate) fn extract_digits(s: &str) -> (&str, &str) {
    take_while(|c| c.is_ascii_digit(), s)
}

The problem with the usage of take_while in this case is that it is completely happy with an input that is of zero length. For example, if we call extract_digits("hello"), we get back ("hello", ""). Although this may be what we want in some cases (e.g. whitespace is often optional, so extract_whitespace not returning an error when it fails to extract any whitespace makes sense), it isn’t what we want in this case. Let’s write a wrapper around take_while that returns an error if the extracted portion is empty:3

pub(crate) fn take_while1(
    accept: impl Fn(char) -> bool,
    s: &str,
    error_msg: String,
) -> Result<(&str, &str), String> {
    let (remainder, extracted) = take_while(accept, s);

    if extracted.is_empty() {
        Err(error_msg)
    } else {
        Ok((remainder, extracted))
    }
}

Let’s use this in extract_digits:

pub(crate) fn extract_digits(s: &str) -> Result<(&str, &str), String> {
    take_while1(|c| c.is_ascii_digit(), s, "expected digits".to_string())
}

To whittle down those compiler errors, let’s Ok-wrap the expected values of all the extract_digits tests we have:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn extract_one_digit() {
        assert_eq!(extract_digits("1+2"), Ok(("+2", "1")));
    }

    #[test]
    fn extract_multiple_digits() {
        assert_eq!(extract_digits("10-20"), Ok(("-20", "10")));
    }

    #[test]
    fn do_not_extract_anything_from_empty_input() {
        assert_eq!(extract_digits(""), Ok(("", "")));
    }

    #[test]
    fn extract_digits_with_no_remainder() {
        assert_eq!(extract_digits("100"), Ok(("", "100")));
    }

    // snip
}

We don’t want that do_not_extract_anything_from_empty_input test any more, so let’s replace it with a test that verifies that extract_digits errors out on input that doesn’t start with a digit:

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn do_not_extract_digits_when_input_is_invalid() {
        assert_eq!(extract_digits("abcd"), Err("expected digits".to_string()));
    }

    // snip
}

After all that we’re finally ready to fix the implementation of Number::new:

impl Number {
    pub fn new(s: &str) -> Result<(&str, Self), String> {
        let (s, number) = utils::extract_digits(s)?;
        Ok((s, Self(number.parse().unwrap())))
    }
}

Now that all (well, not all – you’ll see) the basic extractors and parsers that can fail have been explicitly marked as such (by returning Result), we can fix the two main sources of compiler errors left: Expr::new, and BindingDef::new.

Starting with BindingDef::new, let’s remind ourselves of the present implementation:

impl BindingDef {
    pub fn new(s: &str) -> (&str, Self) {
        let s = utils::tag("let", s);
        let (s, _) = utils::extract_whitespace(s);

        let (s, name) = utils::extract_ident(s);
        let (s, _) = utils::extract_whitespace(s);

        let s = utils::tag("=", s);
        let (s, _) = utils::extract_whitespace(s);

        let (s, val) = Expr::new(s);

        (
            s,
            Self {
                name: name.to_string(),
                val,
            },
        )
    }

    // snip
}

Fixing this is trivial: all we need to do is sprinkle in some ?s, and add an Ok and a Result.

impl BindingDef {
    pub fn new(s: &str) -> Result<(&str, Self), String> {
        let s = utils::tag("let", s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, name) = utils::extract_ident(s);
        let (s, _) = utils::extract_whitespace(s);

        let s = utils::tag("=", s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, val) = Expr::new(s)?;

        Ok((
            s,
            Self {
                name: name.to_string(),
                val,
            },
        ))
    }

    // snip
}

Now for the big one: Expr::new. We’ll need to put all our newfound knowledge to use, and implement backtracking. Once again, here’s what we’re starting from:

impl Expr {
    pub fn new(s: &str) -> Result<(&str, Self), String> {
        let (s, lhs) = Number::new(s);
        let (s, _) = utils::extract_whitespace(s);

        let (s, op) = Op::new(s);
        let (s, _) = utils::extract_whitespace(s);

        let (s, rhs) = Number::new(s);

        Ok((s, Self { lhs, rhs, op }))
    }

    // snip
}

The first and most obvious mistake is the lack of ?s in all the places where parsers or extractors can fail. Let’s add those:

impl Expr {
    pub fn new(s: &str) -> Result<(&str, Self), String> {
        let (s, lhs) = Number::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, op) = Op::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, rhs) = Number::new(s)?;

        Ok((s, Self { lhs, rhs, op }))
    }

    // snip
}

The next issue is how we’re treating Self like it was before we made it into an enum:

impl Expr {
    pub fn new(s: &str) -> Result<(&str, Self), String> {
        let (s, lhs) = Number::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, op) = Op::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, rhs) = Number::new(s)?;

        //      Here ↓
        Ok((s, Self::Operation { lhs, rhs, op }))
    }

    // snip
}

Although it compiles now, Expr::new doesn’t handle the case where the input is just a number, not an operation. Thanks to our use of Result everywhere, this will be easy to implement. First, though, we should add a test:

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn parse_number_as_expr() {
        assert_eq!(Expr::new("456"), Ok(("", Expr::Number(Number(456)))));
    }

    // snip
}

Sadly, Eldiro is still not compiling. We’ll violate the laws of TDD (as I have been doing so flagrantly throughout this part) and write the implementation needed to make this test pass, rather than fixing what’s been broken all this time. Let’s start by separating out all the code needed to parse an Expr::Operation into its own method:

impl Expr {
    pub fn new(s: &str) -> Result<(&str, Self), String> {
        Self::new_operation(s)
    }

    fn new_operation(s: &str) -> Result<(&str, Self), String> {
        let (s, lhs) = Number::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, op) = Op::new(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, rhs) = Number::new(s)?;

        Ok((s, Self::Operation { lhs, rhs, op }))
    }

    // snip
}

Next, we need to add a method for parsing a number:

impl Expr {
    // snip

    fn new_number(s: &str) -> Result<(&str, Self), String> {
        Number::new(s).map(|(s, number)| (s, Self::Number(number)))
    }

    // snip
}

And, finally, to bring it all together, let’s add the magic line to Expr::new so that it’ll first try parsing an operation, and if that fails try parsing a number:

impl Expr {
    pub fn new(s: &str) -> Result<(&str, Self), String> {
        Self::new_operation(s).or_else(|_| Self::new_number(s))
    }

    // snip
}

Phew! It’s finally done. To see if we’ve done it correctly, we need to run our tests. To run our tests, we need Eldiro to compile. The only remaining source of compilation errors left is Expr::eval, which is a holdover from before Expr was an enum. This is easy to fix:

impl Expr {
    // snip

    pub(crate) fn eval(&self) -> Val {
        match self {
            Self::Number(Number(n)) => Val::Number(*n),
            Self::Operation { lhs, rhs, op } => {
                let Number(lhs) = lhs;
                let Number(rhs) = rhs;

                let result = match op {
                    Op::Add => lhs + rhs,
                    Op::Sub => lhs - rhs,
                    Op::Mul => lhs * rhs,
                    Op::Div => lhs / rhs,
                };

                Val::Number(result)
            }
        }
    }
}

The moment of truth has arrived!

$ cargo t
   Compiling eldiro v0.1.0 (/home/me/src/eldiro)
warning: associated function is never used: `eval`
  --> src/binding_def.rs:33:19
   |
33 |     pub(crate) fn eval(&self, env: &mut Env) {
   |                   ^^^^
   |
   = note: `#[warn(dead_code)]` on by default

warning: associated function is never used: `eval`
  --> src/expr.rs:59:19
   |
59 |     pub(crate) fn eval(&self) -> Val {
   |                   ^^^^

warning: associated function is never used: `store_binding`
  --> src/env.rs:10:19
   |
10 |     pub(crate) fn store_binding(&mut self, name: String, val: Val) {
   |                   ^^^^^^^^^^^^^

warning: 3 warnings emitted

warning: associated function is never used: `eval`
  --> src/binding_def.rs:33:19
   |
33 |     pub(crate) fn eval(&self, env: &mut Env) {
   |                   ^^^^
   |
   = note: `#[warn(dead_code)]` on by default

warning: associated function is never used: `store_binding`
  --> src/env.rs:10:19
   |
10 |     pub(crate) fn store_binding(&mut self, name: String, val: Val) {
   |                   ^^^^^^^^^^^^^

warning: 2 warnings emitted

    Finished test [unoptimized + debuginfo] target(s) in 1.10s
     Running /home/me/.cache/cargo-target/debug/deps/eldiro-4b82c3c57a78933f

running 22 tests
test expr::tests::eval_add ... ok
test binding_def::tests::parse_binding_def ... ok
test expr::tests::eval_div ... ok
test expr::tests::eval_mul ... ok
test expr::tests::eval_sub ... ok
test expr::tests::parse_add_op ... ok
test expr::tests::parse_div_op ... ok
test expr::tests::parse_expr_with_whitespace ... ok
test expr::tests::parse_mul_op ... ok
test expr::tests::parse_number ... ok
test expr::tests::parse_number_as_expr ... ok
test expr::tests::parse_sub_op ... ok
test expr::tests::parse_one_plus_two ... ok
test utils::tests::cannot_extract_ident_beginning_with_number ... ok
test utils::tests::extract_alphabetic_ident ... ok
test utils::tests::do_not_extract_digits_when_input_is_invalid ... ok
test utils::tests::extract_alphanumeric_ident ... ok
test utils::tests::extract_digits_with_no_remainder ... ok
test utils::tests::extract_multiple_digits ... ok
test utils::tests::extract_one_digit ... ok
test utils::tests::extract_spaces ... ok
test utils::tests::tag_word ... ok

test result: ok. 22 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Woohoo!

I lied

It’s true. All along in this series, I have not mentioned several serious bugs in our parser, hoping no-one would notice. However, amosonn noticed and reported the bug to me. If only they hadn’t seen! Now that someone’s mentioned it, though, I am obliged to fix it. In keeping with this theme of redemption, hopefully this section will be more test-driven and less messy than what we just went through.

Here’s the bug: the input letaaa=1+2 is parsed by BindingDef::new, without any errors whatsoever. The problem is that the usage of utils::extract_whitespace in BindingDef::new should require whitespace after the let keyword, when currently utils::extract_whitespace accepts inputs that don’t begin with whitespace. What’s more, let = 1+2 is also parsed without issue. This stems from a similar problem, this time with utils::extract_ident.

Let’s fix these issues, starting with the second. Take a look at the test titled cannot_extract_ident_beginning_with_number in utils.rs, as it is the scenario we want to test:

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn cannot_extract_ident_beginning_with_number() {
        assert_eq!(extract_ident("123abc"), ("123abc", ""));
    }

    // snip
}

Instead of just not consuming any input, we want to return an error:

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn cannot_extract_ident_beginning_with_number() {
        assert_eq!(
            extract_ident("123abc"),
            Err("expected identifier".to_string()),
        );
    }

    // snip
}

Let’s examine the definition of extract_ident so we can get an idea of how to solve this issue:

pub(crate) fn extract_ident(s: &str) -> (&str, &str) {
    let input_starts_with_alphabetic = s
        .chars()
        .next()
        .map(|c| c.is_ascii_alphabetic())
        .unwrap_or(false);

    if input_starts_with_alphabetic {
        take_while(|c| c.is_ascii_alphanumeric(), s)
    } else {
        (s, "")
    }
}

Ah, so what we want to do is, instead of returning the entire input as leftover if the input doesn’t start with an alphabetic character, we should return an error message:

pub(crate) fn extract_ident(s: &str) -> (&str, &str) {
    let input_starts_with_alphabetic = s
        .chars()
        .next()
        .map(|c| c.is_ascii_alphabetic())
        .unwrap_or(false);

    if input_starts_with_alphabetic {
        take_while(|c| c.is_ascii_alphanumeric(), s)
    } else {
        Err("expected identifier".to_string())
    }
}

We still need to change the other case of the if expression and the return type, though:

pub(crate) fn extract_ident(s: &str) -> Result<(&str, &str), String> {
    let input_starts_with_alphabetic = s
        .chars()
        .next()
        .map(|c| c.is_ascii_alphabetic())
        .unwrap_or(false);

    if input_starts_with_alphabetic {
        Ok(take_while(|c| c.is_ascii_alphanumeric(), s))
    } else {
        Err("expected identifier".to_string())
    }
}

Now extract_ident compiles, but its tests don’t! We need to Ok-wrap them:

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn extract_alphabetic_ident() {
        assert_eq!(extract_ident("abcdEFG stop"), Ok((" stop", "abcdEFG")));
    }

    #[test]
    fn extract_alphanumeric_ident() {
        assert_eq!(extract_ident("foobar1()"), Ok(("()", "foobar1")));
    }

    // snip
}

The last compilation error left is in BindingDef::new, where we need to add a ?:

impl BindingDef {
    pub fn new(s: &str) -> Result<(&str, Self), String> {
        let s = utils::tag("let", s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, name) = utils::extract_ident(s)?; // here
        let (s, _) = utils::extract_whitespace(s);

        let s = utils::tag("=", s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, val) = Expr::new(s)?;

        Ok((
            s,
            Self {
                name: name.to_string(),
                val,
            },
        ))
    }

    // snip
}

Let’s see if it works:

$ cargo t -q
running 22 tests
......................
test result: ok. 22 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

We’re almost done, now – all that’s left is fixing the handling of whitespace.

Here’s the definition of utils::extract_whitespace:

pub(crate) fn extract_whitespace(s: &str) -> (&str, &str) {
    take_while(|c| c == ' ', s)
}

At first you might think that the fix for this is as simple as swapping take_while for take_while1. This will lead to other issues, though, as the = in let a = 10 doesn’t need spaces around it, nor does the + in 3 + 4. We need a separate extractor for when whitespace is required, rather than optional. Let’s write a test that represents the behaviour we’d like:

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn do_not_extract_spaces1_when_input_does_not_start_with_them() {
        assert_eq!(
            extract_whitespace1("blah"),
            Err("expected a space".to_string()),
        );
    }

    // snip
}

The implementation that makes this pass is as simple as this:

pub(crate) fn extract_whitespace1(s: &str) -> Result<(&str, &str), String> {
    take_while1(|c| c == ' ', s, "expected a space".to_string())
}

All that’s left is to use it in BindingDef::new (the only parser so far where spaces are definitely required):

impl BindingDef {
    pub fn new(s: &str) -> Result<(&str, Self), String> {
        let s = utils::tag("let", s)?;
        let (s, _) = utils::extract_whitespace1(s)?; // New!

        let (s, name) = utils::extract_ident(s)?;
        let (s, _) = utils::extract_whitespace(s);

        let s = utils::tag("=", s)?;
        let (s, _) = utils::extract_whitespace(s);

        let (s, val) = Expr::new(s)?;

        Ok((
            s,
            Self {
                name: name.to_string(),
                val,
            },
        ))
    }

    // snip
}

Notice how utils::extract_whitespace1 can fail, while utils::extract_whitespace cannot.

To be sure we’ve fixed the issue, let’s add a test:

#[cfg(test)]
mod tests {
    // snip

    #[test]
    fn cannot_parse_binding_def_without_space_after_let() {
        assert_eq!(
            BindingDef::new("letaaa=1+2"),
            Err("expected a space".to_string()),
        );
    }
}

And …

$ cargo t -q
running 24 tests
........................
test result: ok. 24 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

We’re done! See you all next time, when we’ll add support for using the bindings we can create with let.


  1. ? can have its behaviour customised for any type on nightly as of the time of writing. In fact, ? works with Option on stable Rust today! ↩︎

  2. That .unwrap() might look suspicious, but it’s actually never going to trigger because the string that utils::extract_digits returns always consists entirely of digits, meaning that parsing of that string as an integer will never fail. ↩︎

  3. The name for take_while1 is meant as ‘take_while, with the extracted text being at least one byte long’. As with tag and take_while, the name comes from Nom↩︎