Last time, we made a parser for simple unnested mathematical expressions, such as 1+1
or 3*4
. In this post, we’ll add support for whitespace (so that users of Eldiro will be able to use 2 + 2
instead of 2+2
).
We can achieve this by creating an extract_whitespace
function similar to extract_digits
:
// utils.rs
// Let’s copy-paste from extract_digits
pub(crate) fn extract_whitespace(s: &str) -> (&str, &str) {
let whitespace_end = s
.char_indices()
.find_map(|(idx, c)| if c == ' ' { None } else { Some(idx) })
.unwrap_or_else(|| s.len());
let whitespace = &s[..whitespace_end];
let remainder = &s[whitespace_end..];
(remainder, whitespace)
}
#[cfg(test)]
mod tests {
// snip
#[test]
fn extract_spaces() {
assert_eq!(extract_whitespace(" 1"), ("1", " "));
}
}
Although this does indeed work, it involves quite a bit of repetition. What if we want a similar function for, say, extracting variable names? Instead of copy-pasting extract_digits
yet again, we can define a take_while
function with the following signature:
fn take_while(accept: impl Fn(char) -> bool, s: &str) -> (&str, &str)
take_while
takes a function called accept
that determines whether to accept a given character or not. Once it encounters a character that is not accepted, it stops consuming characters and returns.
pub(crate) fn take_while(accept: impl Fn(char) -> bool, s: &str) -> (&str, &str) {
let extracted_end = s
.char_indices()
.find_map(|(idx, c)| if accept(c) { None } else { Some(idx) })
.unwrap_or_else(|| s.len());
let extracted = &s[..extracted_end];
let remainder = &s[extracted_end..];
(remainder, extracted)
}
We can now redefine extract_digits
and extract_whitespace
in terms of take_while
:
pub(crate) fn extract_digits(s: &str) -> (&str, &str) {
take_while(|c| c.is_ascii_digit(), s)
}
pub(crate) fn extract_whitespace(s: &str) -> (&str, &str) {
take_while(|c| c == ' ', s)
}
And our tests still pass!
To actually allow whitespace inside expressions, we can call extract_whitespace
while throwing away the output:
// lib.rs
impl Expr {
pub fn new(s: &str) -> (&str, Self) {
let (s, lhs) = Number::new(s);
let (s, _) = utils::extract_whitespace(s);
let (s, op) = Op::new(s);
let (s, _) = utils::extract_whitespace(s);
let (s, rhs) = Number::new(s);
(s, Self { lhs, rhs, op })
}
}
#[cfg(test)]
mod tests {
// snip
#[test]
fn parse_expr_with_whitespace() {
assert_eq!(
Expr::new("2 * 2"),
(
"",
Expr {
lhs: Number(2),
rhs: Number(2),
op: Op::Mul,
},
),
);
}
}
Note that we aren’t using let _ = utils::extract_whitespace(s)
, as that would throw away the s
that we get back, thereby not progressing through the input. Instead, by using let (s, _) = ...
we keep advancing, but throw away the actual whitespace we extracted.
In the next post, Eldiro will gain the ability to create variables. This involves both parsing, as well as the beginnings of an interpreter.