glindo/parsers
Primitives and Combinators
Parser functions take their configuration (such as a pattern or other parsers) and return a function that parses a string from a given state.
All parsers are curried functions that return a function that takes in the input state. For convenience, the run() function can be used to build the state and run the parser.
Unless you are composing parsers, it is better to use run() with the curried function.
Parsers can either be run by themselves with a constructed ParserState(a), or called using the run() function, which will build the ParserState(a) for you. The run() function takes in the returned parser function and the string to parse, and has the signature run(Parser(a), String).
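To make the relationship between a parser, its state, and run() concrete, here is a minimal, self-contained sketch. The ParserState and ParseResult definitions (including the non-generic state and the field names str, res, rem, and idx) are assumptions inferred from the examples on this page, not glindo's actual source.

```gleam
import gleam/string

// Hypothetical stand-ins for glindo's types; the real definitions may differ.
pub type ParserState {
  ParserState(str: String, idx: Int)
}

pub type ParseResult(a) {
  ParseResult(res: a, rem: String, idx: Int)
}

pub type Parser(a) =
  fn(ParserState) -> Result(ParseResult(a), String)

// A curried parser: calling chr_grab() returns the function that takes the state.
pub fn chr_grab() -> Parser(String) {
  fn(state: ParserState) {
    case string.pop_grapheme(state.str) {
      Ok(#(first, rest)) ->
        Ok(ParseResult(res: first, rem: rest, idx: state.idx + 1))
      Error(_) -> Error("Error: expected char, found none")
    }
  }
}

// run() builds the initial state so callers only supply the input string.
pub fn run(parser: Parser(a), input: String) -> Result(ParseResult(a), String) {
  parser(ParserState(str: input, idx: 0))
}

pub fn main() {
  // Running by hand with a constructed state...
  let by_hand = chr_grab()(ParserState(str: "character", idx: 0))
  // ...gives the same result as letting run() build the state.
  let via_run = run(chr_grab(), "character")
  let assert True = by_hand == via_run
  via_run
}
```

Under these assumptions, run() is only a thin wrapper, which is why composing parsers works on the curried functions directly while end users call run().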
Functions
pub fn bind(
parser: Parser(a),
fnc: fn(a) -> Parser(b),
) -> Parser(b)
This generic parser combinator takes a parser and, on a successful run, feeds the result of that parser into a function that takes a value of type a and uses it to construct a parser of type b.
Example
bind(str("chicken"), fn(_) { chr_grab() })
|> run("chicken-farm")
// -> Ok(ParseResult(res: "-", rem: "farm", idx: 8))
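The chaining that bind performs can be sketched as follows. This is an illustrative reimplementation under assumed type definitions (ParserState, ParseResult, and their field names are guesses based on the example output), not the library's own code.

```gleam
import gleam/string

// Hypothetical stand-ins for glindo's types; the real definitions may differ.
pub type ParserState {
  ParserState(str: String, idx: Int)
}

pub type ParseResult(a) {
  ParseResult(res: a, rem: String, idx: Int)
}

pub type Parser(a) =
  fn(ParserState) -> Result(ParseResult(a), String)

pub fn str(pattern: String) -> Parser(String) {
  fn(state: ParserState) {
    case string.starts_with(state.str, pattern) {
      True -> {
        let len = string.length(pattern)
        let rem = string.slice(state.str, len, string.length(state.str) - len)
        Ok(ParseResult(res: pattern, rem: rem, idx: state.idx + len))
      }
      False ->
        Error("Error: given string does not start with '" <> pattern <> "'")
    }
  }
}

pub fn chr_grab() -> Parser(String) {
  fn(state: ParserState) {
    case string.pop_grapheme(state.str) {
      Ok(#(first, rest)) ->
        Ok(ParseResult(res: first, rem: rest, idx: state.idx + 1))
      Error(_) -> Error("Error: expected char, found none")
    }
  }
}

// Run `parser`; on success, hand its result to `fnc` and run the parser
// that `fnc` builds on the remaining input.
pub fn bind(parser: Parser(a), fnc: fn(a) -> Parser(b)) -> Parser(b) {
  fn(state: ParserState) {
    case parser(state) {
      Ok(r) -> fnc(r.res)(ParserState(str: r.rem, idx: r.idx))
      Error(msg) -> Error(msg)
    }
  }
}

pub fn run(parser: Parser(a), input: String) -> Result(ParseResult(a), String) {
  parser(ParserState(str: input, idx: 0))
}

pub fn main() {
  let parser = bind(str("chicken"), fn(_) { chr_grab() })
  let assert Ok(ParseResult(res: "-", rem: "farm", idx: 8)) =
    run(parser, "chicken-farm")
  // The first parser fails, so the second one never runs.
  let assert Error(_) = run(parser, "duck-farm")
  Nil
}
```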
pub fn btwn(
fst: Parser(a),
mid: Parser(b),
lst: Parser(c),
) -> Parser(b)
This generic combinator takes in three parsers generic over types a, b, and c, and returns a parser that yields only the result of the middle parser, provided the first and last parsers also succeed. This parser returns a type Result(ParseResult(b), String).
Example
btwn(str("{"), str("JSON"), str("}"))
|> run("{JSON}-Value")
// -> Ok(ParseResult(res: "JSON", rem: "-Value", idx: 6))
This function only fails when one of the parameter parser functions fails, as it depends solely on the input parsers.
pub fn chc_of(parserlist: List(Parser(a))) -> Parser(a)
This combinator is designed to combine multiple parsers into one. It takes in a list of parsers of generic type Parser(a) and returns a parser that tries them in order and returns the result of the first one that succeeds. If no parser in the list succeeds, it returns an Error.
chc_of can only take in a list of parsers of the same type.
Example
chc_of([str("low"), chr("-"), str("hi")])
|> run("hi-five's for you")
// -> Ok(ParseResult(res: "hi", rem: "-five's for you", idx: 2))
chc_of([str("low"), chr("-"), str("hi")])
|> run("down low too slow")
// -> Error("Error: no suitable parser found")
pub fn chc_opt(parserlist: List(Parser(a))) -> Parser(a)
Tries each parser in turn but, unlike chc_of, picks the one that consumed the most input. If none succeed, returns an Error.
Useful for “longest-match” disambiguation when two parsers both succeed but one should win because it reads further.
Example
let p1 = str("foo")
let p2 = str("foobar")
let p = chc_opt([p1, p2])
run(p, "foobarbaz")
// -> Ok(ParseResult(res: "foobar", rem: "baz", idx: 6))
Returns a Parser(a) which on success has parsed as far as possible.
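One way to picture the longest-match rule is to run every parser against the same starting state and keep the result with the largest idx. The sketch below does that; the type definitions, the helper named longest, and the error message are assumptions made for illustration and may not match glindo's internals.

```gleam
import gleam/string

// Hypothetical stand-ins for glindo's types; the real definitions may differ.
pub type ParserState {
  ParserState(str: String, idx: Int)
}

pub type ParseResult(a) {
  ParseResult(res: a, rem: String, idx: Int)
}

pub type Parser(a) =
  fn(ParserState) -> Result(ParseResult(a), String)

pub fn str(pattern: String) -> Parser(String) {
  fn(state: ParserState) {
    case string.starts_with(state.str, pattern) {
      True -> {
        let len = string.length(pattern)
        let rem = string.slice(state.str, len, string.length(state.str) - len)
        Ok(ParseResult(res: pattern, rem: rem, idx: state.idx + len))
      }
      False ->
        Error("Error: given string does not start with '" <> pattern <> "'")
    }
  }
}

pub fn chc_opt(parserlist: List(Parser(a))) -> Parser(a) {
  fn(state: ParserState) {
    longest(parserlist, state, Error("Error: no suitable parser found"))
  }
}

// Hypothetical helper: try each parser on the SAME state and remember
// whichever successful result advanced the index the furthest.
fn longest(
  parsers: List(Parser(a)),
  state: ParserState,
  best: Result(ParseResult(a), String),
) -> Result(ParseResult(a), String) {
  case parsers {
    [] -> best
    [parser, ..rest] -> {
      let next = case best, parser(state) {
        Error(_), Ok(r) -> Ok(r)
        Ok(b), Ok(r) ->
          case r.idx > b.idx {
            True -> Ok(r)
            False -> Ok(b)
          }
        _, Error(_) -> best
      }
      longest(rest, state, next)
    }
  }
}

pub fn run(parser: Parser(a), input: String) -> Result(ParseResult(a), String) {
  parser(ParserState(str: input, idx: 0))
}

pub fn main() {
  let p = chc_opt([str("foo"), str("foobar")])
  let assert Ok(ParseResult(res: "foobar", rem: "baz", idx: 6)) =
    run(p, "foobarbaz")
  Nil
}
```

In this sketch, ties keep the earlier parser, so list order still matters when two parsers consume the same amount of input.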
pub fn chr(pattern: String) -> Parser(String)
Parses for the specified character as the first character in a string and returns a Result(ParseResult(String), String).
Example
run(chr("c"), "character")
// -> Ok(ParseResult(res: "c", rem: "haracter", idx: 1))
run(chr("x"), "dogs are cool")
// -> Error("Error: did not find 'x' at 'dogs are cool'")
pub fn chr_grab() -> Parser(String)
Parses for the first character in a string and returns a Result(ParseResult(String), String).
Example
run(chr_grab(), "character")
// -> Ok(ParseResult(res: "c", rem: "haracter", idx: 1))
run(chr_grab(), "")
// -> Error("Error: expected char, found none")
pub fn dgt(digit: Int) -> Parser(Int)
Parses the first character of a string as the given single digit and returns a Result(ParseResult(Int), String).
Example
run(dgt(7), "7 is a prime number")
// -> Ok(ParseResult(res: 7, rem: " is a prime number", idx: 1))
run(dgt(6), "There is no 6 here")
// -> Error("Error: expected '6' found 'T'")
pub fn lazy(thunk: fn() -> Parser(a)) -> Parser(a)
Defers construction of a parser until parse time, allowing you to write recursive or mutually-recursive grammars without forward declaration errors.
Example
// A very simple recursive “nesting” grammar with "()" as its base case
fn nested() -> Parser(String) {
  lazy(fn() {
    chc_of([btwn(chr("("), nested(), chr(")")), str("()")])
  })
}
run(nested(), "((()))")
// -> Ok(ParseResult(res: "()", rem: "", idx: 6))
Returns a Parser(a) that, when run, calls your thunk to get the real parser.
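The reason deferral matters is that Gleam evaluates arguments eagerly: a grammar function that calls itself while building its parser would recurse forever before any input is seen. The sketch below shows the idea with assumed type definitions and two hypothetical helpers (either, and a local btwn written for this example); it is not glindo's implementation.

```gleam
import gleam/result
import gleam/string

// Hypothetical stand-ins for glindo's types; the real definitions may differ.
pub type ParserState {
  ParserState(str: String, idx: Int)
}

pub type ParseResult(a) {
  ParseResult(res: a, rem: String, idx: Int)
}

pub type Parser(a) =
  fn(ParserState) -> Result(ParseResult(a), String)

pub fn chr(pattern: String) -> Parser(String) {
  fn(state: ParserState) {
    case string.pop_grapheme(state.str) {
      Ok(#(first, rest)) if first == pattern ->
        Ok(ParseResult(res: first, rem: rest, idx: state.idx + 1))
      _ -> Error("Error: did not find '" <> pattern <> "'")
    }
  }
}

// Keep only the middle result when all three parsers succeed in sequence.
fn btwn(fst: Parser(a), mid: Parser(b), lst: Parser(c)) -> Parser(b) {
  fn(state: ParserState) {
    use f <- result.try(fst(state))
    use m <- result.try(mid(ParserState(str: f.rem, idx: f.idx)))
    use l <- result.try(lst(ParserState(str: m.rem, idx: m.idx)))
    Ok(ParseResult(res: m.res, rem: l.rem, idx: l.idx))
  }
}

// Hypothetical two-way choice: fall back to `second` if `first` fails.
fn either(first: Parser(a), second: Parser(a)) -> Parser(a) {
  fn(state: ParserState) {
    case first(state) {
      Ok(r) -> Ok(r)
      Error(_) -> second(state)
    }
  }
}

// The thunk is only called once the returned parser receives a state.
pub fn lazy(thunk: fn() -> Parser(a)) -> Parser(a) {
  fn(state: ParserState) { thunk()(state) }
}

// nested := "(" nested ")" | "x"
// Without lazy, calling nested() would call nested() again immediately
// and never return.
pub fn nested() -> Parser(String) {
  lazy(fn() { either(btwn(chr("("), nested(), chr(")")), chr("x")) })
}

pub fn run(parser: Parser(a), input: String) -> Result(ParseResult(a), String) {
  parser(ParserState(str: input, idx: 0))
}

pub fn main() {
  let assert Ok(ParseResult(res: "x", rem: "", idx: 5)) =
    run(nested(), "((x))")
  // A missing closing paren makes the whole parse fail.
  let assert Error(_) = run(nested(), "((x)")
  Nil
}
```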
pub fn map(parser: Parser(a), fnc: fn(a) -> b) -> Parser(b)
This combinator is designed to transform parsers. It takes in a parser of type Parser(a) and a function that transforms a to b, and returns a parser of type Parser(b). This can be used to turn one parser into another, using the transformation function as a “bridge” for the computation.
Example
map(num(), fn(number) { int.to_base16(number) })
|> run("2024 was wild ngl")
// -> Ok(ParseResult(res: "7E8", rem: " was wild ngl", idx: 4))
map(chr_grab(), fn(char) { string.to_utf_codepoints(char) })
|> run("2024 was wild ngl")
// -> Ok(ParseResult(res: <codepoints of "2">, rem: "024 was wild ngl", idx: 1))
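A rough sketch of how such a transformation can work: the wrapped parser runs unchanged, and only the res field of a successful result is passed through the function, while rem and idx are left alone. The type definitions and field names here are assumptions for illustration.

```gleam
import gleam/string

// Hypothetical stand-ins for glindo's types; the real definitions may differ.
pub type ParserState {
  ParserState(str: String, idx: Int)
}

pub type ParseResult(a) {
  ParseResult(res: a, rem: String, idx: Int)
}

pub type Parser(a) =
  fn(ParserState) -> Result(ParseResult(a), String)

pub fn chr_grab() -> Parser(String) {
  fn(state: ParserState) {
    case string.pop_grapheme(state.str) {
      Ok(#(first, rest)) ->
        Ok(ParseResult(res: first, rem: rest, idx: state.idx + 1))
      Error(_) -> Error("Error: expected char, found none")
    }
  }
}

// Apply `fnc` to the parsed value only; position and remainder are untouched,
// and errors pass through unchanged.
pub fn map(parser: Parser(a), fnc: fn(a) -> b) -> Parser(b) {
  fn(state: ParserState) {
    case parser(state) {
      Ok(r) -> Ok(ParseResult(res: fnc(r.res), rem: r.rem, idx: r.idx))
      Error(msg) -> Error(msg)
    }
  }
}

pub fn run(parser: Parser(a), input: String) -> Result(ParseResult(a), String) {
  parser(ParserState(str: input, idx: 0))
}

pub fn main() {
  // Parser(String) -> Parser(Int): the length of the grabbed character.
  let parser = map(chr_grab(), string.length)
  let assert Ok(ParseResult(res: 1, rem: "bc", idx: 1)) = run(parser, "abc")
  Nil
}
```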
pub fn mny_chc(parserlist: List(Parser(a))) -> Parser(List(a))
Repeats “choice” among the given parsers zero or more times, collecting each successful result into a list. Always succeeds (even if no parser ever matches) and returns the list of all matches in order.
Example
// Parses any number of “a” or “b” in any order
let p = mny_chc([chr("a"), chr("b")])
run(p, "abbaacxyz")
// -> Ok(ParseResult(res: ["a","b","b","a","a"], rem: "cxyz", idx: 5))
Returns a Parser(List(a)) with all values parsed in sequence.
pub fn mny_of(parser: Parser(a)) -> Parser(List(a))
Repeatedly runs a parser on a string until it “fails”. mny_of always succeeds and always returns a ParseResult(List(a)).
It is generic and can take in any parser kind.
Example
mny_of(chr("a"))
|> run("aardvarks are cool")
// -> Ok(ParseResult(res: ["a", "a"], rem: "rdvarks are cool", idx: 2))
mny_of(chr("a"))
|> run("bojack is a bad horse")
// -> Ok(ParseResult(res: [], rem: "bojack is a bad horse", idx: 0))
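The "repeat until failure" behaviour can be sketched as a small recursive loop that threads the state forward and collects results. The types, the helper named collect, and the extra stop condition for parsers that succeed without consuming input are assumptions added for this illustration, not necessarily what glindo does.

```gleam
import gleam/list
import gleam/string

// Hypothetical stand-ins for glindo's types; the real definitions may differ.
pub type ParserState {
  ParserState(str: String, idx: Int)
}

pub type ParseResult(a) {
  ParseResult(res: a, rem: String, idx: Int)
}

pub type Parser(a) =
  fn(ParserState) -> Result(ParseResult(a), String)

pub fn chr(pattern: String) -> Parser(String) {
  fn(state: ParserState) {
    case string.pop_grapheme(state.str) {
      Ok(#(first, rest)) if first == pattern ->
        Ok(ParseResult(res: first, rem: rest, idx: state.idx + 1))
      _ -> Error("Error: did not find '" <> pattern <> "'")
    }
  }
}

// Always succeeds: a failure of the inner parser just ends the loop.
pub fn mny_of(parser: Parser(a)) -> Parser(List(a)) {
  fn(state: ParserState) { Ok(collect(parser, state, [])) }
}

// Hypothetical helper: accumulate results (in reverse) until the parser fails.
fn collect(
  parser: Parser(a),
  state: ParserState,
  acc: List(a),
) -> ParseResult(List(a)) {
  let done =
    ParseResult(res: list.reverse(acc), rem: state.str, idx: state.idx)
  case parser(state) {
    Error(_) -> done
    Ok(r) ->
      // Assumed safeguard: stop if nothing was consumed, to avoid looping forever.
      case r.idx > state.idx {
        True ->
          collect(parser, ParserState(str: r.rem, idx: r.idx), [r.res, ..acc])
        False -> done
      }
  }
}

pub fn run(parser: Parser(a), input: String) -> Result(ParseResult(a), String) {
  parser(ParserState(str: input, idx: 0))
}

pub fn main() {
  let assert Ok(ParseResult(res: ["a", "a"], rem: "rdvarks are cool", idx: 2)) =
    run(mny_of(chr("a")), "aardvarks are cool")
  let assert Ok(ParseResult(res: [], rem: "bojack", idx: 0)) =
    run(mny_of(chr("a")), "bojack")
  Nil
}
```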
pub fn num() -> Parser(Int)
Parses numbers at the start of a string and returns a Result(ParseResult(Int), String).
Examples
run(num(), "123abc")
// -> Ok(ParseResult(res: 123, rem: "abc", idx: 3))
run(num(), "abc123")
// -> Error("Error: no number captured")
pub fn opt_of(parser: Parser(a)) -> Parser(Option(a))
Takes in a parser of type Parser(a), runs the parser, and returns a ParseResult(Option(a)) holding the parser's result. opt_of always succeeds and cannot return an Error type.
Example
opt_of(p.num())
|> run("1500 SAT isn't bad lmao")
// -> Ok(ParseResult(res: Some(1500), rem: " SAT isn't bad lmao", idx: 4))
opt_of(p.num())
|> run("but a 600 is crazy")
// -> Ok(ParseResult(res: None, rem: "but a 600 is crazy", idx: 0))
pub fn peek_fwd(parser: Parser(a)) -> Parser(a)
Takes in a parser and looks ahead: on success it returns the parsed value without consuming any input, as a Result(ParseResult(a), String).
It is generic and can take in any Parser(a) kind.
Example
peek_fwd(str("goat cheese"))
|> run("goat cheese tastes good")
// -> Ok(ParseResult(res: "goat cheese", rem: "goat cheese tastes good", idx: 0))
peek_fwd(str("goat cheese"))
|> run("Bungee gum is somewhat rubber")
// -> Error("Error: given string does not start with 'goat cheese'")
pub fn prefix_str(pattern: String) -> Parser(String)
Parses a string for a given substring at the start of the string and returns a Result(ParseResult(String), String).
Example
run(prefix_str("race"), "racecars are cool")
// -> Ok(ParseResult(res: "race", rem: "cars are cool", idx: 4))
run(prefix_str("spoon"), "grasshoppers suck")
// -> Error("Error: given string does not start with 'spoon'")
pub fn print_array_string(list: List(String)) -> Nil
Helper function for Parser(List(String)). Prints out a list of strings.
This function is intended to be used only alongside string parsers.
Example
print_array_string(["random", "list", "of", "strings"])
// -> random, list, of, strings
pub fn sat_pred(
parser: Parser(a),
fnc: fn(a) -> Bool,
) -> Parser(a)
This combinator, generic over type a, takes a parser of type Parser(a) and a boolean function (a predicate). The parser result is returned only for values that satisfy the predicate; otherwise the parser fails.
Example
seq_of([chr("b"), chr("o"), chr("b")])
|> map(fn(list_chr){ string.concat(list_chr) })
|> sat_pred(fn(chr) { chr == "bob" })
|> run("bobbit is nice")
// -> Ok(ParseResult(res: "bob", rem: "bit is nice", idx: 3))
sat_pred(num(), fn(num) { num < 5 })
|> run("6 cold cans of soda")
// -> Error("Error: unsatisfied predicate")
pub fn sep_by(item: Parser(a), sep: Parser(b)) -> Parser(List(a))
This combinator takes in two parsers generic over types a and b. The item parser Parser(a) parses a string for its desired result, and the separator parser Parser(b), if successful, is skipped. This parser always succeeds and always returns an Ok(ParseResult(List(a))).
Example
sat_pred(chr_grab(), fn(chr) { chr != "," })
|> sep_by(str(","))
|> run("eggs, bacon, and more")
// -> Ok(ParseResult(res: ["eggs", " bacon", " and more"], rem: "", idx: 21))
sat_pred(chr_grab(), fn(chr) { chr != "," })
|> sep_by(str(","))
|> run("No separator yet")
// -> Ok(ParseResult(res: ["No separator yet"], rem: "", idx: 16))
pub fn seq_of(parserlist: List(Parser(a))) -> Parser(List(a))
This combinator takes a list of parsers of the same type and returns a parser that runs each of them in sequence, collecting the results. If one parser in the sequence fails, the entire parser fails.
The resulting parser returns a Result(ParseResult(List(a)), String).
Example
seq_of([str("hi"), chr("-"), str("five")])
|> run("hi-five's for the good job")
// -> Ok(ParseResult(res: ["hi", "-", "five"], rem: "'s for the good job", idx: 7))
seq_of([str("nuh-uh"), chr(" "), str("bro")])
|> run("What do you mean by that?")
// -> Error("Error: could not match parser sequence")
pub fn skip(parser1: Parser(a), parser2: Parser(b)) -> Parser(b)
This combinator takes in two parsers generic over types a and b respectively.
It returns a parser that runs the first parser. If successful, the result of the first parser is ignored and the second parser is run. The returned parser has the return type Result(ParseResult(b), String).
Example
skip(str("yo"), str("gurt"))
|> run("yogurt, what's up dude")
// -> Ok(ParseResult(res: "gurt", rem: ", what's up dude", idx: 6))
skip(str("yo"), str("gurt"))
|> run("hey gurt, what's up dude")
// -> Error("Error: given string does not start with 'yo'")
pub fn str(pattern: String) -> Parser(String)
Parses a string for a given substring at the start of the string and returns a Result(ParseResult(String), String).
Example
run(str("race"), "racecars are cool")
// -> Ok(ParseResult(res: "race", rem: "cars are cool", idx: 4))
run(str("spoon"), "grasshoppers suck")
// -> Error("Error: given string does not start with 'spoon'")
pub fn string_to_int(str: String) -> Int
Helper function for num() and dgt(int). Converts a string to an Int.
This function is intended to be used only alongside num parsers.
Example
string_to_int("247")
// -> 247
pub fn tok(parser: Parser(a)) -> Parser(a)
Parses a token but skips any surrounding whitespace. It will:
- Skip leading whitespace (wht_space)
- Run your parser
- Skip trailing whitespace (wht_space)
and then return the parser’s result.
Example
let p = tok(str("let"))
|> bind(fn(_) { chr("=") })
run(p, " let =x")
// -> Ok(ParseResult(res: "=", rem: "x", idx: 6))
Returns a Parser(a) that parses a with optional padding.
pub fn wht_space() -> Parser(String)
Parses a single white-space character (" " or "\t") at the beginning of a string and returns a Result(ParseResult(String), String).
When there is no white-space character, wht_space simply returns an Ok(ParseResult(String)) with no updates to the original ParserState variables.
Examples
run(wht_space(), "\tstarting white-space")
// -> Ok(ParseResult(res: "\t", rem: "starting white-space", idx: 1))
run(wht_space(), "no white-space")
// -> Ok(ParseResult(res: "", rem: "no white-space", idx: 0))