glindo/parsers

Primitives and Combinators

Parser functions are functions that take their arguments (a pattern, another parser, and so on) and return a function that parses the string held in a parser state.

All parsers are curried functions that return a function that takes in the input state. For convenience, the run() function can be used to build the state and run the parser.

Unless you are composing parsers or otherwise need to handle the state yourself, it is better to pass the curried function to run().

Parsers can either be run by themselves with a constructed ParserState(a), or called using the run() function, which builds the ParserState(a) for you. run() takes the returned parser function and the string to parse, and has the signature run(Parser(a), String).
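
The relationship between a curried parser, its state, and a run-style helper can be sketched as follows. This is a minimal model and not the library's implementation: SketchState, its fields, first_char and run_sketch are hypothetical names introduced for illustration, and the real ParserState(a) may look quite different.

```gleam
import gleam/string

// Hypothetical stand-ins: the library's real ParserState and ParseResult
// definitions may differ from these.
pub type SketchState {
  SketchState(input: String, idx: Int)
}

pub type ParseResult(a) {
  ParseResult(res: a, rem: String, idx: Int)
}

pub type Parser(a) =
  fn(SketchState) -> Result(ParseResult(a), String)

// Calling first_char() does no parsing; it returns the function that will
// parse once it is handed a state.
pub fn first_char() -> Parser(String) {
  fn(state: SketchState) {
    case string.pop_grapheme(state.input) {
      Ok(#(head, rest)) ->
        Ok(ParseResult(res: head, rem: rest, idx: state.idx + 1))
      Error(_) -> Error("Error: expected char, found none")
    }
  }
}

// Builds the initial state from a plain string and hands it to the parser.
pub fn run_sketch(
  parser: Parser(a),
  input: String,
) -> Result(ParseResult(a), String) {
  parser(SketchState(input: input, idx: 0))
}

pub fn main() {
  let parser = first_char()
  // Running with a hand-built state and running through the helper agree.
  let by_hand = parser(SketchState(input: "abc", idx: 0))
  let via_helper = run_sketch(parser, "abc")
  let assert True = by_hand == via_helper
  let assert Ok(ParseResult(res: "a", rem: "bc", idx: 1)) = via_helper
}
```

Keeping state construction inside one helper means callers only ever deal with a parser and a string, which is the convenience the run() function is described as providing.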

Functions

pub fn bind(
  parser: Parser(a),
  fnc: fn(a) -> Parser(b),
) -> Parser(b)

This generic parser combinator takes a parser and, on a successful run, feeds the result of that parser into a function that takes an a and uses it to construct a parser of type Parser(b).
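
To illustrate why feeding the first result into a function is useful, here is a small self-contained sketch in which the first parser's value decides what the second parser does. It assumes a simplified Parser type that takes a plain String; bind_sketch, digit and take are hypothetical helpers written for this example and are not the library's own code.

```gleam
import gleam/int
import gleam/string

pub type ParseResult(a) {
  ParseResult(res: a, rem: String, idx: Int)
}

// Assumption: a parser here takes a plain String rather than a parser state.
pub type Parser(a) =
  fn(String) -> Result(ParseResult(a), String)

// Runs `parser`, then builds the next parser from its result and runs that
// on whatever input is left.
pub fn bind_sketch(parser: Parser(a), next: fn(a) -> Parser(b)) -> Parser(b) {
  fn(input: String) {
    case parser(input) {
      Error(reason) -> Error(reason)
      Ok(ParseResult(res: value, rem: rest, idx: consumed)) -> {
        let following = next(value)
        case following(rest) {
          Ok(ParseResult(res: out, rem: left, idx: more)) ->
            Ok(ParseResult(res: out, rem: left, idx: consumed + more))
          Error(reason) -> Error(reason)
        }
      }
    }
  }
}

// Parses one leading decimal digit.
fn digit() -> Parser(Int) {
  fn(input: String) {
    case string.pop_grapheme(input) {
      Ok(#(head, rest)) ->
        case int.parse(head) {
          Ok(n) -> Ok(ParseResult(res: n, rem: rest, idx: 1))
          Error(_) -> Error("Error: expected a digit")
        }
      Error(_) -> Error("Error: expected a digit")
    }
  }
}

// Takes exactly `count` characters from the front of the input.
fn take(count: Int) -> Parser(String) {
  fn(input: String) {
    let size = string.length(input)
    case size >= count {
      True ->
        Ok(ParseResult(
          res: string.slice(input, 0, count),
          rem: string.slice(input, count, size - count),
          idx: count,
        ))
      False -> Error("Error: input too short")
    }
  }
}

pub fn main() {
  // The digit decides how many characters the second parser reads.
  let parser = bind_sketch(digit(), take)
  let assert Ok(ParseResult(res: "abc", rem: "de", idx: 4)) = parser("3abcde")
  let assert Error(_) = parser("9abc")
}
```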

Example

bind(str("chicken"), fn(_) { chr_grab() })
|> run("chicken-farm")
// -> Ok(ParseResult(res: "-", rem: "farm", idx: 8))
pub fn btwn(
  fst: Parser(a),
  mid: Parser(b),
  lst: Parser(c),
) -> Parser(b)

This generic combinator takes in three parsers generic over types a, b, and c, and returns a parser that yields the result of the middle parser once the first, middle, and last parsers have all run successfully in that order. The resulting parser returns a Result(ParseResult(b), String).

Example

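// The middle parser collects every character that is not a brace
let word =
  mny_of(sat_pred(chr_grab(), fn(chr) { chr != "{" && chr != "}" }))
  |> map(fn(chrs) { string.concat(chrs) })
btwn(str("{"), word, str("}"))
|> run("{JSON}-Value")
// -> Ok(ParseResult(res: "JSON", rem: "-Value", idx: 6))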

This function fails only when one of the parser arguments fails, since it depends solely on the parsers passed in.

pub fn chc_of(parserlist: List(Parser(a))) -> Parser(a)

This combinator combines multiple parsers into one. It takes in a list of parsers of generic type Parser(a) and returns a parser that tries them in order, using the result of the first one that succeeds. If no parser in the list succeeds, it returns an Error.

chc_of can only take in a list of parsers of the same type.

Example

chc_of([str("low"), chr("-"), str("hi")])
|> run("hi-five's for you")
// -> Ok(ParseResult(res: "hi", rem: "-five's for you", idx: 2))
chc_of([str("low"), chr("-"), str("hi")])
|> run("down low too slow")
// -> Error("Error: no suitable parser found")
pub fn chc_opt(parserlist: List(Parser(a))) -> Parser(a)

Tries each parser in turn but, unlike chc_of, picks the one that consumed the most input. If none succeed, returns an Error.

Useful for “longest‐match” disambiguation when two parsers both succeed but one should win because it reads further.

Example

let p1 = str("foo")
let p2 = str("foobar")
let p = chc_opt([p1, p2])

run(p, "foobarbaz")
// -> Ok(ParseResult(res: "foobar", rem: "baz", idx: 6))

Returns a Parser(a) which on success has parsed as far as possible.
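
One way the longest-match rule could be implemented is sketched below. It assumes a simplified Parser type over plain strings, and longest_sketch and literal are hypothetical names for this example only; the library's chc_opt may be written differently.

```gleam
import gleam/list
import gleam/string

pub type ParseResult(a) {
  ParseResult(res: a, rem: String, idx: Int)
}

// Assumption: a parser here takes a plain String rather than a parser state.
pub type Parser(a) =
  fn(String) -> Result(ParseResult(a), String)

// Matches `pattern` at the start of the input.
fn literal(pattern: String) -> Parser(String) {
  fn(input: String) {
    case string.starts_with(input, pattern) {
      True -> {
        let size = string.length(pattern)
        let rest = string.slice(input, size, string.length(input) - size)
        Ok(ParseResult(res: pattern, rem: rest, idx: size))
      }
      False -> Error("Error: given string does not start with '" <> pattern <> "'")
    }
  }
}

// Runs every parser on the same input and keeps the success with the
// largest idx; ties go to the parser that appears first in the list.
pub fn longest_sketch(parsers: List(Parser(a))) -> Parser(a) {
  fn(input: String) {
    list.fold(parsers, Error("Error: no suitable parser found"), fn(best, parser) {
      let attempt = parser(input)
      case best, attempt {
        // The first success seen so far.
        Error(_), Ok(_) -> attempt
        // A later success only wins if it read further.
        Ok(ParseResult(idx: best_idx, ..)), Ok(ParseResult(idx: new_idx, ..))
          if new_idx > best_idx
        -> attempt
        _, _ -> best
      }
    })
  }
}

pub fn main() {
  let parser = longest_sketch([literal("foo"), literal("foobar")])
  let assert Ok(ParseResult(res: "foobar", rem: "baz", idx: 6)) =
    parser("foobarbaz")
  let assert Ok(ParseResult(res: "foo", rem: "d", idx: 3)) = parser("food")
  let assert Error(_) = parser("bar")
}
```

Unlike a first-match choice, every alternative is tried even after one succeeds, so the order of the list only matters for breaking ties.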

pub fn chr(pattern: String) -> Parser(String)

Parses for the specified character as the first character in a string and returns a Result(ParseResult(String), String).

Example

run(chr("c"), "character")
// -> Ok(ParseResult(res: "c", rem: "haracter", idx: 1))
run(chr("x"), "dogs are cool")
// -> Error("Error: did not find 'x' at 'dogs are cool")
pub fn chr_grab() -> Parser(String)

Parses for the first character in a string and returns a Result(ParseResult(String), String).

Example

run(chr_grab(), "character")
// -> Ok(ParseResult(res: "c", rem: "haracter", idx: 1))
run(chr_grab(), "")
// -> Error("Error: expected char, found none")
pub fn dgt(digit: Int) -> Parser(Int)

Parses the given single digit as the first character of a string and returns a Result(ParseResult(Int), String).

Example

run(dgt(7), "7 is a prime number")
// -> Ok(ParseResult(res: 7, rem: " is a prime number", idx: 1))
run(dgt(6), "There is no 6 here")
// -> Error("Error: expected '6' found 'T'")
pub fn lazy(thunk: fn() -> Parser(a)) -> Parser(a)

Defers construction of a parser until parse time, allowing you to write recursive or mutually-recursive grammars without forward declaration errors.

Example

// A very simple recursive “nesting” grammar: either a parenthesised
// nesting or the innermost "()"
fn nested() -> Parser(String) {
  lazy(fn() {
    chc_of([btwn(chr("("), nested(), chr(")")), str("()")])
  })
}

run(nested(), "((()))")
// -> Ok(ParseResult(res: "()", rem: "", idx: 6))

Returns a Parser(a) that, when run, calls your thunk to get the real parser.
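
The reason deferral matters can be seen in a small self-contained model. Gleam evaluates arguments eagerly, so a grammar function that passes its own call straight to a combinator would recurse forever while the parser is being built. The sketch below assumes a simplified Parser type over plain strings; lazy_sketch, wrapped, or_zero and nesting are hypothetical names for this example and do not come from the library.

```gleam
pub type ParseResult(a) {
  ParseResult(res: a, rem: String, idx: Int)
}

// Assumption: a parser here takes a plain String rather than a parser state.
pub type Parser(a) =
  fn(String) -> Result(ParseResult(a), String)

// Calls the thunk only when input arrives, never while the grammar is built.
pub fn lazy_sketch(thunk: fn() -> Parser(a)) -> Parser(a) {
  fn(input: String) {
    let parser = thunk()
    parser(input)
  }
}

// Parses "(" inner ")" and reports one more level of nesting than `inner`.
fn wrapped(inner: Parser(Int)) -> Parser(Int) {
  fn(input: String) {
    case input {
      "(" <> rest ->
        case inner(rest) {
          Ok(ParseResult(res: depth, rem: ")" <> tail, idx: consumed)) ->
            Ok(ParseResult(res: depth + 1, rem: tail, idx: consumed + 2))
          _ -> Error("Error: unbalanced parentheses")
        }
      _ -> Error("Error: expected '('")
    }
  }
}

// Falls back to depth 0 without consuming anything when `parser` fails.
fn or_zero(parser: Parser(Int)) -> Parser(Int) {
  fn(input: String) {
    case parser(input) {
      Ok(found) -> Ok(found)
      Error(_) -> Ok(ParseResult(res: 0, rem: input, idx: 0))
    }
  }
}

// nesting := "(" nesting ")" | ""
// Writing wrapped(nesting()) here would call nesting() again immediately
// and never return; lazy_sketch(nesting) postpones that call to parse time.
pub fn nesting() -> Parser(Int) {
  or_zero(wrapped(lazy_sketch(nesting)))
}

pub fn main() {
  let parser = nesting()
  let assert Ok(ParseResult(res: 2, rem: "", idx: 4)) = parser("(())")
  let assert Ok(ParseResult(res: 0, rem: "abc", idx: 0)) = parser("abc")
}
```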

pub fn map(parser: Parser(a), fnc: fn(a) -> b) -> Parser(b)

This combinator transforms parsers. It takes in a parser of type Parser(a) and a function that transforms a to b, and returns a parser of type Parser(b). This can be used to turn one parser into another, with the transformation function acting as a “bridge” for the computation.

Example

map(num(), fn(number) { int.to_base16(number) })
|> run("2024 was wild ngl")
// -> Ok(ParseResult(res: "7E8", rem: " was wild ngl", idx: 4))
map(chr_grab(), fn(char) { string.to_utf_codepoints(char) })
|> run("2024 was wild ngl")
// -> Ok(ParseResult(res: [codepoint for "2"], rem: "024 was wild ngl", idx: 1))
pub fn mny_chc(parserlist: List(Parser(a))) -> Parser(List(a))

Repeats “choice” among the given parsers zero or more times, collecting each successful result into a list. Always succeeds (even if no parser ever matches) and returns the list of all matches in order.

Example

// Parses any number of “a” or “b” in any order
let p = mny_chc([chr("a"), chr("b")])

run(p, "abbaacxyz")
// -> Ok(ParseResult(res: ["a","b","b","a","a"], rem: "cxyz", idx: 5))

Returns a Parser(List(a)) with all values parsed in sequence.

pub fn mny_of(parser: Parser(a)) -> Parser(List(a))

Repeatedly runs a parser on a string until it fails. mny_of always succeeds and always returns a ParseResult(List(a)).

It is generic and can take in any parser kind.

Example

mny_of(chr("a"))
|> run("aadvarks are cool")
// -> Ok(ParseResult(res: ["a", "a"], rem: "dvarks are cool", idx: 2))
mny_of(chr("a"))
|> run("bojack is a bad horse")
// -> Ok(ParseResult(res: [], rem: "bojack is a bad horse", idx: 0))
pub fn num() -> Parser(Int)

Parses numbers at the start of a string and returns a Result(ParseResult(Int), String).

Examples

run(num(), "123abc")
// -> Ok(ParseResult(res: 123, rem: "abc", idx: 3))
run(num(), "abc123")
// -> Error("Error: no number captured")
pub fn opt_of(parser: Parser(a)) -> Parser(Option(a))

Takes in a parser of type Parser(a), runs the parser, and returns a ParseResult(Option(a)) holding Some(value) on a match and None otherwise. opt_of always succeeds and cannot return an Error.

Example

opt_of(num())
|> run("1500 SAT isn't bad lmao")
// -> Ok(ParseResult(res: Some(1500), rem: " SAT isn't bad lmao", idx: 4))
opt_of(num())
|> run("but a 600 is crazy")
// -> Ok(ParseResult(res: None, rem: "but a 600 is crazy", idx: 0))
pub fn peek_fwd(parser: Parser(a)) -> Parser(a)

Takes in a parser and runs it as a lookahead: on success it returns the parser's result without consuming any input, so the remaining string and index are left unchanged. Returns a Result(ParseResult(a), String).

It is generic and can take in any Parser(a) kind.

Example

peek_fwd(str("goat cheese"))
|> run("goat cheese tastes good")
// -> Ok(ParseResult("goat cheese", "goat cheese tastes good", 0))
peek_fwd(str("goat cheese"))
|> run("Bungee gum is somewhat rubber")
// -> Error("Error: given string does not start with 'goat cheese'")
pub fn prefix_str(pattern: String) -> Parser(String)

Parses a string for a given substring at the start of the string and returns a Result(ParseResult(String), String).

Example

run(prefix_str("race"), "racecars are cool")
// -> Ok(ParseResult(res: "race", rem: "cars are cool", idx: 4))
run(prefix_str("spoon"), "grasshoppers suck")
// -> Error("Error: given string does not start with 'spoon'")
pub fn print_array_string(list: List(String)) -> Nil

Parser(List(String)) helper function

Prints out a list of strings.

This function is intended to be used alongside string parsers and string parsers only.

Example

print_array_string(["random", "list", "of", "strings"])
// -> random, list, of, strings
pub fn run(
  fnc: Parser(a),
  str: String,
) -> Result(ParseResult(a), String)
pub fn sat_pred(
  parser: Parser(a),
  fnc: fn(a) -> Bool,
) -> Parser(a)

This generic combinator over type a takes a parser of type Parser(a) and a boolean function. The parser result will only be returned for values that satisfy the predicate.

Example

seq_of([chr("b"), chr("o"), chr("b")])
|> map(fn(list_chr){ string.concat(list_chr) })
|> sat_pred(fn(chr) { chr == "bob" })
|> run("bobbit is nice")
// -> Ok(ParseResult(res: "bob", rem: "bit is nice", idx: 3))
sat_pred(num(), fn(num) { num < 5 })
|> run("6 cold cans of soda")
// -> Error("Error: unsatisfied predicate") 
pub fn sep_by(item: Parser(a), sep: Parser(b)) -> Parser(List(a))

This combinator takes in two parsers generic over types a and b. Parser(a) parses each item, and Parser(b) parses the separator between items; when the separator matches, its result is discarded. The item results are collected into a list. This parser always succeeds and always returns an Ok(ParseResult(List(a))).

This is generic over a and b.

Example

// Each item is every character up to the next comma
let item =
  mny_of(sat_pred(chr_grab(), fn(chr) { chr != "," }))
  |> map(fn(chrs) { string.concat(chrs) })
sep_by(item, str(","))
|> run("eggs, bacon, and more")
// -> Ok(ParseResult(res: ["eggs", " bacon", " and more"], rem: "", idx: 21))
sep_by(item, str(","))
|> run("No separator yet")
// -> Ok(ParseResult(res: ["No separator yet"], rem: "", idx: 16))
pub fn seq_of(parserlist: List(Parser(a))) -> Parser(List(a))

This combinator takes a list of parsers of the same type and returns a parser that runs each of them in sequence, collecting their results into a list. If one parser in the sequence fails, the entire parser fails. The resulting parser returns a Result(ParseResult(List(a)), String).

Example

seq_of([str("hi"), chr("-"), str("five")])
|> run("hi-five's for the good job")
// -> Ok(ParseResult(res: ["hi", "-", "five"], rem: "'s for the good job", idx: 7))
seq_of([str("nuh-uh"), chr(" "), str("bro")])
|> run("What do you mean by that?")
// -> Error("Error: could not match parser sequence")
pub fn skip(parser1: Parser(a), parser2: Parser(b)) -> Parser(b)

This combinator takes in two parsers generic over types a and b respectively. It returns a parser that runs the first parser; if that succeeds, its result is ignored and the second parser is run. The resulting parser returns a Result(ParseResult(b), String).

Example

skip(str("yo"), str("gurt"))
|> run("yogurt, what's up dude")
// -> Ok(ParseResult(res: "gurt", rem: ", what's up dude", idx: 6))
skip(str("yo"), str("gurt"))
|> run("hey gurt, what's up dude")
// -> Error("Error: given string does not start with 'yo'")
pub fn str(pattern: String) -> Parser(String)

Parses a string for a given substring at the start of the string and returns a Result(ParseResult(String), String).

Example

run(str("race"), "racecars are cool")
// -> Ok(ParseResult(res: "race", rem: "cars are cool", idx: 4))
run(str("spoon"), "grasshoppers suck")
// -> Error("Error: given string does not start with 'spoon'")
pub fn string_to_int(str: String) -> Int

num() & dgt(int) helper function

Converts a string to an Int.

This function is intended to be used alongside num parsers and num parsers only.

Example

string_to_int("247")
// -> 247
pub fn tok(parser: Parser(a)) -> Parser(a)

Parses a token but skips any surrounding whitespace. It will:

  1. Skip leading whitespace (wht_space)
  2. Run your parser
  3. Skip trailing whitespace (wht_space)
    and then return the parser’s result.

Example

let p = tok(str("let"))
|> bind(fn(_) { chr("=") })

run(p, "   let   =x")
// -> Ok(ParseResult(res: "=", rem: "x", idx: 10))

Returns a Parser(a) that parses a with optional padding.

pub fn wht_space() -> Parser(String)

Parses a single white-space character (“ ” or “\t”) at the beginning of a string and returns a Result(ParseResult(String), String).

When there is no white-space character, wht_space simply returns an Ok(ParseResult(String)) with no updates to the original ParserState variables.

Examples

run(wht_space(), "\tstarting white-space")
// -> Ok(ParseResult(res: "\t", rem: "starting white-space", idx: 1))
run(wht_space(), "no white-space")
// -> Ok(ParseResult(res: "", rem: "no white-space", idx: 0))