dataprep/rules
Types
Why a checked regex constructor refused to build a validator.
Returned by matches_string_checked and
matches_fully_string_checked instead of panicking, so the
caller controls how a malformed pattern surfaces. Mirrors the
information from regexp.CompileError without forcing the
caller to depend on gleam/regexp directly.
pub type RegexRuleError {
InvalidPattern(reason: String, byte_index: Int)
}
Constructors
-
InvalidPattern(reason: String, byte_index: Int)
Values
pub fn equals(
expected expected: a,
error error: e,
) -> fn(a) -> validated.Validated(a, e)
Fails if the value does not equal the expected value.
pub fn length_between(
minimum min: Int,
maximum max: Int,
error error: e,
) -> fn(String) -> validated.Validated(String, e)
Fails if the string length is outside [min, max].
Edge case — min > max: the resulting validator is vacuously
unsatisfiable (no string length can be both >= min and <= max),
so every input fails with error. This is a programmer error
rather than a runtime condition; the library does not raise it
because rule constructors are pure and never panic. Construct the
rule yourself with a sane range, or guard the bounds at the call
site (e.g., case min <= max { True -> ...; False -> ... }) when
min/max come from configuration or other dynamic input.
pub fn matches(
pattern re: regexp.Regexp,
error error: e,
) -> fn(String) -> validated.Validated(String, e)
Fails if the regex does not find a match anywhere in the string.
This is regexp.check semantics — a partial / substring match
is enough to pass.
Anchoring is the caller’s responsibility. A pattern like
[0-9]+ accepts "abc123def" because the digit run matches
somewhere in the input. To require the entire string to match,
either anchor the pattern explicitly with ^...$, or reach for
matches_fully which enforces full-string semantics regardless
of whether the pattern is anchored. The validation use case
almost always wants matches_fully; matches is exposed for
the cases that genuinely need substring search.
Takes a pre-compiled Regexp so a malformed pattern surfaces as
a regexp.from_string error at the call site instead of
crashing inside the validator.
Example: import gleam/regexp import dataprep/rules
let assert Ok(re) = regexp.from_string(“^[a-z]+$”) let check = rules.matches(pattern: re, error: InvalidFormat)
For literal patterns where propagating a compile error to the
caller is not useful, see matches_string.
pub fn matches_fully(
pattern re: regexp.Regexp,
error error: e,
) -> fn(String) -> validated.Validated(String, e)
Fails if the regex does not match the entire input string. Unlike
matches, a partial / substring hit is not enough — the
matched substring must equal the whole input. Equivalent to
Python’s re.fullmatch semantics.
Anchoring the pattern (^...$) is therefore not required: the
validator does the equivalent check itself by comparing the first
match’s content against the input. Patterns that already include
^ / $ continue to work; the anchors just become redundant.
Example: import gleam/regexp import dataprep/rules
let assert Ok(re) = regexp.from_string(“[0-9]+”) let check = rules.matches_fully(pattern: re, error: NotANumber)
check(“123”) // Valid(“123”) check(“abc123def”) // Invalid([NotANumber]) – substring match rejected
pub fn matches_fully_string(
pattern pattern: String,
error error: e,
) -> fn(String) -> validated.Validated(String, e)
Fails if the regex does not match the entire input string. Compiles the pattern internally; an invalid pattern panics at construction time with the underlying compile error.
Equivalent to matches_fully but with a literal pattern. Use
this for the validation use case — "[0-9]+" will reject
"abc123def" rather than accepting it on a substring hit, so
the API behaves the way readers usually assume.
Example: import dataprep/rules
let check = rules.matches_fully_string( pattern: “[a-z0-9-]+”, error: InvalidFormat, )
pub fn matches_fully_string_checked(
pattern pattern: String,
error error: e,
) -> Result(
fn(String) -> validated.Validated(String, e),
RegexRuleError,
)
Like matches_fully_string, but returns the compile failure as
a Result instead of panicking. Use this for the validation use
case when the pattern comes from configuration or any other
dynamic source where a compile failure should be handled rather
than crashing.
On success the validator behaves identically to
matches_fully(pattern: re, error:) over the compiled pattern.
Example: import dataprep/rules
case rules.matches_fully_string_checked( pattern: pattern_from_admin, error: BadFormat, ) { Ok(check) -> … Error(rules.InvalidPattern(reason: r, ..)) -> … }
pub fn matches_string(
pattern pattern: String,
error error: e,
) -> fn(String) -> validated.Validated(String, e)
Fails if the string does not match the given regular expression pattern. Compiles the pattern internally; an invalid pattern panics at construction time with the underlying compile error.
Same anchoring footgun as matches: a pattern like
"[0-9]+" accepts "abc123def" because the digit run matches
somewhere. Anchor explicitly with ^...$ or use
matches_fully_string for the validation case.
Use this when the pattern is a literal known at the call site
and a compile failure would be a programmer error there is no
meaningful recovery from. For dynamically-supplied patterns,
use matches together with regexp.from_string so the
Result is visible.
Example: import dataprep/rules
let check = rules.matches_string( pattern: “^[a-z0-9-]+$”, error: InvalidFormat, )
pub fn matches_string_checked(
pattern pattern: String,
error error: e,
) -> Result(
fn(String) -> validated.Validated(String, e),
RegexRuleError,
)
Like matches_string, but returns the compile failure as a
Result instead of panicking. Use this when the pattern is not
a hard-coded literal — config files, admin-supplied input, or
anywhere a malformed pattern is a recoverable runtime condition
rather than a programmer error.
On success the validator behaves identically to
matches(pattern: re, error:) over the compiled pattern, i.e.
uses regexp.check semantics (substring hit is enough).
Example: import dataprep/rules import gleam/result
case rules.matches_string_checked( pattern: pattern_from_config, error: InvalidFormat, ) { Ok(check) -> handle_request(check) Error(rules.InvalidPattern(reason: r, ..)) -> reject_config(r) }
pub fn max_float(
maximum max: Float,
error error: e,
) -> fn(Float) -> validated.Validated(Float, e)
Fails if the float exceeds max.
pub fn max_int(
maximum max: Int,
error error: e,
) -> fn(Int) -> validated.Validated(Int, e)
Fails if the int exceeds max.
pub fn max_length(
maximum max: Int,
error error: e,
) -> fn(String) -> validated.Validated(String, e)
Fails if the string length exceeds max.
pub fn min_float(
minimum min: Float,
error error: e,
) -> fn(Float) -> validated.Validated(Float, e)
Fails if the float is less than min.
pub fn min_int(
minimum min: Int,
error error: e,
) -> fn(Int) -> validated.Validated(Int, e)
Fails if the int is less than min.
pub fn min_length(
minimum min: Int,
error error: e,
) -> fn(String) -> validated.Validated(String, e)
Fails if the string length is less than min.
pub fn non_negative_float(
error: e,
) -> fn(Float) -> validated.Validated(Float, e)
Fails if the float is negative (less than 0.0).
pub fn non_negative_int(
error: e,
) -> fn(Int) -> validated.Validated(Int, e)
Fails if the int is negative (less than 0).
pub fn not_blank(
error: e,
) -> fn(String) -> validated.Validated(String, e)
Fails if the string is empty or contains only whitespace.
Unlike not_empty, this rejects " " and "\t\n".
The value is NOT trimmed – it is returned unchanged on success.
pub fn not_empty(
error: e,
) -> fn(String) -> validated.Validated(String, e)
Fails if the string is exactly “”. Whitespace-only strings like “ “ pass this check. To reject whitespace-only input, compose with prep.trim() first:
raw |> prep.trim() |> rules.not_empty(MyError)
pub fn one_of(
allowed allowed: List(a),
error error: e,
) -> fn(a) -> validated.Validated(a, e)
Fails if the value is not in the allowed list.
Edge case — empty allowed list: a set-membership check against
the empty set has no inhabitants, so every input fails with
error. This is a programmer error rather than a runtime
condition; the library does not raise it because rule constructors
are pure and never panic. Either construct the rule with a
non-empty allowlist, or guard at the call site (e.g.,
case allowed { [] -> ...; [_, ..] -> rules.one_of(allowed, e) })
when the allowlist comes from configuration or other dynamic
input.