yabase/intid
Integer helpers for short URL-safe identifiers.
The byte-oriented codecs in yabase/facade are the right tool when
the input is opaque bytes (hashes, public keys, raw payloads). For
the very common short-ID case — DB autoincrement ids, sequence
numbers, hash truncations — callers want Int -> compact string
directly. Without these helpers every project re-implements the
same Int -> big-endian bytes -> trim-leading-zero shim.
encode_int_* emits canonical form: no leading zero characters
beyond what the value itself requires
(encode_int_base58(0) == Ok("1"), the alphabet’s zero character;
encode_int_base58(58) == Ok("21"), no leading "1").
decode_int_* is tolerant of leading zero characters
(decode_int_base10("0042") and decode_int_base10("42") both
return Ok(42)), so input from external sources that
zero-pads is accepted without ceremony. The exception is
decode_int_base58, which rejects non-canonical leading "1"
characters with Error(NonCanonical); see its entry below.
decode_int_* rejects the empty string with
Error(InvalidLength(0)) rather than treating it as zero. Callers
can therefore distinguish “no ID was supplied” from “the ID is
zero” — important for URL routing, form parsing, and database
lookups. The byte-oriented decoders in yabase/facade retain the
Ok(<<>>) round-trip behavior for empty input.
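The two decode rules above can be checked with a small program. This is a minimal sketch that assumes the yabase package is installed and that decode_int_base10 follows the family-wide contract described here; it prints the results rather than asserting specific values.

```gleam
import gleam/io
import gleam/string
import yabase/intid

pub fn main() {
  // Zero-padded and unpadded decimal input should decode to the same Int.
  let padded = intid.decode_int_base10("0042")
  let plain = intid.decode_int_base10("42")
  io.println(string.inspect(padded == plain))

  // The empty string is documented to be an error, not zero.
  io.println(string.inspect(intid.decode_int_base10("")))
}
```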
Negative inputs are rejected
Every encode_int_* function in this module returns
Result(String, CodecError) and rejects negative inputs with
Error(NegativeValue(value)). The integer codecs only define a
canonical representation for non-negative values, so silently
dropping the sign would break the decode(encode(n)) == n
round-trip whenever n < 0 (closed #84, reopened as #100).
If your caller path can produce negatives (offsets that
subtracted past zero, POSIX timestamps from before 1970,
deliberate -1 sentinels), map them to a sign-preserving wire
format like zigzag or to a domain-specific error at the
boundary:
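// NegativeValue lives in yabase/core/error (import yabase/core/error);
// NegativeId and CodecFailed are constructors of your own error type.
case intid.encode_int_base32_crockford(n) {
  Ok(s) -> Ok(s)
  Error(error.NegativeValue(_)) -> Error(NegativeId(n))
  Error(other) -> Error(CodecFailed(other))
}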
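For the zigzag option mentioned above, one possible mapping is sketched below. zigzag_encode and zigzag_decode are hypothetical helpers, not part of yabase/intid. They use plain arithmetic instead of bit shifts so they work for Ints of any size.

```gleam
// Hypothetical helpers; yabase/intid does not export these.

/// Map a signed Int onto the non-negative integers:
/// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4, ...
pub fn zigzag_encode(n: Int) -> Int {
  case n >= 0 {
    True -> 2 * n
    False -> -2 * n - 1
  }
}

/// Inverse of zigzag_encode for any non-negative input.
pub fn zigzag_decode(z: Int) -> Int {
  case z % 2 == 0 {
    True -> z / 2
    False -> 0 - { z + 1 } / 2
  }
}
```

A caller would pass zigzag_encode(n) to an encode_int_* function and apply zigzag_decode to the decoded Int. The cost is that non-negative IDs no longer encode to the same strings as they would without the mapping.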
Bounded decode
decode_int_* accepts inputs of any length, so the decoded
Int can exceed any fixed integer width (on the Erlang target,
Int is arbitrary-precision). Realistic backing stores cap IDs at
signed 64 bits (SQLite INTEGER, Postgres bigserial, MySQL BIGINT),
so feeding an unbounded decode_int_* result into one of those
columns can crash the driver as soon as a user supplies a
slightly-too-long string. Similarly, JavaScript-target callers
should cap at 53 bits (Number.MAX_SAFE_INTEGER), beyond which
integers lose precision.
Use decode_int_*_bounded(input:, max:) whenever the decoded value
flows into a fixed-width sink. The bounded variants return
Error(Overflow) if the decoded Int exceeds max. Common caps
are exported as int64_max (signed 64-bit, 2^63 - 1) and
int53_max (JS-safe integer, 2^53 - 1).
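A bounded decode in front of a 64-bit column might look like the sketch below. decode_row_id is a hypothetical wrapper name; the function and constant it calls are the ones named in this section.

```gleam
import yabase/intid

/// Hypothetical wrapper: decode a Base62 ID that will be written to a
/// signed 64-bit column. Values above int64_max come back as an error
/// instead of reaching the database driver.
pub fn decode_row_id(input: String) -> Result(Int, intid.CodecError) {
  intid.decode_int_base62_bounded(input: input, max: intid.int64_max)
}
```

For a JavaScript consumer, the same shape applies with intid.int53_max as the cap.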
Types
Issue #74: every decode_int_* function in this module returns
Result(Int, CodecError). Without this re-export, callers who only
import yabase/intid cannot type-annotate a wrapper around a
decode call without reaching into yabase/core/error — a module
the README does not mention. The alias keeps the type identity
(it’s the same CodecError the underlying codec functions
already use) so error values flow through unchanged.
pub type CodecError =
error.CodecError
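With the alias in place, a wrapper can be annotated without importing yabase/core/error. The sketch below illustrates this; parse_ticket is a hypothetical name chosen for the example.

```gleam
import gleam/string
import yabase/intid

/// Hypothetical wrapper: trim surrounding whitespace, then decode.
/// The return type uses only the intid re-export.
pub fn parse_ticket(raw: String) -> Result(Int, intid.CodecError) {
  raw
  |> string.trim
  |> intid.decode_int_base32_crockford
}
```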
Values
pub fn decode_int(
encoding encoding: encoding.Encoding,
value value: String,
) -> Result(Int, error.CodecError)
Decode a string back to an Int using the supplied Encoding,
dispatching to the matching decode_int_* helper.
Empty input returns Error(InvalidLength(0)) so callers can
distinguish “no ID was supplied” from “the ID is zero” — the
same contract as the per-base helpers.
Returns Error(UnsupportedForInt(name)) for encodings that have
no integer codec wired up.
pub fn decode_int_base10(
input: String,
) -> Result(Int, error.CodecError)
Decode a Base10 (decimal) string back to an Int.
pub fn decode_int_base10_bounded(
input input: String,
max max: Int,
) -> Result(Int, error.CodecError)
Decode a Base10 (decimal) string back to an Int, rejecting
values greater than max with Error(Overflow).
pub fn decode_int_base16(
input: String,
) -> Result(Int, error.CodecError)
Decode a Base16 (hexadecimal) string back to an Int. Accepts
both uppercase and lowercase input via base16.decode’s
case-insensitive alphabet.
Odd-length inputs are accepted and internally zero-padded on the
left to the next byte boundary before being passed to
base16.decode ("1" is treated as "01", "7E9" as
"07E9"). This makes the function tolerant of either output
shape from the encode_int_base16* family —
decode_int_base16(encode_int_base16(n)) == Ok(n) and
decode_int_base16(encode_int_base16_compact(n)) == Ok(n) both
hold for every non-negative Int. The byte-oriented
base16.decode/1 keeps its strict even-length contract for
callers reaching for the low-level codec directly. (#99)
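The left-padding step described above can be pictured as follows. This is an illustrative sketch, not the library's source, and pad_to_even is a hypothetical name.

```gleam
import gleam/string

/// Hypothetical sketch: prepend one "0" when the hex string has an odd
/// number of characters, so it lines up with a whole number of bytes.
pub fn pad_to_even(hex: String) -> String {
  case string.length(hex) % 2 {
    0 -> hex
    _ -> "0" <> hex
  }
}
```

Under this sketch "1" becomes "01" and "7E9" becomes "07E9", while even-length input such as "FF" passes through unchanged.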
pub fn decode_int_base16_bounded(
input input: String,
max max: Int,
) -> Result(Int, error.CodecError)
Decode a Base16 (hexadecimal) string back to an Int, rejecting
values greater than max with Error(Overflow).
pub fn decode_int_base32_crockford(
input: String,
) -> Result(Int, error.CodecError)
Decode a Crockford Base32 string back to an Int.
pub fn decode_int_base32_crockford_bounded(
input input: String,
max max: Int,
) -> Result(Int, error.CodecError)
Decode a Crockford Base32 string back to an Int, rejecting
values greater than max with Error(Overflow).
pub fn decode_int_base32_crockford_check(
input: String,
) -> Result(Int, error.CodecError)
Decode a checksummed Crockford Base32 string back to an Int,
verifying the trailing check symbol.
pub fn decode_int_base32_crockford_check_bounded(
input input: String,
max max: Int,
) -> Result(Int, error.CodecError)
Decode a checksummed Crockford Base32 string back to an Int,
rejecting values greater than max with Error(Overflow).
pub fn decode_int_base32_rfc4648(
input: String,
) -> Result(Int, error.CodecError)
Decode a Base32 (RFC 4648) string back to an Int.
pub fn decode_int_base32_rfc4648_bounded(
input input: String,
max max: Int,
) -> Result(Int, error.CodecError)
Decode a Base32 (RFC 4648) string back to an Int, rejecting
values greater than max with Error(Overflow).
pub fn decode_int_base36(
input: String,
) -> Result(Int, error.CodecError)
Decode a Base36 string back to an Int.
pub fn decode_int_base36_bounded(
input input: String,
max max: Int,
) -> Result(Int, error.CodecError)
Decode a Base36 string back to an Int, rejecting values
greater than max with Error(Overflow).
pub fn decode_int_base58(
input: String,
) -> Result(Int, error.CodecError)
Decode a Base58 (Bitcoin alphabet) string back to an Int,
rejecting non-canonical wire forms with Error(NonCanonical).
The Bitcoin Base58 alphabet uses "1" as the zero character, so
the byte-oriented base58_bitcoin.decode prepends one 0x00 byte
for every leading "1" in the input. When that byte string is
read back as a big-endian integer the leading zero bytes
disappear, which means "5Q", "15Q", "115Q", … all decode to
the same Int. That collapses the bijection that ID callers
(URL shorteners, idempotency keys, database lookups) rely on:
two different wire strings can name the same row, breaking
deduplication and cache invariants.
The fix is to require the input to be byte-equal to the
canonical encoding (encode_int_base58(decoded) == input). The
only legal leading "1" is the single-character input "1",
which is the canonical encoding of 0. Any other input that
starts with "1" is Error(NonCanonical). Closes #101.
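The leading-"1" rule on its own can be written as a one-line predicate. The sketch below uses a hypothetical helper name and covers only this rule. Alphabet validation and the empty-string check are separate concerns that it does not attempt.

```gleam
import gleam/string

/// Hypothetical predicate: True when the input is exactly "1" or does
/// not start with "1" at all.
pub fn has_legal_leading_one(input: String) -> Bool {
  input == "1" || !string.starts_with(input, "1")
}
```

By this predicate "1" and "5Q" pass, while "15Q" and "115Q" fail, matching the cases listed above.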
pub fn decode_int_base58_bounded(
input input: String,
max max: Int,
) -> Result(Int, error.CodecError)
Decode a Base58 (Bitcoin alphabet) string back to an Int,
rejecting values greater than max with Error(Overflow).
pub fn decode_int_base58_flickr(
input: String,
) -> Result(Int, error.CodecError)
Decode a Base58 (Flickr alphabet) string back to an Int.
pub fn decode_int_base58_flickr_bounded(
input input: String,
max max: Int,
) -> Result(Int, error.CodecError)
Decode a Base58 (Flickr alphabet) string back to an Int,
rejecting values greater than max with Error(Overflow).
pub fn decode_int_base58check(
input: String,
) -> Result(Int, error.CodecError)
Decode a Base58Check string back to an Int, verifying the
4-byte double-SHA-256 checksum.
Issue #73: returns the payload as an Int, ignoring the version
byte (which encode_int_base58check always sets to 0).
Callers that need to inspect the version byte should reach for
yabase/base58check.decode/1 directly.
pub fn decode_int_base58check_bounded(
input input: String,
max max: Int,
) -> Result(Int, error.CodecError)
Decode a Base58Check string back to an Int, rejecting payload
values greater than max with Error(Overflow). The checksum is
verified before the bounds check, so a corrupted input fails as
InvalidChecksum rather than Overflow.
pub fn decode_int_base62(
input: String,
) -> Result(Int, error.CodecError)
Decode a Base62 string back to an Int.
pub fn decode_int_base62_bounded(
input input: String,
max max: Int,
) -> Result(Int, error.CodecError)
Decode a Base62 string back to an Int, rejecting values
greater than max with Error(Overflow).
pub fn decode_int_bounded(
encoding encoding: encoding.Encoding,
value value: String,
max max: Int,
) -> Result(Int, error.CodecError)
Decode a string back to an Int using the supplied Encoding,
rejecting values greater than max with Error(Overflow). The
runtime sibling of the per-base decode_int_*_bounded helpers.
pub fn encode_int(
encoding encoding: encoding.Encoding,
value value: Int,
) -> Result(String, error.CodecError)
Encode an Int to a string using the supplied Encoding,
dispatching to the matching encode_int_* helper. Negative
inputs surface as Error(NegativeValue(value)) exactly as the
per-base helpers do; see the module note on “Negative inputs
are rejected” for the rationale and the boundary-handling
pattern.
Returns Error(UnsupportedForInt(name)) for encodings that have
no integer codec wired up (every byte-only codec: Base2,
Base8, Base32(Hex|Clockwork|ZBase32), Base45, every
Base64 / Base85 variant, Base91, Bech32). For
Base58Check, the existing encode_int_base58check/1 helper
uses the fixed version byte 0x00; reach for the generic
facade with encoding.base58_check(version) to pin a different
version.
pub fn encode_int_base10(
value: Int,
) -> Result(String, error.CodecError)
Encode a non-negative Int as a Base10 (decimal) string.
Returns Error(NegativeValue(value)) for negative inputs; see
the module note on “Negative inputs are rejected”.
Behaviour matches int.to_string for non-negative integers, and
matches the rest of the intid family so that Base10 can be used
in the switch-case bench harnesses described in #78. Routing
through base10.encode keeps the contract uniform with the other
encode_int_* functions: a non-negative Int in, a string
in the alphabet out, no padding.
pub fn encode_int_base16(
value: Int,
) -> Result(String, error.CodecError)
Encode a non-negative Int as a Base16 (uppercase hexadecimal)
string. Returns Error(NegativeValue(value)) for negative
inputs; see the module note on “Negative inputs are rejected”.
Routing through base16.encode keeps the contract uniform with
the rest of the encode_int_* family. The output uses the
canonical RFC 4648 §8 uppercase alphabet (0-9 A-F). Callers
who need lowercase for interop with sha256sum-style tools can
post-process with string.lowercase, or convert with
int_to_bytes_be themselves and call base16.encode_lowercase.
Byte-aligned vs compact
This function is byte-aligned: the output length is always an
even number of hex characters because the encoding pads the
integer’s big-endian representation to a whole byte boundary
(encode_int_base16(1) == Ok("01"),
encode_int_base16(2025) == Ok("07E9")). This is the right
shape for ID interop with byte-oriented systems (databases, HTTP
headers, content-addressable storage). The other
encode_int_* functions in this module are compact — they
drop leading zero characters
(encode_int_base58(1) == Ok("2"),
encode_int_base36(1) == Ok("1"),
encode_int_base10(1) == Ok("1")). Issue #99 surfaced the
asymmetry; if you want the compact form for base16, use
encode_int_base16_compact/1 instead. decode_int_base16/1
accepts either form (it does not require an even-length input).
pub fn encode_int_base16_compact(
value: Int,
) -> Result(String, error.CodecError)
Encode a non-negative Int as a Base16 (uppercase hexadecimal)
string with leading zero characters stripped — the compact
counterpart to encode_int_base16/1.
Byte-aligned vs compact
encode_int_base16/1 is byte-aligned (always emits an even
number of hex characters: encode_int_base16(1) == Ok("01"),
encode_int_base16(2025) == Ok("07E9")). This function is
compact — leading "0" characters are dropped so the
output matches the shape of the rest of the encode_int_*
family (encode_int_base58(1) == Ok("2"),
encode_int_base36(1) == Ok("1"),
encode_int_base10(1) == Ok("1")). Examples:
encode_int_base16_compact(0) // Ok("0")
encode_int_base16_compact(1) // Ok("1")
encode_int_base16_compact(255) // Ok("FF")
encode_int_base16_compact(2025) // Ok("7E9")
encode_int_base16_compact(0xdeadbeef) // Ok("DEADBEEF")
Use this when you want column-aligned mixed-base output or
round-trip-by-text comparisons across the encode_int_* family.
Keep encode_int_base16/1 for byte-oriented sinks where the
even-length contract matters.
decode_int_base16/1 accepts the compact form unchanged
(decode_int_base16(encode_int_base16_compact(n) |> result.unwrap_or(""))
round-trips for every non-negative Int), because it zero-pads
odd-length inputs on the left to the next byte boundary before
handing them to base16.decode, which itself stays strict about
even-length input.
Returns Error(NegativeValue(value)) for negative inputs; see
the module note on “Negative inputs are rejected”.
Added in #99.
pub fn encode_int_base32_crockford(
value: Int,
) -> Result(String, error.CodecError)
Encode a non-negative Int as a Crockford Base32 string.
Returns Error(NegativeValue(value)) for negative inputs; see
the module note on “Negative inputs are rejected”.
pub fn encode_int_base32_crockford_check(
value: Int,
) -> Result(String, error.CodecError)
Encode a non-negative Int as a Crockford Base32 string with a
trailing checksum symbol (Douglas Crockford’s optional check
character).
Issue #73: same shape as encode_int_base32_crockford but with
the typo-resistance guard the underlying codec already supports.
Use the matching decode_int_base32_crockford_check to recover
the integer; the decoder verifies the symbol and returns
Error(InvalidChecksum) if the input was mistyped.
Returns Error(NegativeValue(value)) for negative inputs; see
the module note on “Negative inputs are rejected”.
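Pairing the checksummed encoder with its decoder can be sketched as a round trip. round_trip is a hypothetical helper name, and the sketch assumes the two functions are inverses for non-negative input, as the description above implies.

```gleam
import gleam/result
import yabase/intid

/// Hypothetical helper: encode an ID with a check symbol, then decode
/// it again. Any error from either step is passed through unchanged.
pub fn round_trip(id: Int) -> Result(Int, intid.CodecError) {
  intid.encode_int_base32_crockford_check(id)
  |> result.try(intid.decode_int_base32_crockford_check)
}
```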
pub fn encode_int_base32_rfc4648(
value: Int,
) -> Result(String, error.CodecError)
Encode a non-negative Int as a Base32 (RFC 4648) string.
Returns Error(NegativeValue(value)) for negative inputs; see
the module note on “Negative inputs are rejected” for the
rationale and the recommended boundary-check pattern.
pub fn encode_int_base36(
value: Int,
) -> Result(String, error.CodecError)
Encode a non-negative Int as a Base36 string. Returns
Error(NegativeValue(value)) for negative inputs; see the
module note on “Negative inputs are rejected”.
pub fn encode_int_base58(
value: Int,
) -> Result(String, error.CodecError)
Encode a non-negative Int as a Base58 (Bitcoin alphabet)
string. Returns Error(NegativeValue(value)) for negative
inputs; see the module note on “Negative inputs are rejected”.
pub fn encode_int_base58_flickr(
value: Int,
) -> Result(String, error.CodecError)
Encode a non-negative Int as a Base58 (Flickr alphabet)
string. Returns Error(NegativeValue(value)) for negative
inputs; see the module note on “Negative inputs are rejected”.
pub fn encode_int_base58check(
value: Int,
) -> Result(String, error.CodecError)
Encode a non-negative Int as a Base58Check string (Bitcoin’s
double-SHA-256 checksum format).
Issue #73: this is the int-typed counterpart of
yabase/base58check.encode/2. Version is fixed at 0
(Bitcoin mainnet P2PKH) — callers that need a different version
should reach for yabase/base58check.encode/2 directly with their
own BitArray payload.
Returns Error(NegativeValue(value)) for negative inputs; see
the module note on “Negative inputs are rejected”. The underlying
yabase/base58check.encode only errors on out-of-range version
bytes, and this helper hard-codes a valid one, so the surfaced
error is always NegativeValue in practice.
pub fn encode_int_base62(
value: Int,
) -> Result(String, error.CodecError)
Encode a non-negative Int as a Base62 string. Returns
Error(NegativeValue(value)) for negative inputs; see the
module note on “Negative inputs are rejected”.
pub const int53_max: Int
Largest integer that a JavaScript number can hold without risking
precision loss (2^53 - 1, Number.MAX_SAFE_INTEGER). Use as the
max argument to decode_int_*_bounded when the decoded value
is passed across a JS-target boundary or serialized as JSON for
a JavaScript consumer.