Delimit.Schema (delimit v0.4.0)
Defines schema structures and functions for working with delimited data.
This module handles schema definitions, data type conversions, and transformations between delimited data and Elixir structs.
Summary
Functions
Adds an embedded schema to the parent schema.
Adds a field to the schema.
Default delimiter used by canonical_string/3 and row_hash/3.
Returns a stable string encoding of a struct based on its schema.
Returns a flat list of {display_name, Field.t()} tuples for all leaf fields,
including fields from flattened embeds.
Gets field names in order of definition.
Returns a list of {Field.t(), start_offset, width} tuples using cumulative offsets.
Gets the header prefix for an embedded field.
Gets all embedded fields defined in the schema.
Gets a field by name.
Gets the headers for the schema.
Creates a new schema definition.
Returns a binary cryptographic hash of a struct's canonical encoding.
Converts a struct or map to a row of values based on the schema.
Converts a row of data to a struct based on the schema.
Builds a struct (including embeds) from a flat list of raw string values.
Converts a row of data to a struct based on the schema, using headers for field mapping.
Converts a field type to an Elixir typespec.
Validates that all fields (including flattened embed fields) have a positive integer width: option.
Types
@type schema_options() :: [
  delimiter: String.t(),
  skip_lines: non_neg_integer(),
  skip_while: (String.t() -> boolean()),
  trim_fields: boolean(),
  nil_on_empty: boolean(),
  line_ending: String.t(),
  format: atom()
]
Options for schema handling.
- :delimiter - Field delimiter character (default: comma)
- :skip_lines - Number of lines to skip at beginning of file
- :skip_while - Function to determine which lines to skip
- :trim_fields - Whether to trim whitespace from fields (default: true)
- :nil_on_empty - Convert empty strings to nil (default: true)
- :line_ending - Line ending character(s) for output
- :format - Predefined format (:csv, :tsv, :psv) that sets appropriate options
@type t() :: %Delimit.Schema{
  embeds: %{required(atom()) => module()},
  fields: [Delimit.Field.t()],
  module: module(),
  options: schema_options()
}
Schema definition structure.
- :module - The module associated with the schema
- :fields - List of field definitions
- :options - Additional options for the schema
- :embeds - Map of module references for embedded schemas
Functions
Adds an embedded schema to the parent schema.
Parameters
- schema - The parent schema to add the embedded schema to
- name - The name for the embedded schema as an atom
- module - The module defining the embedded schema
- opts - Options for the embedded schema
Returns
- Updated schema structure
Adds a field to the schema.
Parameters
- schema - The schema to add the field to
- name - The name of the field as an atom
- type - The type of the field (:string, :integer, etc.)
- opts - Options for the field
Returns
- Updated schema structure
Default delimiter used by canonical_string/3 and row_hash/3.
ASCII Unit Separator (0x1F) — chosen because it is highly unlikely to appear in real-world delimited file content, so the canonical encoding remains unambiguous regardless of the file's actual delimiter.
Returns a stable string encoding of a struct based on its schema.
The encoding is deterministic for a given schema and struct content:
- Fields appear in schema definition order.
- Each field's value is encoded as it would be written to a file (using configured format:/formats:/write_fn, etc.). nil values encode as the empty string.
- Embedded schemas contribute their own canonical encoding recursively (in their declared schema order, no prefix).
- Derived field types (:row_hash, :raw_row) are skipped: their values come from the parsed source row, not from canonical state.
Options
- :delimiter - the separator between encoded field values. Defaults to Delimit.Schema.canonical_delimiter/0 (ASCII Unit Separator). Use delimiter: "|" if you want a readable form (at the cost of ambiguity if any field value contains the chosen delimiter).
Example
iex> %MyApp.Person{first_name: "Alice", age: 30}
...> |> MyApp.Person.canonical_string()
"Alice<US>30"
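The encoding rules above can be sketched without the library. This is a minimal, self-contained illustration and not Delimit's implementation: the CanonicalSketch module and its encode/1 helper are hypothetical, the schema is reduced to an ordered list of field names, and per-field encoding is simplified to to_string/1.

```elixir
defmodule CanonicalSketch do
  # ASCII Unit Separator (0x1F), the documented default delimiter.
  @unit_separator <<0x1F>>

  # Hypothetical helper: joins the values of `field_names` (in the given
  # order) taken from `map`, encoding nil as the empty string.
  def canonical_string(map, field_names, delimiter \\ @unit_separator) do
    field_names
    |> Enum.map(fn name -> encode(Map.get(map, name)) end)
    |> Enum.join(delimiter)
  end

  # Simplified stand-in for the library's per-field write encoding.
  defp encode(nil), do: ""
  defp encode(value), do: to_string(value)
end

person = %{first_name: "Alice", nickname: nil, age: 30}
fields = [:first_name, :nickname, :age]

# Readable form with an explicit delimiter.
IO.inspect(CanonicalSketch.canonical_string(person, fields, "|"))
# => "Alice||30"

# Default form, joined with the unit separator byte.
IO.inspect(CanonicalSketch.canonical_string(person, fields))
```

Because 0x1F is a non-printable control character, the default form may be shown by iex as a raw binary rather than a quoted string, which is presumably why the example above writes the separator as <US>.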
@spec collect_all_fields(t()) :: [{atom(), Delimit.Field.t()}]
Returns a flat list of {display_name, Field.t()} tuples for all leaf fields,
including fields from flattened embeds.
For regular fields, display_name is the field name atom.
For embed fields, display_name includes the embed prefix (e.g., :billing_address_street).
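As a rough illustration of how prefixed display names could be derived, the sketch below flattens a nested field list. The FlattenSketch module and its tuple-based field representation are hypothetical and much simpler than Delimit.Field.t(); it returns only the display names, whereas collect_all_fields/1 returns {display_name, Field.t()} tuples.

```elixir
defmodule FlattenSketch do
  # Hypothetical, simplified field representation:
  #   {:field, name}                  - a leaf field
  #   {:embed, prefix, child_fields}  - an embed whose leaves get `prefix`
  def collect(fields, prefix \\ "") do
    Enum.flat_map(fields, fn
      {:field, name} ->
        [String.to_atom(prefix <> Atom.to_string(name))]

      {:embed, embed_prefix, children} ->
        collect(children, prefix <> embed_prefix)
    end)
  end
end

fields = [
  {:field, :name},
  {:embed, "billing_address_", [{:field, :street}, {:field, :city}]}
]

IO.inspect(FlattenSketch.collect(fields))
# => [:name, :billing_address_street, :billing_address_city]
```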
Gets field names in order of definition.
Parameters
- schema - The schema definition
Returns
- List of field names as atoms
@spec field_widths(t()) :: [{Delimit.Field.t(), non_neg_integer(), pos_integer()}]
Returns a list of {Field.t(), start_offset, width} tuples using cumulative offsets.
Used by the fixed-width reader to slice lines into field values.
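The cumulative-offset idea can be shown in a few lines. This is a sketch under simplifying assumptions: WidthSketch is a hypothetical module, fields are plain {name, width} pairs rather than Delimit.Field.t() structs, and slicing is byte-based via binary_part/3, which may differ from how the library's reader handles multi-byte characters.

```elixir
defmodule WidthSketch do
  # Turns [{name, width}] into [{name, start_offset, width}] by keeping a
  # running total of the widths seen so far.
  def offsets(fields) do
    {result, _total} =
      Enum.map_reduce(fields, 0, fn {name, width}, offset ->
        {{name, offset, width}, offset + width}
      end)

    result
  end

  # Slices a fixed-width line into raw string values using those offsets.
  def slice(line, fields) do
    for {name, start, width} <- offsets(fields) do
      {name, binary_part(line, start, width)}
    end
  end
end

fields = [name: 10, age: 3]

IO.inspect(WidthSketch.offsets(fields))
# => [{:name, 0, 10}, {:age, 10, 3}]

# Build a 13-byte line: name padded to 10 bytes, age right-aligned in 3.
line = String.pad_trailing("Alice", 10) <> String.pad_leading("30", 3)

IO.inspect(WidthSketch.slice(line, fields))
# => [name: "Alice     ", age: " 30"]
```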
@spec get_embed_prefix(Delimit.Field.t(), String.t() | nil) :: String.t()
Gets the header prefix for an embedded field.
Parameters
- field - The embedded field definition
- default_prefix - Default prefix to use if none specified
Returns
- String prefix to use for field headers
@spec get_embeds(t()) :: [Delimit.Field.t()]
Gets all embedded fields defined in the schema.
Parameters
- schema - The schema definition
Returns
- List of embedded field definitions
@spec get_field(t(), atom()) :: Delimit.Field.t() | nil
Gets a field by name.
Parameters
- schema - The schema definition
- name - The field name to find
Returns
- The field definition or nil if not found
Gets the headers for the schema.
Parameters
- schema - The schema definition
- prefix - Optional prefix to apply to all headers
Returns
- List of header strings
Example
iex> schema = Delimit.Schema.new(MyApp.Person)
iex> schema = Delimit.Schema.add_field(schema, :name, :string)
iex> schema = Delimit.Schema.add_field(schema, :age, :integer)
iex> Delimit.Schema.headers(schema)
["name", "age"]
iex> Delimit.Schema.headers(schema, "person_")
["person_name", "person_age"]
@spec new(module(), schema_options()) :: t()
Creates a new schema definition.
Parameters
- module - The module associated with the schema
- options - Options for the schema
Returns
- A new schema structure
Returns a binary cryptographic hash of a struct's canonical encoding.
Options
- :algorithm - hash algorithm passed to :crypto.hash/2. Default :sha256.
- :truncate - number of bytes to truncate the hash to. Default 16. nil means no truncation.
See canonical_string/3 for the encoding rules.
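The hashing step itself is small. The following sketch assumes the canonical string has already been built and only mirrors the two documented options; HashSketch is a hypothetical module, not the library's code.

```elixir
defmodule HashSketch do
  # Hashes `canonical` and keeps the first `truncate` bytes.
  # Passing truncate: nil keeps the full digest.
  def row_hash(canonical, opts \\ []) do
    algorithm = Keyword.get(opts, :algorithm, :sha256)
    truncate = Keyword.get(opts, :truncate, 16)
    digest = :crypto.hash(algorithm, canonical)

    case truncate do
      nil -> digest
      n -> binary_part(digest, 0, n)
    end
  end
end

# Two field values joined with the ASCII Unit Separator.
canonical = Enum.join(["Alice", "30"], <<0x1F>>)

IO.inspect(byte_size(HashSketch.row_hash(canonical)))
# => 16
IO.inspect(byte_size(HashSketch.row_hash(canonical, truncate: nil)))
# => 32
```

The result is a raw binary. For logging or storage as text, Base.encode16/2 gives a hex representation.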
Converts a struct or map to a row of values based on the schema.
Parameters
- schema - The schema definition
- struct_or_map - A struct or map containing field values
Returns
- A list of field values
Examples
iex> schema = Delimit.Schema.new(MyApp.Person)
iex> schema = Delimit.Schema.add_field(schema, :name, :string)
iex> Delimit.Schema.to_row(schema, %{name: "John Doe"})
["John Doe"]
Converts a row of data to a struct based on the schema.
Parameters
- schema - The schema definition
- row - A list of field values or a map of field name/values
Returns
- A struct based on the schema with field values
Example
iex> schema = Delimit.Schema.new(MyApp.Person)
iex> schema = Delimit.Schema.add_field(schema, :name, :string)
iex> schema = Delimit.Schema.add_field(schema, :age, :integer)
iex> Delimit.Schema.to_struct(schema, ["John Doe", "42"])
%MyApp.Person{name: "John Doe", age: 42}
Builds a struct (including embeds) from a flat list of raw string values.
This is needed for fixed-width format where fields are position-based rather than
header-based. Uses Field.parse_value/2 for each value.
Converts a row of data to a struct based on the schema, using headers for field mapping.
Parameters
- schema - The schema definition
- row - A list of field values
- headers - A list of header strings matching the row fields
- opts - Additional options for processing
Returns
- A struct based on the schema with field values
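Header-based mapping means column order in the file does not have to match field order in the schema. The sketch below illustrates the idea only: HeaderSketch is a hypothetical module, it returns a plain map of raw strings instead of a typed struct, and treating a missing column as nil is an assumption rather than documented behaviour.

```elixir
defmodule HeaderSketch do
  # Pairs each header with the value in the same column, then looks up
  # each wanted field by its header name. Missing columns become nil.
  def to_map(row, headers, field_names) do
    by_header = headers |> Enum.zip(row) |> Map.new()

    Map.new(field_names, fn name ->
      {name, Map.get(by_header, Atom.to_string(name))}
    end)
  end
end

# Columns arrive as age, name; the schema declares name, age, email.
result = HeaderSketch.to_map(["42", "John Doe"], ["age", "name"], [:name, :age, :email])

IO.inspect(result.name)
# => "John Doe"
IO.inspect(result.email)
# => nil
```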
Converts a field type to an Elixir typespec.
This function is used to convert field types to proper Elixir typespecs for use in @type definitions.
Parameters
- type - The field type or a tuple with more specific type information
Returns
- An Elixir typespec expression
Example
iex> Delimit.Schema.type_to_typespec(:string)
quote do: String.t()
iex> Delimit.Schema.type_to_typespec({:list, :string})
quote do: [String.t()]
@spec validate_fixed_width!(t()) :: :ok
Validates that all fields (including flattened embed fields) have a positive integer width: option.
Raises ArgumentError if any field is missing width: or has a non-positive width.
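A check of this shape might look like the following. It is a sketch under assumptions: ValidateSketch is a hypothetical module, fields are modelled as {name, opts} tuples, and the error message wording is invented for illustration.

```elixir
defmodule ValidateSketch do
  # `fields` is a hypothetical list of {name, opts} tuples.
  # Returns :ok, or raises ArgumentError on the first invalid field.
  def validate_fixed_width!(fields) do
    Enum.each(fields, fn {name, opts} ->
      case Keyword.get(opts, :width) do
        width when is_integer(width) and width > 0 ->
          :ok

        other ->
          raise ArgumentError,
                "field #{inspect(name)} needs a positive integer width:, got: #{inspect(other)}"
      end
    end)

    :ok
  end
end

IO.inspect(ValidateSketch.validate_fixed_width!([{:name, [width: 10]}, {:age, [width: 3]}]))
# => :ok

# A field without width: raises.
try do
  ValidateSketch.validate_fixed_width!([{:name, [width: 10]}, {:age, []}])
rescue
  e in ArgumentError -> IO.puts(e.message)
end
```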