RFC 5322 email message parser for Elixir, built with NimbleParsec.

Installation

Add mailex to your list of dependencies in mix.exs:

def deps do
  [
    {:mailex, "~> 0.1.0"}
  ]
end

Usage

Parsing an email

raw_email = """
From: sender@example.com
To: recipient@example.com
Subject: Hello World
Content-Type: text/plain

This is the message body.
"""

{:ok, message} = Mailex.parse(raw_email)

Parsing with exceptions

message = Mailex.parse!(raw_email)

API

Mailex.parse/1

@spec parse(binary()) :: {:ok, Mailex.Message.t()} | {:error, term()}

Parses a raw email message string into a %Mailex.Message{} struct. Returns {:ok, message} on success and {:error, reason} on failure.

Mailex.parse!/1

@spec parse!(binary()) :: Mailex.Message.t()

Parses a raw email message string and returns the %Mailex.Message{} struct directly, raising on failure.
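
When a parse failure should be handled rather than raised, the tagged tuples from Mailex.parse/1 can be matched in function heads. The following is a minimal sketch: ParseResult and summarize/1 are hypothetical names, not part of Mailex, and the shape of reason is not documented here, so it is only inspected.

```elixir
defmodule ParseResult do
  # Hypothetical helper, not part of Mailex: turns the documented
  # return values of Mailex.parse/1 into a short log-friendly string.
  def summarize({:ok, message}) do
    subject = Map.get(message.headers, "subject", "(no subject)")
    "parsed: #{inspect(subject)}"
  end

  def summarize({:error, reason}) do
    # The structure of `reason` is unspecified, so just inspect it.
    "parse failed: #{inspect(reason)}"
  end
end

# Usage, assuming `raw_email` holds the raw message text:
# raw_email |> Mailex.parse() |> ParseResult.summarize()
```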

Message Structure

The parser returns a %Mailex.Message{} struct with the following fields:

%Mailex.Message{
  headers: %{
    "from" => "sender@example.com",
    "to" => "recipient@example.com",
    "subject" => "Hello World",
    "content-type" => "text/plain"
  },
  content_type: %{
    type: "text",
    subtype: "plain",
    params: %{"charset" => "us-ascii"}
  },
  encoding: "7bit",
  body: "This is the message body.",
  parts: nil,
  filename: nil
}

Fields

Field         Type          Description
headers       map           All headers as lowercase keys. Multiple headers with the same name are stored as a list.
content_type  map           Parsed Content-Type with type, subtype, and params. Defaults to text/plain.
encoding      string        Content-Transfer-Encoding. Defaults to "7bit".
body          string | nil  Decoded message body for non-multipart messages.
parts         list | nil    List of %Mailex.Message{} structs for multipart messages.
filename      string | nil  Filename from Content-Disposition or Content-Type name parameter.

Features

  • Header parsing with folding (continuation lines)
  • Multiple headers with same name (e.g., Received) stored as lists
  • Content-Type parsing with parameters (boundary, charset, name)
  • Multipart message handling with recursive part parsing
  • Nested message/rfc822 support
  • multipart/digest with correct default content-type (message/rfc822)
  • base64 and quoted-printable decoding
  • RFC 2047 encoded-word decoding in filenames
  • Mbox format "From " line handling
  • CRLF and LF line ending normalization
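
To make the first item concrete: RFC 5322 defines unfolding as removing any line break that is immediately followed by a space or tab. The sketch below illustrates that rule only. It is not Mailex's implementation (which is built on NimbleParsec), and UnfoldSketch is a hypothetical name.

```elixir
defmodule UnfoldSketch do
  # Illustration only, not Mailex's implementation.
  # RFC 5322 unfolding: drop any line break that is immediately
  # followed by a space or tab, keeping that whitespace.
  def unfold(header_block) do
    Regex.replace(~r/\r?\n(?=[ \t])/, header_block, "")
  end
end

folded = "Subject: a long\r\n subject line\r\nTo: recipient@example.com\r\n"

UnfoldSketch.unfold(folded)
#=> "Subject: a long subject line\r\nTo: recipient@example.com\r\n"
```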

Character encodings

Mailex decodes text headers and bodies to UTF-8. UTF-8 and US-ASCII work with no configuration. Legacy charsets (iso-8859-* and windows-125x) are transcoded with codepagex, which only compiles the ISO-8859 family by default. To handle the Windows codepages, list the encodings you need in your config and recompile codepagex:

# config/config.exs
config :codepagex, :encodings, [
  "ISO8859/8859-1",
  "ISO8859/8859-15",
  "VENDORS/MICSFT/WINDOWS/CP1252"
  # ...and any others you need
]

mix deps.compile codepagex --force

Setting :encodings replaces codepagex's default list rather than extending it, so re-list any ISO-8859 encodings you rely on. Text in a charset you have not configured is left as-is, without transcoding. See the codepagex docs for the full list of encoding names.
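
To find out which messages depend on this configuration, the declared charset of a part can be sorted into the cases described in this section. This is a rough sketch: CharsetCheck and setup_needed/1 are hypothetical names, not part of Mailex or codepagex, and the prefix matching assumes charset labels spelled like iso-8859-1 or windows-1252.

```elixir
defmodule CharsetCheck do
  # Hypothetical helper, not part of Mailex or codepagex.
  # Classifies a part's declared charset by the setup it needs.
  def setup_needed(%{content_type: %{params: params}}) do
    # Assumes a missing charset parameter means "us-ascii".
    charset = params |> Map.get("charset", "us-ascii") |> String.downcase()

    cond do
      charset in ["utf-8", "us-ascii"] -> :none
      String.starts_with?(charset, "iso-8859-") -> :codepagex_default
      String.starts_with?(charset, "windows-125") -> :codepagex_config
      true -> :unknown
    end
  end
end

CharsetCheck.setup_needed(%{content_type: %{params: %{"charset" => "windows-1252"}}})
#=> :codepagex_config
```

Note that :codepagex_default only holds while :encodings is left unset. Once you set it, every charset you need has to appear in your list.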

Examples

Multipart message

{:ok, message} = Mailex.parse(multipart_email)

message.content_type.type
#=> "multipart"

message.content_type.subtype
#=> "mixed"

message.content_type.params["boundary"]
#=> "----=_Part_0"

length(message.parts)
#=> 3

# Access first part
first_part = hd(message.parts)
first_part.content_type.type
#=> "text"
first_part.body
#=> "Hello, this is the message text."
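
Because each part is itself a message that may contain further parts, a small recursive walk reaches every leaf. The following is a sketch under the assumption that nested multiparts expose their children through the parts field, as described in the Fields section. PartWalker is a hypothetical module, not part of Mailex.

```elixir
defmodule PartWalker do
  # Hypothetical helper, not part of Mailex.
  # Returns every non-multipart part in document order, descending
  # into nested multiparts via the `parts` field.
  def leaves(%{parts: parts}) when is_list(parts) do
    Enum.flat_map(parts, &leaves/1)
  end

  def leaves(message), do: [message]

  # Bodies of all text/plain leaves; other parts are skipped.
  def plain_text_bodies(message) do
    for %{content_type: %{type: "text", subtype: "plain"}, body: body} <- leaves(message),
        is_binary(body),
        do: body
  end
end

# Usage, assuming `message` came from Mailex.parse/1:
# PartWalker.plain_text_bodies(message)
```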

Attachments

{:ok, message} = Mailex.parse(email_with_attachment)

attachment = Enum.find(message.parts, & &1.filename)
attachment.filename
#=> "document.pdf"

attachment.content_type
#=> %{type: "application", subtype: "pdf", params: %{}}

# Body is already decoded from base64
byte_size(attachment.body)
#=> 12345
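
Since the body is already decoded, an attachment can be written to disk as-is. The sketch below saves every top-level part that has a filename. AttachmentSaver is a hypothetical module, not part of Mailex, and it does not descend into nested multiparts or handle two attachments sharing a name.

```elixir
defmodule AttachmentSaver do
  # Hypothetical helper, not part of Mailex.
  # Writes each top-level part that has a filename into `dir` and
  # returns the list of written paths.
  def save_all(%{parts: parts}, dir) when is_list(parts) do
    File.mkdir_p!(dir)

    for %{filename: name, body: body} <- parts, is_binary(name), is_binary(body) do
      # Path.basename/1 strips directory components, so a filename
      # such as "../../etc/passwd" is written as "passwd" inside `dir`.
      path = Path.join(dir, Path.basename(name))
      File.write!(path, body)
      path
    end
  end

  # Non-multipart messages have no parts to save.
  def save_all(_message, _dir), do: []
end

# Usage, assuming `message` came from Mailex.parse/1:
# AttachmentSaver.save_all(message, "tmp/attachments")
```

Filenames come from the sender and should be treated as untrusted input, which is why the sketch keeps only the final path component.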

Multiple headers

{:ok, message} = Mailex.parse(email_with_multiple_received)

message.headers["received"]
#=> ["from server1.example.com", "from server2.example.com"]
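
A header that appears once is stored as a plain string and a repeated one as a list, so calling code has to handle both shapes. One way to smooth that over is List.wrap/1. The following is a small sketch in which HeaderLookup is a hypothetical module, not part of Mailex.

```elixir
defmodule HeaderLookup do
  # Hypothetical helper, not part of Mailex.
  # Always returns a list: [] when the header is absent, a
  # one-element list for a single value, the stored list otherwise.
  def values(%{headers: headers}, name) do
    headers
    |> Map.get(String.downcase(name))
    |> List.wrap()
  end
end

message = %{headers: %{"subject" => "Hello World", "received" => ["from a", "from b"]}}

HeaderLookup.values(message, "Subject")
#=> ["Hello World"]

HeaderLookup.values(message, "Received")
#=> ["from a", "from b"]

HeaderLookup.values(message, "Cc")
#=> []
```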

License

MIT