Ftfy.Codecs (ftfy v0.1.0)

Encoding and decoding between Elixir strings (UTF-8 binaries) and raw byte sequences, for the encodings ftfy needs.

This stands in for Python's str.encode / bytes.decode together with the custom codecs registered by ftfy.bad_codecs:

  • single-byte charmap encodings (latin-1, sloppy-windows-*, iso-8859-2, macroman, cp437, non-sloppy windows-1252)
  • the utf-8-variants codec (CESU-8 / Java modified UTF-8)
  • standard utf-8
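
To make the utf-8-variants entry concrete, here is a minimal sketch of what decoding CESU-8 / Java modified UTF-8 involves. It is not this module's implementation: the module name Utf8VariantsSketch and its decode/1 function are hypothetical, and the sketch handles only the two deviations from standard UTF-8 (the two-byte NUL and surrogate pairs written as two three-byte sequences).

```elixir
defmodule Utf8VariantsSketch do
  # Hypothetical illustration only; not the code behind Ftfy.Codecs.
  import Bitwise

  def decode(bytes) when is_binary(bytes), do: do_decode(bytes, [])

  defp do_decode(<<>>, acc) do
    {:ok, acc |> Enum.reverse() |> IO.iodata_to_binary()}
  end

  # Java modified UTF-8 writes U+0000 as the overlong pair 0xC0 0x80.
  defp do_decode(<<0xC0, 0x80, rest::binary>>, acc) do
    do_decode(rest, [<<0>> | acc])
  end

  # CESU-8 writes an astral character as a UTF-16 surrogate pair, each half
  # encoded as its own three-byte sequence: ED A0..AF xx, then ED B0..BF xx.
  defp do_decode(<<0xED, b1, b2, 0xED, b3, b4, rest::binary>>, acc)
       when b1 in 0xA0..0xAF and b2 in 0x80..0xBF and
              b3 in 0xB0..0xBF and b4 in 0x80..0xBF do
    high = 0xD000 ||| ((b1 &&& 0x3F) <<< 6) ||| (b2 &&& 0x3F)
    low = 0xD000 ||| ((b3 &&& 0x3F) <<< 6) ||| (b4 &&& 0x3F)
    code_point = 0x10000 + ((high - 0xD800) <<< 10) + (low - 0xDC00)
    do_decode(rest, [<<code_point::utf8>> | acc])
  end

  # Anything else must be a well-formed standard UTF-8 sequence.
  defp do_decode(<<code_point::utf8, rest::binary>>, acc) do
    do_decode(rest, [<<code_point::utf8>> | acc])
  end

  # Unpaired surrogates, truncated sequences, and stray bytes end up here.
  defp do_decode(_invalid, _acc), do: {:error, :invalid}
end
```

The clause order matters: the two variant forms are tried before the standard UTF-8 clause, and Elixir's utf8 binary pattern rejects encoded surrogates, so an unpaired surrogate falls through to the error clause.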

Functions return {:ok, result} or {:error, reason} tuples rather than raising, so callers can mirror Python's try/except UnicodeDecodeError with a case expression.
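
As a rough illustration of that pattern, the sketch below tries UTF-8 first and falls back to Latin-1 when decoding fails. To keep it runnable without the library, it uses a hypothetical stand-in module (FallbackSketch) whose decode/2 copies the documented return shape for two encodings only. The encoding name strings "utf-8" and "latin-1" are assumptions about what the real function accepts.

```elixir
defmodule FallbackSketch do
  # Hypothetical stand-in for Ftfy.Codecs.decode/2, limited to two encodings.
  def decode(bytes, "utf-8") when is_binary(bytes) do
    if String.valid?(bytes), do: {:ok, bytes}, else: {:error, :invalid}
  end

  def decode(bytes, "latin-1") when is_binary(bytes) do
    # Latin-1 maps every byte to the code point with the same number,
    # so this branch cannot fail.
    {:ok, :unicode.characters_to_binary(bytes, :latin1, :utf8)}
  end

  # Python equivalent:
  #   try:
  #       return data.decode("utf-8")
  #   except UnicodeDecodeError:
  #       return data.decode("latin-1")
  def decode_with_fallback(bytes) do
    case decode(bytes, "utf-8") do
      {:ok, string} ->
        string

      {:error, :invalid} ->
        {:ok, string} = decode(bytes, "latin-1")
        string
    end
  end
end
```

With the real module, the same case expression would presumably wrap Ftfy.Codecs.decode/2 directly.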

Functions

charmap_encodings()

The single-byte charmap encodings this module knows how to handle.

decode(bytes, encoding)

@spec decode(binary(), String.t()) :: {:ok, binary()} | {:error, :invalid}

Decode a raw byte binary into a string, using encoding.

Returns {:ok, string} or {:error, :invalid}.
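
The {:error, :invalid} case is easiest to see with a charmap encoding that leaves some bytes unassigned. The sketch below is a hypothetical, self-contained Windows-1252 decoder (Windows1252Sketch is not part of this library) with a strict mode and a "sloppy" mode. It assumes that sloppy decoding fills each unassigned byte with the code point of the same number, as ftfy's sloppy codecs are described as doing.

```elixir
defmodule Windows1252Sketch do
  # Hypothetical illustration only; not the code behind Ftfy.Codecs.
  # Bytes in 0x80..0x9F that Windows-1252 assigns, with their code points.
  # 0x81, 0x8D, 0x8F, 0x90 and 0x9D are unassigned.
  @high %{
    0x80 => 0x20AC, 0x82 => 0x201A, 0x83 => 0x0192, 0x84 => 0x201E,
    0x85 => 0x2026, 0x86 => 0x2020, 0x87 => 0x2021, 0x88 => 0x02C6,
    0x89 => 0x2030, 0x8A => 0x0160, 0x8B => 0x2039, 0x8C => 0x0152,
    0x8E => 0x017D, 0x91 => 0x2018, 0x92 => 0x2019, 0x93 => 0x201C,
    0x94 => 0x201D, 0x95 => 0x2022, 0x96 => 0x2013, 0x97 => 0x2014,
    0x98 => 0x02DC, 0x99 => 0x2122, 0x9A => 0x0161, 0x9B => 0x203A,
    0x9C => 0x0153, 0x9E => 0x017E, 0x9F => 0x0178
  }

  # sloppy? = false: unassigned bytes make the whole decode fail.
  # sloppy? = true: unassigned bytes decode as the same-numbered code point.
  def decode(bytes, sloppy? \\ false) when is_binary(bytes) do
    bytes
    |> :binary.bin_to_list()
    |> Enum.reduce_while([], fn byte, acc ->
      case code_point(byte, sloppy?) do
        {:ok, cp} -> {:cont, [cp | acc]}
        :error -> {:halt, :error}
      end
    end)
    |> case do
      :error -> {:error, :invalid}
      code_points -> {:ok, code_points |> Enum.reverse() |> List.to_string()}
    end
  end

  # Outside 0x80..0x9F, Windows-1252 agrees with Latin-1.
  defp code_point(byte, _sloppy?) when byte < 0x80 or byte > 0x9F, do: {:ok, byte}

  defp code_point(byte, sloppy?) do
    case Map.fetch(@high, byte) do
      {:ok, cp} -> {:ok, cp}
      :error when sloppy? -> {:ok, byte}
      :error -> :error
    end
  end
end
```

In this sketch, decoding <<0x81>> is an error in strict mode and yields U+0081 in sloppy mode, which is the distinction the "sloppy-windows-*" and "non-sloppy windows-1252" names refer to.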

encode(string, encoding)

@spec encode(binary(), String.t()) :: {:ok, binary()} | {:error, :unencodable}

Encode a string into raw bytes, using encoding.

Returns {:ok, bytes} or {:error, :unencodable}.
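
For a sense of when {:error, :unencodable} arises, here is a minimal Latin-1 encoder sketch with the same return shape. The module name Latin1EncodeSketch is hypothetical and the code is not this library's implementation. It assumes the input is a valid UTF-8 string.

```elixir
defmodule Latin1EncodeSketch do
  # Hypothetical illustration only; not the code behind Ftfy.Codecs.
  def encode(string) when is_binary(string) do
    string
    # Raises on invalid UTF-8; the sketch assumes a well-formed string.
    |> String.to_charlist()
    |> Enum.reduce_while([], fn
      # Latin-1 can represent exactly the code points 0x00..0xFF.
      code_point, acc when code_point <= 0xFF -> {:cont, [code_point | acc]}
      _code_point, _acc -> {:halt, :unencodable}
    end)
    |> case do
      :unencodable -> {:error, :unencodable}
      bytes -> {:ok, bytes |> Enum.reverse() |> :erlang.list_to_binary()}
    end
  end
end
```

Here "é" (U+00E9) encodes to the single byte 0xE9, while "€" (U+20AC) has no Latin-1 byte and produces the error tuple. A caller would handle both outcomes with a case expression, as with decode.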