Ftfy.TextFixerConfig (ftfy v0.1.0)

Configuration options for ftfy. Mirrors the TextFixerConfig class of the Python ftfy library.

Instantiate with the defaults and override what you need, e.g. %Ftfy.TextFixerConfig{uncurl_quotes: false}. The top-level functions also accept a keyword list, which is merged onto the appropriate default.
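The keyword-list form can be pictured with a small self-contained sketch. ConfigSketch below is a hypothetical stand-in that copies only three of the fields listed on this page, and its new/1 helper is an assumption made for illustration. It is not the library's own code and may differ from how Ftfy merges options internally.

```elixir
defmodule ConfigSketch do
  # Hypothetical stand-in for Ftfy.TextFixerConfig, reduced to three fields.
  defstruct uncurl_quotes: true, fix_encoding: true, normalization: "NFC"

  # A ready-made struct is passed through unchanged.
  def new(%__MODULE__{} = config), do: config

  # A keyword list is merged onto the defaults; struct!/2 raises on unknown keys.
  def new(opts) when is_list(opts), do: struct!(__MODULE__, opts)
end

config = ConfigSketch.new(uncurl_quotes: false)
IO.inspect(config.uncurl_quotes)  # prints false (overridden)
IO.inspect(config.normalization)  # prints "NFC" (default kept)
```

Using struct!/2 rather than a plain map merge means a misspelled option name fails immediately instead of being silently ignored.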

Options (defaults shown):

  • unescape_html: "auto" — replace HTML entities; "auto" disables this when a literal < appears, since the input is probably real HTML.
  • remove_terminal_escapes: true
  • fix_encoding: true — detect and fix mojibake by re-decoding. The next four options only matter when this is on:
    • restore_byte_a0: true
    • replace_lossy_sequences: true
    • decode_inconsistent_utf8: true
    • fix_c1_controls: true
  • fix_latin_ligatures: true
  • fix_character_width: true
  • uncurl_quotes: true
  • fix_line_breaks: true
  • fix_surrogates: true
  • remove_control_chars: true
  • normalization: "NFC" — one of "NFC", "NFD", "NFKC", "NFKD", or nil for no normalization.
  • max_decode_length: 1_000_000 — largest segment fixed at once.
  • explain: true — whether to compute explanations.
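Two of the options above describe rules that are easy to sketch in isolation: the "auto" setting of unescape_html and the normalization forms. OptionSketch and both of its functions are hypothetical names, and accepting true and false alongside "auto" is an assumption. The sketch only restates the documented behaviour using Elixir's standard String module and should not be read as the library's implementation.

```elixir
defmodule OptionSketch do
  # Hypothetical helper: should HTML entities be unescaped for this input?
  # "auto" turns unescaping off when a literal "<" appears in the text.
  def unescape?(true, _text), do: true
  def unescape?(false, _text), do: false
  def unescape?("auto", text), do: not String.contains?(text, "<")

  # Map the documented form names onto the atoms String.normalize/2 expects.
  @forms %{"NFC" => :nfc, "NFD" => :nfd, "NFKC" => :nfkc, "NFKD" => :nfkd}

  # nil means no normalization at all.
  def normalize(text, nil), do: text

  def normalize(text, form) when is_binary(form) do
    String.normalize(text, Map.fetch!(@forms, form))
  end
end

IO.inspect(OptionSketch.unescape?("auto", "AT&amp;T"))         # prints true
IO.inspect(OptionSketch.unescape?("auto", "<b>AT&amp;T</b>"))  # prints false
IO.inspect(OptionSketch.normalize("e\u0301", "NFC") == "\u00E9")  # prints true
```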

Summary

Types

t()

@type t() :: %Ftfy.TextFixerConfig{
  decode_inconsistent_utf8: term(),
  explain: term(),
  fix_c1_controls: term(),
  fix_character_width: term(),
  fix_encoding: term(),
  fix_latin_ligatures: term(),
  fix_line_breaks: term(),
  fix_surrogates: term(),
  max_decode_length: term(),
  normalization: term(),
  remove_control_chars: term(),
  remove_terminal_escapes: term(),
  replace_lossy_sequences: term(),
  restore_byte_a0: term(),
  uncurl_quotes: term(),
  unescape_html: term()
}