seqfuzz v0.1.0 Seqfuzz

Seqfuzz is an implementation of a sequential fuzzy string matching algorithm, similar to those used in code editors like Sublime Text. It is based on Forrest Smith's work on lib_fts and his blog post "Reverse Engineering Sublime Text's Fuzzy Match".

There is an alternate implementation by @WolfDan: Fuzzy Match v0.2.0 (Elixir).

Installation

The package can be installed by adding seqfuzz to your list of dependencies in mix.exs:

def deps do
  [
    {:seqfuzz, "~> 0.1.0"}
  ]
end

Examples

iex> Seqfuzz.match("Hello, world!", "hellw")
%{match?: true, matches: [0, 1, 2, 3, 7], score: 187}

iex> items = [{1, "Hello Goodbye"}, {2, "Hell on Wheels"}, {3, "Hello, world!"}]
iex> Seqfuzz.filter(items, "hellw", &(elem(&1, 1)))
[{3, "Hello, world!"}, {2, "Hell on Wheels"}]

Scoring

Scores can be configured in your Mix configuration. I have added extra separators to the defaults, along with two additional scoring features: a case match bonus and a string match bonus. The case match bonus provides a small bonus for matching case. The string match bonus provides a large bonus when the pattern and the string are identical apart from case, so that those results always rank highest.

The default scores and available settings are:

config :seqfuzz,
  sequential_bonus: 15,
  separator_bonus: 30,
  camel_bonus: 30,
  first_letter_bonus: 15,
  leading_letter_penalty: -3,
  max_leading_letter_penalty: -25,
  unmatched_letter_penalty: -1,
  case_match_bonus: 1,
  string_match_bonus: 20,
  separators: ["_", " ", ".", "/", ","],
  initial_score: 100
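
As an illustration, a project might override a few of these values in config/config.exs. This is only a sketch: the values are arbitrary, and it assumes that any key left out keeps the default listed above, which this page does not state explicitly.

```elixir
# config/config.exs
import Config

# Illustrative values only. Keys not listed here are ASSUMED to keep
# the defaults shown in the Scoring section.
config :seqfuzz,
  # Treat "-" as a word separator in addition to the default separators.
  separators: ["_", " ", ".", "/", ",", "-"],
  # Weight strings that equal the pattern (ignoring case) more heavily.
  string_match_bonus: 40
```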

Changelog

  • 0.1.0 - Initial version. Supports the basic algorithm but does not search recursively for better matches.

Roadmap

  • Add support for recursive search for better matches.
  • Add support for asynchronous stream search.

Summary

Functions

filter(enumerable, pattern)

Matches against a list of strings and returns the list of matches sorted by highest score first.

filter(enumerable, pattern, string_callback)

Matches against an enumerable using a callback to access the string to match and returns the list of matches sorted by highest score first.

match(string, pattern)

Determines whether pattern is a sequential fuzzy match with string and provides a matching score. matches is a list of indices within string where a match was found.

matches(enumerable, pattern, string_callback, opts \\ [])

Applies the match algorithm to the entire enumerable with options to sort and filter.

Types

Specs

match_metadata() :: %{match?: boolean(), matches: [integer()], score: integer()}

Functions

filter(enumerable, pattern)

Specs

filter(Enumerable.t(), String.t()) :: Enumerable.t()

Matches against a list of strings and returns the list of matches sorted by highest score first.

Examples

iex> strings = ["Hello Goodbye", "Hell on Wheels", "Hello, world!"]
iex> Seqfuzz.filter(strings, "hellw")
["Hello, world!", "Hell on Wheels"]

filter(enumerable, pattern, string_callback)

Specs

filter(Enumerable.t(), String.t(), (any() -> String.t())) :: Enumerable.t()

Matches against an enumerable using a callback to access the string to match and returns the list of matches sorted by highest score first.

Examples

iex> items = [{1, "Hello Goodbye"}, {2, "Hell on Wheels"}, {3, "Hello, world!"}]
iex> Seqfuzz.filter(items, "hellw", &(elem(&1, 1)))
[{3, "Hello, world!"}, {2, "Hell on Wheels"}]

match(string, pattern)

Specs

match(String.t(), String.t()) :: match_metadata()

Determines whether pattern is a sequential fuzzy match with string and provides a matching score. matches is a list of indices within string where a match was found.

Examples

iex> Seqfuzz.match("Hello, world!", "hellw")
%{match?: true, matches: [0, 1, 2, 3, 7], score: 187}
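
To make "sequential fuzzy match" concrete, here is a minimal sketch of the matching step alone. It is not Seqfuzz's implementation: the module and function names (SeqSketch, sequential_match/2) are hypothetical, it computes no score, and it assumes that lowercasing does not change the number of characters (true for ASCII input). It walks the string once and greedily consumes pattern characters in order, ignoring case, while recording the index of each hit.

```elixir
defmodule SeqSketch do
  # Hypothetical helper, not part of Seqfuzz: greedy, case-insensitive
  # subsequence check that records the index of each matched character.
  def sequential_match(string, pattern) do
    chars =
      string
      |> String.downcase()
      |> String.graphemes()
      |> Enum.with_index()

    wanted = pattern |> String.downcase() |> String.graphemes()

    {remaining, indices} =
      Enum.reduce(chars, {wanted, []}, fn
        # The current character equals the next wanted pattern character:
        # consume it and remember where it was found.
        {char, index}, {[char | rest], acc} -> {rest, [index | acc]}
        # Otherwise skip this character of the string.
        _other, state -> state
      end)

    # The pattern matches only if every pattern character was consumed.
    %{match?: remaining == [], matches: Enum.reverse(indices)}
  end
end

IO.inspect(SeqSketch.sequential_match("Hello, world!", "hellw"))
# prints %{match?: true, matches: [0, 1, 2, 3, 7]}
```

For "Hello Goodbye" the same sketch returns match?: false with matches [0, 1, 2, 3], because the trailing "w" is never found. Those are the same flags and indices the examples on this page show for that string.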

matches(enumerable, pattern, string_callback, opts \\ [])

Specs

matches(Enumerable.t(), String.t(), (any() -> String.t()), keyword()) ::
  Enumerable.t() | [{any(), match_metadata()}]

Applies the match algorithm to the entire enumerable with options to sort and filter.

Options

  • :sort - Sort the enumerable by score, defaults to false.
  • :filter - Filter out elements that don't match, defaults to false.
  • :metadata - Include the match metadata map in the result, defaults to true. When true, each result is a tuple {element, %{...}}. When false, the result is a list of the elements themselves.

Examples

iex> strings = ["Hello Goodbye", "Hell on Wheels", "Hello, world!"]
iex> Seqfuzz.matches(strings, "hellw", & &1)
[
  {"Hello Goodbye", %{match?: false, matches: [0, 1, 2, 3], score: 155}},
  {"Hell on Wheels", %{match?: true, matches: [0, 1, 2, 3, 8], score: 185}},
  {"Hello, world!", %{match?: true, matches: [0, 1, 2, 3, 7], score: 187}}
]

iex> strings = ["Hello Goodbye", "Hell on Wheels", "Hello, world!"]
iex> Seqfuzz.matches(
...>   strings,
...>   "hellw",
...>   & &1,
...>   metadata: false,
...>   filter: true,
...>   sort: true
...> )
["Hello, world!", "Hell on Wheels"]
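
The three options can be read as post-processing steps over the metadata result. The sketch below applies them by hand with Enum to the tuples copied from the first example, without calling Seqfuzz at all. It illustrates what the options are described as doing and makes no claim about how the library applies them internally.

```elixir
# Result copied from the first example (metadata: true, no sort, no filter).
results = [
  {"Hello Goodbye", %{match?: false, matches: [0, 1, 2, 3], score: 155}},
  {"Hell on Wheels", %{match?: true, matches: [0, 1, 2, 3, 8], score: 185}},
  {"Hello, world!", %{match?: true, matches: [0, 1, 2, 3, 7], score: 187}}
]

ranked =
  results
  # Keep only elements flagged as matches (the described effect of filter: true).
  |> Enum.filter(fn {_element, %{match?: matched}} -> matched end)
  # Highest score first (the described effect of sort: true).
  |> Enum.sort_by(fn {_element, %{score: score}} -> score end, :desc)
  # Drop the metadata map (the described effect of metadata: false).
  |> Enum.map(fn {element, _meta} -> element end)

IO.inspect(ranked)
# prints ["Hello, world!", "Hell on Wheels"]
```

Keeping the metadata and post-processing it yourself can be useful when the matched indices are needed later, for example to highlight matched characters.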