Seqfuzz (seqfuzz v0.2.0)
Seqfuzz is an implementation of a sequential fuzzy string matching algorithm, similar to those used in code editors like Sublime Text. It is based on Forrest Smith's work on lib_fts and his blog post Reverse Engineering Sublime Text's Fuzzy Match.
An alternative implementation by @WolfDan is also available: Fuzzy Match v0.2.0 Elixir.
Documentation
- GitHub: https://github.com/negcx/seqfuzz
- Hexdocs: https://hexdocs.pm/seqfuzz
Installation
The package can be installed by adding seqfuzz to your list of dependencies in mix.exs:
    def deps do
      [
        {:seqfuzz, "~> 0.2.0"}
      ]
    end
Examples
iex> Seqfuzz.match("Hello, world!", "hellw")
%{match?: true, matches: [0, 1, 2, 3, 7], score: 202}
iex> items = [{1, "Hello Goodbye"}, {2, "Hell on Wheels"}, {3, "Hello, world!"}]
iex> Seqfuzz.filter(items, "hellw", &(elem(&1, 1)))
[{3, "Hello, world!"}, {2, "Hell on Wheels"}]
Scoring
Scores can be passed as options to override the defaults. I have added extra default separators as well as two additional scoring features: a case match bonus and a string match bonus. The case match bonus provides a small bonus for matching case. The string match bonus provides a large bonus when the pattern and the string match exactly (ignoring case), so that those results always rank highest.
Changelog
0.2.0
- Change scoring to be via options instead of configuration.
0.1.0
- Initial version. Supports basic algorithm but does not search recursively for better matches.
Roadmap
- Add support for recursive search for better matches.
- Add support for asynchronous stream search.
Summary
Functions
Matches against a list of strings and returns the list of matches sorted by highest score first.
Matches against an enumerable using a callback to access the string to match and returns the list of matches sorted by highest score first.
Determines whether pattern is a sequential fuzzy match with string and provides a matching score. matches is a list of indices within string where a match was found.
Applies the match algorithm to the entire enumerable with options to sort and filter.
Types
Specs
Functions
Specs
filter(Enumerable.t(), String.t()) :: Enumerable.t()
Matches against a list of strings and returns the list of matches sorted by highest score first.
Examples
iex> strings = ["Hello Goodbye", "Hell on Wheels", "Hello, world!"]
iex> Seqfuzz.filter(strings, "hellw")
["Hello, world!", "Hell on Wheels"]
Specs
filter(Enumerable.t(), String.t(), (any() -> String.t())) :: Enumerable.t()
Matches against an enumerable using a callback to access the string to match and returns the list of matches sorted by highest score first.
Examples
iex> items = [{1, "Hello Goodbye"}, {2, "Hell on Wheels"}, {3, "Hello, world!"}]
iex> Seqfuzz.filter(items, "hellw", &(elem(&1, 1)))
[{3, "Hello, world!"}, {2, "Hell on Wheels"}]
Specs
match(String.t(), String.t(), keyword()) :: match_metadata()
Determines whether pattern is a sequential fuzzy match with string and provides a matching score. matches is a list of indices within string where a match was found.
Examples
iex> Seqfuzz.match("Hello, world!", "hellw")
%{match?: true, matches: [0, 1, 2, 3, 7], score: 202}
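To make the meaning of a "sequential" match and of the matches indices concrete, here is a minimal sketch in plain Elixir. It is not Seqfuzz's implementation and it does no scoring: the module SequentialMatchSketch and the function match_indices/2 are hypothetical names introduced only for illustration. It assumes a simple greedy, case-insensitive, left-to-right scan, which happens to reproduce the indices shown in the examples on this page.

```elixir
defmodule SequentialMatchSketch do
  # Hypothetical helper, NOT part of Seqfuzz. Returns {matched?, indices},
  # where indices are grapheme positions in `string` at which successive
  # pattern characters were found, scanning left to right.
  def match_indices(string, pattern) do
    # Downcase each grapheme separately so indices stay aligned with `string`.
    string_chars =
      string
      |> String.graphemes()
      |> Enum.map(&String.downcase/1)
      |> Enum.with_index()

    pattern_chars =
      pattern
      |> String.graphemes()
      |> Enum.map(&String.downcase/1)

    {remaining, indices} =
      Enum.reduce(string_chars, {pattern_chars, []}, fn
        # The next pattern character equals the current string character:
        # consume it and record where it was found.
        {char, index}, {[char | rest], acc} -> {rest, [index | acc]}
        # Otherwise keep scanning without consuming anything.
        _other, state -> state
      end)

    # The pattern matches only if every pattern character was consumed.
    {remaining == [], Enum.reverse(indices)}
  end
end
```

Under these assumptions, match_indices("Hello, world!", "hellw") yields {true, [0, 1, 2, 3, 7]}, and "Hello Goodbye" yields {false, [0, 1, 2, 3]} because no "w" follows the matched prefix. The library additionally assigns a score to each match, and the roadmap notes that it does not yet search recursively for better-scoring alternatives.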
Specs
matches(Enumerable.t(), String.t(), (any() -> String.t()), keyword()) :: Enumerable.t() | [{any(), match_metadata()}]
Applies the match algorithm to the entire enumerable with options to sort and filter.
Options
- :sort - Sort the enumerable by score, defaults to false.
- :filter - Filter out elements that don't match, defaults to false.
- :metadata - Include the match metadata map in the result, defaults to true. When true, each item in the result is a tuple {element, %{...}}. When false, the result is a list of elements.
- :sequential_bonus - Default: 15
- :separator_bonus - Default: 30
- :camel_bonus - Default: 30
- :first_letter_bonus - Default: 15
- :leading_letter_penalty - Default: -3
- :max_leading_letter_penalty - Default: -25
- :unmatched_letter_penalty - Default: -1
- :case_match_bonus - Default: 1
- :string_match_bonus - Default: 20
- :separators - Default: ["_", " ", ".", "/", ","]
- :initial_score - Default: 100
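The scoring options are ordinary keyword options, so overriding one leaves the others at their defaults. The sketch below illustrates that behaviour with Keyword.merge/2. It is an assumption about how such defaults are typically resolved in Elixir, not a copy of Seqfuzz's internals; the module ScoringOptionsSketch and the function resolve/1 are hypothetical, and only the default values are taken from the list above.

```elixir
defmodule ScoringOptionsSketch do
  # Default values copied from the documented option list.
  # The module itself is hypothetical and NOT part of Seqfuzz.
  @defaults [
    sequential_bonus: 15,
    separator_bonus: 30,
    camel_bonus: 30,
    first_letter_bonus: 15,
    leading_letter_penalty: -3,
    max_leading_letter_penalty: -25,
    unmatched_letter_penalty: -1,
    case_match_bonus: 1,
    string_match_bonus: 20,
    separators: ["_", " ", ".", "/", ","],
    initial_score: 100
  ]

  # Caller-supplied options win over the defaults; untouched keys keep
  # their default values.
  def resolve(opts \\ []) do
    Keyword.merge(@defaults, opts)
  end
end

# Override two values and keep everything else at its default.
scoring = ScoringOptionsSketch.resolve(case_match_bonus: 5, separators: ["_", "-"])
IO.inspect(scoring[:case_match_bonus])
IO.inspect(scoring[:sequential_bonus])
```

With the library itself, such overrides would go in the trailing keyword list alongside :sort, :filter and :metadata, for example Seqfuzz.matches(strings, "hellw", & &1, case_match_bonus: 5).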
Examples
iex> strings = ["Hello Goodbye", "Hell on Wheels", "Hello, world!"]
iex> Seqfuzz.matches(strings, "hellw", & &1)
[
  {"Hello Goodbye", %{match?: false, matches: [0, 1, 2, 3], score: 170}},
  {"Hell on Wheels", %{match?: true, matches: [0, 1, 2, 3, 8], score: 200}},
  {"Hello, world!", %{match?: true, matches: [0, 1, 2, 3, 7], score: 202}}
]
iex> strings = ["Hello Goodbye", "Hell on Wheels", "Hello, world!"]
iex> Seqfuzz.matches(
...>   strings,
...>   "hellw",
...>   & &1,
...>   metadata: false,
...>   filter: true,
...>   sort: true
...> )
["Hello, world!", "Hell on Wheels"]