Credence.Pattern.PreferGraphemesForCharacterUniqueness
(credence v0.8.0)
Readability & correctness rule: Detects the pattern
String.to_charlist(s) |> Enum.uniq() |> Enum.count() |> (&(&1 == String.length(s))).().
String.to_charlist/1 decomposes into codepoints (integers), while
String.length/1 counts graphemes (whole characters). For decomposed
Unicode (e.g. "é" = "e" + U+0301) these differ: the codepoint count is
higher than the grapheme count, so the comparison can give the wrong
answer (a lone decomposed "é" yields 2 unique codepoints against a
length of 1). Using String.graphemes/1 instead keeps both sides
grapheme-level.
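The mismatch can be reproduced with a short script. This is a minimal sketch rather than part of the rule itself; the variable names (`s`, `charlist_result`, `grapheme_result`) are illustrative assumptions, and `then/2` requires Elixir 1.12 or later.

```elixir
# "é" written in decomposed form: "e" followed by U+0301 (combining acute accent).
s = "e\u0301"

# Codepoint view: two integers. Grapheme view: one user-perceived character.
codepoints = String.to_charlist(s)   # [101, 769]
graphemes = String.graphemes(s)      # one element
length = String.length(s)            # 1

# Pattern flagged by the rule: 2 unique codepoints compared with a length of 1.
charlist_result =
  String.to_charlist(s) |> Enum.uniq() |> Enum.count() |> (&(&1 == String.length(s))).()

# Grapheme-level rewrite: 1 unique grapheme compared with a length of 1.
grapheme_result =
  String.graphemes(s) |> Enum.uniq() |> Enum.count() |> then(&(&1 == String.length(s)))

IO.inspect({codepoints, length(graphemes), length, charlist_result, grapheme_result})
```

For this input the codepoint-based check evaluates to `false` even though the string contains a single, trivially unique character, while the grapheme-based check evaluates to `true`.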
Under the single_codepoint_graphemes assumption (every grapheme is exactly
one codepoint), the two decompositions agree, so the rewrite is safe.
The fix also modernises the capture call idiom from (&expr).() to
then(&expr), which reads more naturally in a pipeline.
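The two call styles are interchangeable in behaviour, which a small comparison makes visible. This is an illustrative sketch only; `s`, `count`, `old_style`, and `new_style` are hypothetical names, and `then/2` is available from Elixir 1.12 onward.

```elixir
s = "abc"

# Number of distinct graphemes in the string.
count = String.graphemes(s) |> Enum.uniq() |> Enum.count()

# Older idiom: wrap the capture in parentheses and call it immediately.
old_style = count |> (&(&1 == String.length(s))).()

# Newer idiom: then/2 passes the piped value to the given function.
new_style = count |> then(&(&1 == String.length(s)))

IO.inspect({old_style, new_style})
```

Both expressions evaluate to the same boolean; the `then/2` form simply avoids the trailing `.()` call.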
Bad (only rewritten while single_codepoint_graphemes is on)
String.to_charlist(s) |> Enum.uniq() |> Enum.count() |> (&(&1 == String.length(s))).()
Good
String.graphemes(s) |> Enum.uniq() |> Enum.count() |> then(&(&1 == String.length(s)))