Unicode.Set.Operation (Unicode Set v1.2.0)

Functions to operate on Unicode sets:

  • Intersection
  • Difference
  • Union
  • Inversion

Summary

Functions

Combines all the ranges into a single list

Compacts overlapping and adjacent ranges.

Returns the complement (inverse) of a set.

Removes one list of 2-tuples representing Unicode codepoints from another.

Takes a reduced AST and expands it into a single list of codepoint tuples.

Expands string ranges like {ab}-{cd}.

Returns a boolean indicating whether the given AST includes set operations intersection or difference.

Returns the intersection of two lists of 2-tuples representing codepoint ranges.

Reduces all sets, properties and ranges to a list of 2-tuples expressing a range of codepoints.

Returns the symmetric difference of two lists of 2-tuples representing codepoint ranges.

Prewalks the expanded AST from a parsed Unicode Set, invoking a function on each codepoint range in the set.

Merges two lists of 2-tuples representing ranges of codepoints. The result is a single list of 2-tuple codepoint ranges that includes all codepoints from the two lists.

Functions

Combines all the ranges into a single list

This function is called if and only if the Unicode Sets are formed by unions only. If intersection or difference operations are present, the ranges must instead be expanded via expand/1.

Compacts overlapping and adjacent ranges.
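As an illustration of what compacting means, here is a minimal sketch rather than the library's implementation. The module name CompactSketch and the function compact/1 are hypothetical. It assumes ranges are {first, last} tuples of integer codepoints, sorts them, and folds each range into its predecessor when the two overlap or touch.

```elixir
defmodule CompactSketch do
  # Hypothetical helper, not part of Unicode.Set.Operation.
  # Sorts the ranges, then merges each range into the previous one
  # when they overlap or are directly adjacent.
  def compact(ranges) do
    ranges
    |> Enum.sort()
    |> Enum.reduce([], fn
      {first, last}, [{prev_first, prev_last} | rest] when first <= prev_last + 1 ->
        # Overlapping or adjacent: extend the previous range
        [{prev_first, max(prev_last, last)} | rest]

      range, acc ->
        # Disjoint: start a new range
        [range | acc]
    end)
    |> Enum.reverse()
  end
end

# a-f overlaps d-k, and k is adjacent to l, so three ranges collapse into a-p
CompactSketch.compact([{?a, ?f}, {?d, ?k}, {?l, ?p}, {?x, ?z}])
# => [{97, 112}, {120, 122}], that is [{?a, ?p}, {?x, ?z}]
```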

Returns the complement (inverse) of a set.
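The idea of a complement over codepoint ranges can be sketched as follows. This is an illustration under assumptions, not the library's code: the module ComplementSketch is hypothetical, the input is assumed to be sorted and non-overlapping, and the universe is taken to be every codepoint from 0 to 0x10FFFF (the library may treat surrogates or other blocks differently).

```elixir
defmodule ComplementSketch do
  # Assumption: the universe is the full Unicode codepoint space
  @max_codepoint 0x10FFFF

  # Hypothetical helper; `ranges` must be sorted and non-overlapping
  def complement(ranges), do: do_complement(ranges, 0)

  # Nothing left to emit once we have passed the last codepoint
  defp do_complement([], next) when next > @max_codepoint, do: []
  defp do_complement([], next), do: [{next, @max_codepoint}]

  # A gap exists before this range: emit it, then continue after the range
  defp do_complement([{first, last} | rest], next) when first > next do
    [{next, first - 1} | do_complement(rest, last + 1)]
  end

  # No gap before this range: skip past it
  defp do_complement([{_first, last} | rest], next) do
    do_complement(rest, max(next, last + 1))
  end
end

ComplementSketch.complement([{?a, ?z}])
# => [{0, 96}, {123, 1114111}]
```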

Removes one list of 2-tuples representing Unicode codepoints from another.

Returns the first list of codepoint ranges minus the codepoints in the second list.
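Removing one range list from another can be sketched as a single pass over both lists. The module DifferenceSketch below is hypothetical and only illustrates the idea; it assumes both lists are sorted and non-overlapping, which the source does not state for this function.

```elixir
defmodule DifferenceSketch do
  # Hypothetical helper; both lists must be sorted and non-overlapping
  def difference([], _that), do: []
  def difference(this, []), do: this

  # Head of `that` lies entirely before head of `this`: it removes nothing
  def difference([{a1, _} | _] = this, [{_, b2} | that_rest]) when b2 < a1,
    do: difference(this, that_rest)

  # Head of `this` lies entirely before head of `that`: keep it unchanged
  def difference([{_, a2} = head | this_rest], [{b1, _} | _] = that) when a2 < b1,
    do: [head | difference(this_rest, that)]

  # The two heads overlap
  def difference([{a1, a2} | this_rest], [{b1, b2} | _] = that) do
    # Keep the part of `this` that sticks out on the left, if any
    left = if a1 < b1, do: [{a1, b1 - 1}], else: []

    if a2 > b2 do
      # The part sticking out on the right may still be cut by later ranges
      left ++ difference([{b2 + 1, a2} | this_rest], that)
    else
      left ++ difference(this_rest, that)
    end
  end
end

# a-z minus d-f and x-z leaves a-c and g-w
DifferenceSketch.difference([{?a, ?z}], [{?d, ?f}, {?x, ?z}])
# => [{97, 99}, {103, 119}]
```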

Takes a reduced AST and expands it into a single list of codepoint tuples.


expand_string_range(arg1)


expand_string_ranges(ranges)


Expands string ranges like {ab}-{cd}.
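To make the notation concrete, the sketch below expands a string range position by position, assuming the cross-product interpretation in which {ab}-{cd} stands for every string whose first codepoint is in a..c and whose second is in b..d. The module StringRangeSketch is hypothetical, handles only strings with the same number of codepoints, and may differ from how expand_string_ranges/1 behaves. It uses the first..last//1 range syntax, which requires Elixir 1.12 or later.

```elixir
defmodule StringRangeSketch do
  # Hypothetical helper, not part of Unicode.Set.Operation.
  # Expands a range between two strings of equal codepoint length.
  def expand(from, to) do
    from_chars = String.to_charlist(from)
    to_chars = String.to_charlist(to)

    if length(from_chars) != length(to_chars) do
      raise ArgumentError, "this sketch only handles strings of equal length"
    end

    from_chars
    |> Enum.zip(to_chars)
    # Build every combination, one position at a time (prefixes are kept reversed)
    |> Enum.reduce([[]], fn {first, last}, prefixes ->
      for prefix <- prefixes, codepoint <- first..last//1, do: [codepoint | prefix]
    end)
    |> Enum.map(fn reversed -> reversed |> Enum.reverse() |> List.to_string() end)
  end
end

StringRangeSketch.expand("ab", "cd")
# => ["ab", "ac", "ad", "bb", "bc", "bd", "cb", "cc", "cd"]
```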


has_difference_or_intersection?(arg1)


Returns a boolean indicating whether the given AST includes set operations intersection or difference.

When these operations exist, all ranges, including ^ ranges, need to be expanded. If there are no intersections or differences, the ^ ranges can be translated directly to guard clauses or to a list of Elixir ranges.

Returns the intersection of two lists of 2-tuples representing codepoint ranges.

The result is a single list of codepoint ranges that represents the common codepoints in the two lists.
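A two-pointer walk is one common way to compute such an intersection. The sketch below is illustrative only: IntersectSketch is a hypothetical module, and it assumes both inputs are sorted and non-overlapping.

```elixir
defmodule IntersectSketch do
  # Hypothetical helper; both lists must be sorted and non-overlapping
  def intersect([], _that), do: []
  def intersect(_this, []), do: []

  def intersect([{a1, a2} | this_rest] = this, [{b1, b2} | that_rest] = that) do
    # The overlap of the two head ranges, if there is one
    first = max(a1, b1)
    last = min(a2, b2)
    overlap = if first <= last, do: [{first, last}], else: []

    # Advance whichever list's head range ends first
    if a2 < b2 do
      overlap ++ intersect(this_rest, that)
    else
      overlap ++ intersect(this, that_rest)
    end
  end
end

# a-m and p-z intersected with k-r gives k-m and p-r
IntersectSketch.intersect([{?a, ?m}, {?p, ?z}], [{?k, ?r}])
# => [{107, 109}, {112, 114}]
```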

Reduces all sets, properties and ranges to a list of 2-tuples expressing a range of codepoints.

It can return one of two forms:

[{:in, [tuple_list]}] for an inclusion list

[{:not_in, [tuple_list]}] for an exclusion list

or a combination of both.

Attempts are made to preserve :not_in clauses for as long as possible, since many consumers, such as regexes and nimble_parsec, can use :not_in style ranges directly.

When only single character classes are presented, or several classes which are unions, :not_in can be preserved.

When intersections and differences are required, the ranges must be both reduced and expanded in order for these set operations to complete.


symmetric_difference(this, that)


Returns the symmetric difference of two lists of 2-tuples representing codepoint ranges.

The result is a single list of codepoint ranges that represents the codepoints that are in either of the two lists but not both.
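One way to picture "in either list but not both" is as a toggle at each range boundary: a boundary shared by both lists toggles membership twice and so cancels out. The sketch below uses that idea. It is not the library's implementation; SymmetricDifferenceSketch is a hypothetical module, each input list is assumed to be non-overlapping, and Enum.frequencies/1 requires Elixir 1.10 or later.

```elixir
defmodule SymmetricDifferenceSketch do
  # Hypothetical helper; each input list must be non-overlapping
  def symmetric_difference(this, that) do
    (this ++ that)
    # Represent each range by its half-open boundaries: first and last + 1
    |> Enum.flat_map(fn {first, last} -> [first, last + 1] end)
    |> Enum.frequencies()
    # A boundary seen an even number of times toggles membership back, so it cancels
    |> Enum.filter(fn {_boundary, count} -> rem(count, 2) == 1 end)
    |> Enum.map(fn {boundary, _count} -> boundary end)
    |> Enum.sort()
    # Remaining boundaries alternate between "enter" and "leave"
    |> Enum.chunk_every(2)
    |> Enum.map(fn [first, past_last] -> {first, past_last - 1} end)
  end
end

# a-m and k-z share k-m, leaving a-j and n-z
SymmetricDifferenceSketch.symmetric_difference([{?a, ?m}], [{?k, ?z}])
# => [{97, 106}, {110, 122}]
```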

Prewalks the expanded AST from a parsed Unicode Set, invoking a function on each codepoint range in the set.


traverse(range, var, fun)


Merges two lists of 2-tuples representing ranges of codepoints. The result is a single list of 2-tuple codepoint ranges that includes all codepoints from the two lists.

It is assumed that both lists are sorted prior to merging.
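The sorted-input assumption is what allows a linear merge. The sketch below shows one possible shape: a standard merge of two sorted lists followed by a pass that joins overlapping or adjacent ranges. UnionSketch is a hypothetical module, and whether the library also joins adjacent ranges at this step is an assumption.

```elixir
defmodule UnionSketch do
  # Hypothetical helper; both lists must already be sorted
  def union(this, that), do: this |> merge(that) |> coalesce()

  # Standard merge step for two sorted lists (tuples compare element-wise)
  defp merge([], that), do: that
  defp merge(this, []), do: this
  defp merge([a | this_rest], [b | _] = that) when a <= b, do: [a | merge(this_rest, that)]
  defp merge(this, [b | that_rest]), do: [b | merge(this, that_rest)]

  # Join neighbouring ranges that overlap or touch
  defp coalesce([{a1, a2}, {b1, b2} | rest]) when b1 <= a2 + 1,
    do: coalesce([{a1, max(a2, b2)} | rest])

  defp coalesce([head | rest]), do: [head | coalesce(rest)]
  defp coalesce([]), do: []
end

UnionSketch.union([{?a, ?f}, {?x, ?z}], [{?d, ?k}, {?l, ?p}])
# => [{97, 112}, {120, 122}]
```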