Grammar.Tokenizer (Grammar v0.4.0)
This module extracts the tokens from the input string.
It is driven by the parser, which requests only the specific tokens expected at a given point.
Summary
Functions
Returns the current token found in the input, if any. Expected tokens are passed as a list of token prototypes.
Creates a new tokenizer for a given input.
Returns the current token found in the input, and consumes it. The expected token prototype is passed as second parameter.
Functions
Returns the current token found in the input, if any. Expected tokens are passed as a list of token prototypes.
The token is not consumed, so two successive calls to current_token/2 will return the same token.
Examples
iex> tokenizer = Grammar.Tokenizer.new("hello world")
%Grammar.Tokenizer{
  input: "hello world",
  current_line: 1,
  current_column: 1,
  drop_spaces?: true,
  sub_byte_matching?: false
}
iex> {{"hello", {1, 1}}, _} = Grammar.Tokenizer.current_token(tokenizer, ["hello"])
iex> {{"hello", {1, 1}}, _} = Grammar.Tokenizer.current_token(tokenizer, ["hello"])
iex> {{nil, {1, 1}}, _} = Grammar.Tokenizer.current_token(tokenizer, ["world"])
Creates a new tokenizer for a given input.
Parameters
- input: the bitstring from which tokens will be extracted.
- drop_spaces?: when set to false, the tokenizer will not drop spaces and newlines.
- sub_byte_matching?: when set to true, the tokenizer will match tokens at the bit level.
When bit-level matching is enabled, the tokenizer does not drop spaces and newlines, and errors are reported at line 1, with the column set to the current bit index (counted from the beginning of the input bitstring).
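As an illustration of what matching "at the bit level" means, here is a self-contained toy sketch, not the Grammar library's implementation (BitMatch and match/2 are invented names): a token matches when the input bitstring starts with the token's bits, and position is tracked as a bit offset rather than a line/column pair.

```elixir
# Toy bit-level prefix matching; NOT the Grammar library's code.
defmodule BitMatch do
  # Returns {token, rest} if `input` starts with the bits of `token`,
  # otherwise :nomatch. Both arguments may be arbitrary bitstrings.
  def match(input, token) when is_bitstring(input) and is_bitstring(token) do
    size = bit_size(token)

    case input do
      <<^token::bitstring-size(size), rest::bitstring>> -> {token, rest}
      _ -> :nomatch
    end
  end
end

# A 3-bit token matched at the start of an 8-bit input.
{<<0b101::3>>, rest} = BitMatch.match(<<0b10110011::8>>, <<0b101::3>>)
5 = bit_size(rest)
# After consuming 3 bits, the next match would be reported at bit index 4
# (1-based), on line 1, as described above.
```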
Examples
iex> Grammar.Tokenizer.new("this is the input")
%Grammar.Tokenizer{
  input: "this is the input",
  current_line: 1,
  current_column: 1,
  drop_spaces?: true,
  sub_byte_matching?: false
}
iex> Grammar.Tokenizer.new("this is the input", false)
%Grammar.Tokenizer{
  input: "this is the input",
  current_line: 1,
  current_column: 1,
  drop_spaces?: false,
  sub_byte_matching?: false
}
Returns the current token found in the input, and consumes it. The expected token prototype is passed as second parameter.
Examples
iex> tokenizer = Grammar.Tokenizer.new("hello world")
iex> {{"hello", {1, 1}}, tokenizer} = Grammar.Tokenizer.next_token(tokenizer, "hello")
iex> {{"world", {1, 7}}, _} = Grammar.Tokenizer.next_token(tokenizer, "world")
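The peek/consume pairing described above (current_token/2 peeks without consuming, next_token/2 consumes) can be illustrated with a self-contained toy. ToyTokenizer, peek/2, and consume/2 are invented names for this sketch, not the Grammar library's API:

```elixir
# Toy peek/consume tokenizer; NOT the Grammar library's implementation.
defmodule ToyTokenizer do
  defstruct input: "", current_line: 1, current_column: 1

  def new(input), do: %__MODULE__{input: input}

  # Peek: return the first matching prototype without consuming it.
  # Two successive peeks therefore return the same token.
  def peek(%__MODULE__{} = t, prototypes) do
    %{input: input, current_line: line, current_column: col} = drop_spaces(t)
    match = Enum.find(prototypes, &String.starts_with?(input, &1))
    {{match, {line, col}}, t}
  end

  # Consume: like peek, but advance past the matched token.
  def consume(%__MODULE__{} = t, prototype) do
    t = drop_spaces(t)
    {{match, pos}, _} = peek(t, [prototype])
    consumed = if match, do: byte_size(match), else: 0
    rest = binary_part(t.input, consumed, byte_size(t.input) - consumed)
    {{match, pos}, %{t | input: rest, current_column: t.current_column + consumed}}
  end

  defp drop_spaces(%__MODULE__{input: " " <> rest} = t),
    do: drop_spaces(%{t | input: rest, current_column: t.current_column + 1})

  defp drop_spaces(t), do: t
end

t = ToyTokenizer.new("hello world")
# Peeking twice yields the same token: nothing was consumed.
{{"hello", {1, 1}}, _} = ToyTokenizer.peek(t, ["hello"])
{{"hello", {1, 1}}, _} = ToyTokenizer.peek(t, ["hello"])
# Consuming advances the position, so "world" is then found at column 7.
{{"hello", {1, 1}}, t} = ToyTokenizer.consume(t, "hello")
{{"world", {1, 7}}, _} = ToyTokenizer.consume(t, "world")
```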