Renewex.Tokenizer (renewex v0.4.0)
This module implements a simple tokenizer that splits a string into the lexemes needed to parse Renew *.rnw files.
Renew *.rnw files are text files containing a human-readable serialization of Renew Java objects.
To read such a file, the text content must first be split into tokens (lexemes) that are similar to Java tokens. The original Renew Java implementation uses the Java tokenizer, but the Renew file format does not make use of all the tokens defined by the Java language. Hence this module defines only a subset of the Java syntax tokens.
*.rnw files contain an object graph. Each node starts with the name of a Java class. This class name determines how the following tokens are to be parsed. Primitive values such as integers, floats, or strings are the leaf nodes.
A *.rnw file may contain cyclic references. These are represented by REF <int> tokens, each of which refers to a previously parsed object.
The <int> is an index into the array of already parsed objects. For example, REF 5 points to the fifth object that has occurred while parsing the file.
A REF <int> token must not contain an integer that is larger than the number of already parsed objects, i.e., no forward references are possible.
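The reference rule above can be sketched as a small lookup. This is only an illustration: the module RefSketch and the function resolve_ref/2 are hypothetical names, not part of Renewex, and the sketch assumes the index is 1-based, as the "REF 5 points to the fifth object" example suggests.

```elixir
defmodule RefSketch do
  # Hypothetical helper, not part of Renewex.
  # Resolves the integer of a `REF <int>` token against the list of
  # objects parsed so far. Assumes 1-based indexing (REF 1 is the first object).
  def resolve_ref(parsed, index) when is_list(parsed) and is_integer(index) do
    if index >= 1 and index <= length(parsed) do
      {:ok, Enum.at(parsed, index - 1)}
    else
      # An index beyond the objects parsed so far would be a forward
      # reference, which the format does not allow.
      {:error, :invalid_ref}
    end
  end
end
```

With three parsed objects, RefSketch.resolve_ref([:a, :b, :c], 2) yields {:ok, :b}, while an index of 4 yields {:error, :invalid_ref}.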
Summary
Functions
Converts a given string into a given type.
Takes a list of tokens and removes all tokens that are regarded as whitespace.
Get the list of token types defined by the tokenizer.
Returns [:white, :float, :int, :boolean, :null, :ref, :class_name, :string], in the order determined by the capture groups in the compiled regex.
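To illustrate how the order of token types can follow the capture groups of a single compiled regex, here is a minimal sketch. It is not the Renewex implementation: the module name TokenizerSketch, its function names, and every pattern below are simplified assumptions chosen for the example.

```elixir
defmodule TokenizerSketch do
  # Hypothetical, simplified patterns; the real Renewex regex may differ.
  # The order matters: earlier alternatives win (float before int,
  # keywords before class names).
  @token_patterns [
    white: ~S"\s+",
    float: ~S"-?\d+\.\d+",
    int: ~S"-?\d+",
    boolean: ~S"(?:true|false)\b",
    null: ~S"null\b",
    ref: ~S"REF\s+\d+",
    class_name: ~S"[A-Za-z_][A-Za-z0-9_.]*",
    string: ~S/"(?:[^"\\]|\\.)*"/
  ]

  # Token types in the same order as the capture groups.
  def token_types, do: Keyword.keys(@token_patterns)

  # One capturing group per token type, joined as alternatives.
  defp regex do
    @token_patterns
    |> Enum.map(fn {_type, pattern} -> "(" <> pattern <> ")" end)
    |> Enum.join("|")
    |> Regex.compile!()
  end

  # Returns a list of {type, text} tuples. The type is looked up from the
  # position of the first non-empty capture group.
  def tokenize(input) when is_binary(input) do
    types = token_types()

    regex()
    |> Regex.scan(input)
    |> Enum.map(fn [_full | groups] ->
      index = Enum.find_index(groups, &(&1 != ""))
      {Enum.at(types, index), Enum.at(groups, index)}
    end)
  end

  # Drops all tokens tagged as whitespace.
  def skip_whitespace(tokens) do
    Enum.reject(tokens, fn {type, _text} -> type == :white end)
  end
end
```

In this sketch, tokenizing the text Foo.Bar "hi" 12 REF 2 and then dropping whitespace gives a :class_name, a :string, an :int, and a :ref token, in that order. Characters that match none of the patterns are silently skipped by Regex.scan/2, so a real tokenizer would need explicit error handling.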