tflite_beam_wordpiece_tokenizer (tflite_beam v0.3.8)
Runs WordPiece tokenization.
Functions
Tokenizes a piece of text into its word pieces.
This uses a greedy longest-match-first algorithm to perform tokenization using the given vocabulary.
For example:
Input = "unaffable".
Output = ["una", "##ffa", "##ble"].
Input = "unaffableX".
Output = ["[UNK]"].
Related link: https://github.com/tensorflow/examples/blob/master/lite/examples/bert_qa/ios/BertQACore/Models/Tokenizers/WordpieceTokenizer.swift
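For illustration, below is a minimal Elixir sketch of the greedy longest-match-first algorithm described above. It is not the tflite_beam implementation; the module and function names are hypothetical, and the vocabulary is assumed to be a MapSet of known word pieces.

defmodule WordpieceSketch do
  # Illustrative greedy longest-match-first WordPiece tokenizer.
  # A sketch only, not the tflite_beam implementation; `vocab` is
  # assumed to be a MapSet of known word pieces.

  @unk "[UNK]"

  # Tokenizes a single word into its word pieces.
  def tokenize(word, vocab) do
    do_tokenize(String.graphemes(word), vocab, [])
  end

  defp do_tokenize([], _vocab, acc), do: Enum.reverse(acc)

  defp do_tokenize(graphemes, vocab, acc) do
    # Pieces after the first one carry the "##" continuation prefix.
    prefix = if acc == [], do: "", else: "##"

    case longest_match(graphemes, prefix, vocab) do
      # If any remaining part of the word cannot be matched,
      # the whole word maps to the unknown token.
      nil -> [@unk]
      {piece, rest} -> do_tokenize(rest, vocab, [piece | acc])
    end
  end

  # Greedy longest-match-first: start with the longest candidate and
  # shrink from the right until a vocabulary entry is found.
  defp longest_match(graphemes, prefix, vocab) do
    Enum.find_value(length(graphemes)..1//-1, fn n ->
      piece = prefix <> Enum.join(Enum.take(graphemes, n))
      if MapSet.member?(vocab, piece), do: {piece, Enum.drop(graphemes, n)}
    end)
  end
end

With a vocabulary containing "una", "##ffa", and "##ble", this sketch reproduces the example above: "unaffable" yields ["una", "##ffa", "##ble"], while "unaffableX" yields ["[UNK]"] because the trailing "X" cannot be matched and the whole word falls back to the unknown token.

vocab = MapSet.new(["una", "##ffa", "##ble"])
WordpieceSketch.tokenize("unaffable", vocab)
#=> ["una", "##ffa", "##ble"]
WordpieceSketch.tokenize("unaffableX", vocab)
#=> ["[UNK]"]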