Lua.VM.Stdlib.Utf8 (Lua v1.0.0-rc.2)
Lua 5.3 utf8 standard library (§6.5).
Operates over byte strings. Lua strings have no Unicode awareness of their
own, so this library treats the bytes as a UTF-8 encoded sequence and
validates decoded codepoints against the Unicode range [0, 0x10FFFF] (the
BMP plus the supplementary planes). Overlong encodings (e.g. \xC0\x80 for
U+0000), continuation bytes appearing in the lead position, and codepoints
above 0x10FFFF all surface as "invalid UTF-8 code".
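The three rejected cases can be sketched from the Lua side. This is a minimal illustration that assumes this implementation follows reference Lua 5.3 semantics, where utf8.len reports a bad sequence by returning nil plus a byte position and utf8.codepoint raises an error instead. The sample strings are made up for the example.

```lua
-- Sketch only: assumes reference Lua 5.3 behaviour for utf8.len and
-- utf8.codepoint; the input strings are illustrative.

-- Valid input: "h", U+00E9 (two bytes: C3 A9), "l", "l", "o".
assert(utf8.len("h\xC3\xA9llo") == 5)

-- Overlong encoding of U+0000: failure is reported at byte 2.
local n, pos = utf8.len("a\xC0\x80")
assert(n == nil and pos == 2)

-- Continuation byte in the lead position: failure at byte 1.
n, pos = utf8.len("\x80abc")
assert(n == nil and pos == 1)

-- F4 90 80 80 would decode to 0x110000, one past the allowed range.
n, pos = utf8.len("\xF4\x90\x80\x80")
assert(n == nil and pos == 1)

-- utf8.codepoint raises instead of returning nil; pcall captures the message.
local ok, err = pcall(utf8.codepoint, "\xC0\x80")
assert(not ok)
assert(string.find(tostring(err), "invalid UTF-8 code", 1, true) ~= nil)
```

Using utf8.len as a validity check before calling utf8.codepoint or utf8.codes avoids the need for pcall when the input comes from an untrusted source.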
Functions
- utf8.char(...): codepoints to UTF-8 string
- utf8.codepoint(s [, i [, j]]): UTF-8 string slice to codepoints
- utf8.codes(s): stateless (byte_pos, codepoint) iterator
- utf8.len(s [, i [, j]]): codepoint count, or nil, byte_pos on the first invalid sequence in the slice
- utf8.offset(s, n [, i]): byte position of the n-th codepoint
- utf8.charpattern: Lua pattern matching one UTF-8 byte sequence
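A short tour of the entries listed above, as a hedged sketch: it assumes the behaviour matches reference Lua 5.3, and the three sample codepoints are chosen only because they encode to 1, 2, and 3 bytes.

```lua
-- Sketch only: assumes reference Lua 5.3 semantics; sample data is illustrative.

-- utf8.char: codepoints to a UTF-8 string ("h", U+00E9, U+20AC).
local s = utf8.char(104, 233, 8364)
assert(s == "h\xC3\xA9\xE2\x82\xAC")
assert(#s == 6)                         -- 1 + 2 + 3 bytes

-- utf8.len: number of codepoints, not bytes.
assert(utf8.len(s) == 3)

-- utf8.codepoint: i and j are byte positions; -1 means the last byte.
local a, b, c = utf8.codepoint(s, 1, -1)
assert(a == 104 and b == 233 and c == 8364)

-- utf8.offset: byte position where the 3rd codepoint starts.
assert(utf8.offset(s, 3) == 4)

-- utf8.codes: iterate (byte_pos, codepoint) pairs.
local seen = {}
for pos, cp in utf8.codes(s) do
  seen[#seen + 1] = pos .. ":" .. cp
end
assert(table.concat(seen, " ") == "1:104 2:233 4:8364")

-- utf8.charpattern: split a valid string into one substring per codepoint.
local chars = {}
for ch in string.gmatch(s, utf8.charpattern) do
  chars[#chars + 1] = ch
end
assert(#chars == 3 and chars[3] == "\xE2\x82\xAC")
```

Note that utf8.charpattern only matches correctly when the subject is already valid UTF-8, so it is best paired with a prior utf8.len check.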