fmt_utf8(3) | Library Functions Manual | fmt_utf8(3) |
fmt_utf8 - encode 31-bit unsigned integer using UTF-8 rules
#include <fmt.h>
size_t fmt_utf8(char *dest,uint32_t source);
fmt_utf8 encodes a 31-bit unsigned integer using the UTF-8 rules. The output takes from 1 byte (0-0x7f) up to 6 bytes (0x4000000-0x7fffffff). Values larger than 0x7fffffff cannot be represented in this encoding.
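The byte counts follow from the payload bits each sequence length can carry: 7, 11, 16, 21, 26 and 31 bits for 1 to 6 bytes. As a minimal sketch of those ranges (utf8_len_sketch is a hypothetical helper, not part of fmt.h, and returning 0 for oversized input is a choice made here, not documented library behavior):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of fmt.h: number of bytes the
 * UTF-8 rules need for a value of up to 31 bits.
 * Returns 0 if the value does not fit in 31 bits. */
static size_t utf8_len_sketch(uint32_t v) {
  if (v <= 0x7f)       return 1;  /*  7 payload bits */
  if (v <= 0x7ff)      return 2;  /* 11 payload bits */
  if (v <= 0xffff)     return 3;  /* 16 payload bits */
  if (v <= 0x1fffff)   return 4;  /* 21 payload bits */
  if (v <= 0x3ffffff)  return 5;  /* 26 payload bits */
  if (v <= 0x7fffffff) return 6;  /* 31 payload bits */
  return 0;                       /* not representable */
}
```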
fmt_utf8 returns the number of bytes written. If dest equals FMT_LEN (i.e. is NULL), fmt_utf8 writes nothing and returns the number of bytes it would have written.
For convenience, fmt.h defines the integer constant FMT_UTF8 as a buffer size big enough to hold every possible fmt_utf8 output.
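The length-query convention allows a two-pass pattern: ask for the sizes first, allocate, then write. The sketch below is self-contained and therefore uses a hypothetical stand-in, utf8_put_sketch, that follows the calling convention described here (NULL dest means "count only"). It is not libowfat's implementation, and its return value of 0 for values above 0x7fffffff is an assumption of the sketch. With the real library one would call fmt_utf8 and pass FMT_LEN instead of NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for fmt_utf8 (NOT the library code):
 * writes to dest unless dest is NULL, returns the byte count,
 * or 0 if v does not fit in 31 bits. */
static size_t utf8_put_sketch(char *dest, uint32_t v) {
  size_t n, i;
  if (v <= 0x7f) { if (dest) dest[0] = (char)v; return 1; }
  if (v <= 0x7ff) n = 2;
  else if (v <= 0xffff) n = 3;
  else if (v <= 0x1fffff) n = 4;
  else if (v <= 0x3ffffff) n = 5;
  else if (v <= 0x7fffffff) n = 6;
  else return 0;
  if (dest) {
    /* Continuation bytes, last to first: 10xxxxxx, 6 bits each. */
    for (i = n - 1; i > 0; --i) {
      dest[i] = (char)(unsigned char)(0x80 | (v & 0x3f));
      v >>= 6;
    }
    /* Lead byte: n one-bits, a zero bit, then the remaining bits. */
    dest[0] = (char)(unsigned char)(((0xff00u >> n) & 0xff) | v);
  }
  return n;
}

/* Encode count values into a freshly allocated buffer.
 * Pass 1 only measures (dest == NULL); pass 2 writes.
 * The caller frees the result. */
static char *encode_all_sketch(const uint32_t *vals, size_t count,
                               size_t *outlen) {
  size_t i, total = 0, pos = 0;
  char *buf;
  for (i = 0; i < count; ++i)
    total += utf8_put_sketch(NULL, vals[i]);
  buf = malloc(total ? total : 1);
  if (!buf) return NULL;
  for (i = 0; i < count; ++i)
    pos += utf8_put_sketch(buf + pos, vals[i]);
  assert(pos == total);  /* both passes must agree on the length */
  *outlen = total;
  return buf;
}
```

When only a single value is encoded, a fixed buffer declared as char buf[FMT_UTF8] avoids the measuring pass entirely.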
fmt_utf8 and scan_utf8 implement the encoding from UTF-8, but are meant to be able to store integers, not just Unicode code points. Values larger than 0x10ffff are not valid UTF-8 (see RFC 3629) but can be represented in the encoding, so fmt_utf8 will allow them.