Correspondence table between Unicode codes and UTF-8 encoding

zhaozj, 2021-02-16

The table below summarizes the format of these different octet types. The letter x indicates bits available for encoding bits of the character number.

Char. number range  | UTF-8 octet sequence
(hexadecimal)       | (binary)
--------------------+------------------------------------
0000 0000-0000 007F | 0xxxxxxx
0000 0080-0000 07FF | 110xxxxx 10xxxxxx
0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
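The four rows of the table can be turned into code fairly directly. The following Python function is only a minimal sketch and not the code this post refers to; the name utf8_encode is made up for illustration. It picks a row by comparing the character number with each upper bound, then fills the x positions with the character's bits, highest bits first. The check that rejects D800-DFFF comes from the UTF-8 definition (RFC 3629), not from the table itself.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode a single Unicode character number (code point) as UTF-8.

    Hypothetical helper written for illustration; each branch matches
    one row of the table.
    """
    if cp < 0 or cp > 0x10FFFF:
        raise ValueError("character number out of range")
    if 0xD800 <= cp <= 0xDFFF:
        # Surrogates are excluded by the UTF-8 definition (RFC 3629).
        raise ValueError("surrogate code points cannot be encoded")

    if cp <= 0x7F:
        # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


if __name__ == "__main__":
    # Compare against Python's built-in encoder for one character per row.
    for ch in ["A", "é", "中", "😀"]:
        assert utf8_encode(ord(ch)) == ch.encode("utf-8")
    print(utf8_encode(0x4E2D).hex(" "))  # e4 b8 ad
```

In real programs the built-in str.encode("utf-8") does the same job; the manual version is useful mainly for seeing how the bits are distributed across the octets.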

This is a correspondence table between Unicode codes and UTF-8 encodings. The Chinese characters in common use fall in the range 0000 0800-0000 FFFF, which is encoded as three octets, so that is the only conversion I implemented in my code.
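For the three-octet range that covers common Chinese characters, the reverse conversion (UTF-8 back to the Unicode code) can be sketched as follows. This is an illustrative Python example under my own assumptions, not the code mentioned above; the function name decode_three_octets is hypothetical. It checks the fixed marker bits 1110 and 10, then joins the 4 + 6 + 6 payload bits.

```python
def decode_three_octets(data: bytes) -> int:
    """Decode one three-octet UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx).

    Hypothetical helper for illustration; it handles only the
    0000 0800-0000 FFFF row of the table.
    """
    if len(data) != 3:
        raise ValueError("expected exactly three octets")
    b0, b1, b2 = data
    # Check the fixed marker bits of each octet.
    if (b0 & 0xF0) != 0xE0 or (b1 & 0xC0) != 0x80 or (b2 & 0xC0) != 0x80:
        raise ValueError("not a three-octet UTF-8 sequence")
    # Join the payload bits: 4 from the first octet, 6 from each of the others.
    cp = ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)
    if cp < 0x0800:
        # Values below 0800 belong to the shorter rows of the table.
        raise ValueError("overlong encoding")
    return cp


if __name__ == "__main__":
    octets = "中".encode("utf-8")            # b'\xe4\xb8\xad'
    print(hex(decode_three_octets(octets)))  # 0x4e2d
```

Characters outside this range, such as ASCII letters or the rarer ideographs above FFFF, use the other rows of the table and are rejected by this sketch.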

Please credit the original address when reposting: https://www.9cbs.com/read-14083.html
