Sunday, June 03, 2007
Ienup Sung has posted a very useful blog entry (http://blogs.sun.com/is/entry/secure_utf_8) about secure UTF-8 byte sequences. The table below is copied directly from his blog.
| Unicode Scalar Values in Binary | Hex Min | Hex Max | 1st Byte | 2nd Byte | 3rd Byte | 4th Byte |
|---|---|---|---|---|---|---|
| 00000000 00000000 0xxxxxxx | U+0000 | U+007F | 00..7F | | | |
| 00000000 00000yyy yyxxxxxx | U+0080 | U+07FF | C2..DF | 80..BF | | |
| 00000000 zzzzyyyy yyxxxxxx | U+0800 | U+0FFF | E0 | A0..BF | 80..BF | |
| | U+1000 | U+CFFF | E1..EC | 80..BF | 80..BF | |
| | U+D000 | U+D7FF | ED | 80..9F | 80..BF | |
| | U+D800 | U+DFFF | ill-formed | | | |
| | U+E000 | U+FFFF | EE..EF | 80..BF | 80..BF | |
| 000uuuuu zzzzyyyy yyxxxxxx | U+10000 | U+3FFFF | F0 | 90..BF | 80..BF | 80..BF |
| | U+40000 | U+FFFFF | F1..F3 | 80..BF | 80..BF | 80..BF |
| | U+100000 | U+10FFFF | F4 | 80..8F | 80..BF | 80..BF |
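
To see how the table can be used in practice, here is a minimal sketch of a byte-level checker in Python. It is not taken from Ienup Sung's blog; the names `_RANGES` and `is_well_formed_utf8` are my own, and the ranges are simply transcribed from the rows above. A lead byte selects a row, and each following byte must fall inside that row's range.

```python
# Each entry: (lead-byte range, [allowed range for each trailing byte]).
# Transcribed from the table; the names here are illustrative only.
_RANGES = [
    ((0x00, 0x7F), []),
    ((0xC2, 0xDF), [(0x80, 0xBF)]),
    ((0xE0, 0xE0), [(0xA0, 0xBF), (0x80, 0xBF)]),
    ((0xE1, 0xEC), [(0x80, 0xBF), (0x80, 0xBF)]),
    ((0xED, 0xED), [(0x80, 0x9F), (0x80, 0xBF)]),
    ((0xEE, 0xEF), [(0x80, 0xBF), (0x80, 0xBF)]),
    ((0xF0, 0xF0), [(0x90, 0xBF), (0x80, 0xBF), (0x80, 0xBF)]),
    ((0xF1, 0xF3), [(0x80, 0xBF), (0x80, 0xBF), (0x80, 0xBF)]),
    ((0xF4, 0xF4), [(0x80, 0x8F), (0x80, 0xBF), (0x80, 0xBF)]),
]


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if every sequence in data matches a row of the table."""
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        # Find the table row whose 1st-byte range contains the lead byte.
        for (lo, hi), tails in _RANGES:
            if lo <= lead <= hi:
                break
        else:
            # 80..C1 and F5..FF never start a well-formed sequence.
            return False
        if i + 1 + len(tails) > n:
            return False  # sequence is cut off at the end of the input
        # Check each trailing byte against its column's range.
        for k, (tlo, thi) in enumerate(tails, start=1):
            if not tlo <= data[i + k] <= thi:
                return False
        i += 1 + len(tails)
    return True
```

With this sketch, `is_well_formed_utf8(b"\xc0\x80")` (an overlong encoding of U+0000) and `is_well_formed_utf8(b"\xed\xa0\x80")` (the surrogate U+D800) both return `False`, because C0 is not a valid first byte and A0 falls outside the 80..9F range allowed after ED.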