UTF-8
- Popular for HTML, XML, and similar text-based formats and protocols
- Variable width (one to four bytes per character)
- "ASCII-compatible": Unicode characters corresponding to the ASCII set have the same byte values as ASCII.
- Most commonly supported Unicode encoding
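The variable width and ASCII compatibility of UTF-8 can be checked with a short Python sketch. The sample characters and the helper name `utf8_length` are chosen here for illustration only; they are not part of the original text.

```python
# Illustrative sample characters (an assumption of this sketch):
# U+0041 LATIN CAPITAL LETTER A, U+00E9 LATIN SMALL LETTER E WITH ACUTE,
# U+20AC EURO SIGN, U+1F600 GRINNING FACE.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_length(ch):
    """Return the number of bytes UTF-8 uses to encode a single character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s): {ch.encode('utf-8').hex()}")

# ASCII compatibility: pure-ASCII text produces identical bytes in both encodings.
ascii_text = "Hello"
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")
print("ASCII-compatible:", same_bytes)
```

The four samples cover the one-, two-, three-, and four-byte cases respectively.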
UTF-16
- Balances efficient access to characters against economical use of storage
- Variable width: commonly used characters (those in the Basic Multilingual Plane) fit into a single 16-bit code unit, while all other characters are encoded as pairs of 16-bit code units (surrogate pairs).
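As a rough illustration of the single-unit versus paired-unit cases, the following Python sketch splits a character's UTF-16 encoding into its 16-bit code units. The helper name `utf16_code_units` and the two sample characters are assumptions made for this example.

```python
import struct


def utf16_code_units(ch):
    """Return the 16-bit code units of `ch` in UTF-16 (big-endian, no BOM)."""
    data = ch.encode("utf-16-be")
    # Each code unit is an unsigned 16-bit integer ("H"), read big-endian (">").
    return list(struct.unpack(f">{len(data) // 2}H", data))


bmp_units = utf16_code_units("\u20ac")          # EURO SIGN, inside the BMP
pair_units = utf16_code_units("\U0001F600")     # GRINNING FACE, outside the BMP

print([f"{u:04X}" for u in bmp_units])    # one code unit
print([f"{u:04X}" for u in pair_units])   # two code units (a surrogate pair)
```

The first unit of a pair falls in the range D800 to DBFF and the second in DC00 to DFFF, which is how a decoder tells paired units apart from ordinary single-unit characters.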
UTF-32
- Popular where memory space is of little concern but fixed-width, single-code-unit access to characters is desired
- Fixed width
- Each character is encoded in a single 32-bit code unit.
All three encoding forms need at most 4 bytes (or 32 bits) of data for each character.
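The 4-byte upper bound can be spot-checked with a small Python sketch that encodes a handful of characters in all three forms. The helper name `encoded_sizes` and the sample set are illustrative assumptions; the explicit little-endian codec names are used only so that Python does not prepend a byte order mark, which would inflate the counts.

```python
# Explicit byte order ("-le") keeps Python from adding a byte order mark.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def encoded_sizes(ch):
    """Map each encoding name to the number of bytes it uses for `ch`."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}


# Samples range from ASCII up to U+10FFFF, the highest Unicode code point.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600", "\U0010FFFF"):
    sizes = encoded_sizes(ch)
    print(f"U+{ord(ch):06X}", sizes)
```

UTF-32 always reports 4 bytes, UTF-16 reports 2 or 4, and UTF-8 reports anywhere from 1 to 4, so none of the three exceeds 4 bytes per character.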