#1,000 – UTF-8 and ASCII
December 24, 2013
In Unicode, the code point for each ASCII character is equal to that character's ASCII code.
This holds for all 128 ASCII codes (0 through 127).
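As a quick check of this claim, here is a minimal sketch in Python (chosen only for illustration; the helper name `ascii_matches_unicode` is hypothetical). It decodes each ASCII code and compares the resulting character's Unicode code point to the original code.

```python
def ascii_matches_unicode() -> bool:
    """Return True if every ASCII code 0..127 equals its Unicode code point."""
    for code in range(128):
        # Decode the single byte as ASCII to get the corresponding character.
        ch = bytes([code]).decode("ascii")
        # ord() returns the character's Unicode code point.
        if ord(ch) != code:
            return False
    return True


print(ascii_matches_unicode())
print(ord("A"), hex(ord("A")))  # ASCII code for 'A' is 65 (0x41)
```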
UTF-8 encodes each of these first 128 Unicode code points as a single byte whose value is the code point itself. Because of this:
Characters included in the ASCII character set that are present in a stream of UTF-8 encoded character data will appear the same as if they were encoded as ASCII.
This means that a UTF-8 encoded stream of ASCII characters will be identical, byte for byte, to an ASCII-encoded stream of the same characters. That is, for text containing only ASCII characters, which covers most plain English text, UTF-8 is identical to ASCII.
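The byte-level equivalence can be sketched as follows in Python (used here only for illustration; `utf8_equals_ascii` is a hypothetical helper, not part of any library). It encodes the same ASCII-only text both ways and compares the bytes, then shows that a non-ASCII character needs more than one byte in UTF-8.

```python
def utf8_equals_ascii(text: str) -> bool:
    """Compare the UTF-8 and ASCII encodings of ASCII-only text.

    Assumes text contains only ASCII characters; otherwise
    encode("ascii") raises UnicodeEncodeError.
    """
    return text.encode("utf-8") == text.encode("ascii")


sample = "Hello, World!"
print(utf8_equals_ascii(sample))

# All 128 ASCII characters encode in UTF-8 to the bytes 0..127, one byte each.
all_ascii = "".join(chr(c) for c in range(128))
print(all_ascii.encode("utf-8") == bytes(range(128)))

# A character outside ASCII (U+00E9, e with acute accent) takes two bytes.
print(len("\u00e9".encode("utf-8")))
```

Only characters outside the ASCII range produce multi-byte sequences, so a decoder that expects ASCII can read a UTF-8 stream unchanged as long as the stream contains nothing else.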