#995 – Unicode Code Points

December 13, 2013

Unicode maps each character to a code point, a numeric value that represents that character. Code points range from 0 to 10FFFF (hex), for a total of 1,114,112 possible values.
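As a quick illustration of the character-to-code-point mapping, here is a minimal sketch in Python (chosen here only for brevity). The helper names `code_point` and `format_code_point` are made up for this example; `ord` and `chr` are Python built-ins.

```python
# Map characters to their code points and back using Python built-ins.

MAX_CODE_POINT = 0x10FFFF
TOTAL_CODE_POINTS = MAX_CODE_POINT + 1  # 0 through 10FFFF inclusive


def code_point(ch: str) -> int:
    """Return the Unicode code point of a single character."""
    return ord(ch)


def format_code_point(cp: int) -> str:
    """Format a code point in the usual U+XXXX notation (at least 4 hex digits)."""
    return f"U+{cp:04X}"


# Escapes are used so the sample characters are unambiguous:
# Latin A, e with acute accent, Hiragana A, musical G clef.
for ch in ["A", "\u00E9", "\u3042", "\U0001D11E"]:
    print(ch, format_code_point(code_point(ch)))

# chr() goes the other way, from code point to character.
print(chr(0x41))
print(f"{TOTAL_CODE_POINTS:,} possible code points")
```

The last sample character has a code point above FFFF, so it needs five hex digits.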

The full range is divided into 17 planes of 65,536 code points each. A plane's number is the value of the code point's upper bits, above the low 16: when a code point is written as six hex digits, the first two digits give the plane. For example:

Plane 0: 000000 – 00FFFF
Plane 1: 010000 – 01FFFF
etc.
Plane 16: 100000 – 10FFFF
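The plane ranges above follow directly from the arithmetic: dropping the low 16 bits of a code point leaves its plane number. A small sketch in Python, where `plane_of` and `plane_range` are hypothetical helper names used only for this example:

```python
MAX_CODE_POINT = 0x10FFFF
PLANE_SIZE = 0x10000  # 65,536 code points per plane


def plane_of(cp: int) -> int:
    """Return the plane (0-16) that a code point belongs to."""
    if not 0 <= cp <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    return cp >> 16  # same as cp // PLANE_SIZE


def plane_range(plane: int) -> tuple[int, int]:
    """Return the first and last code point of a plane."""
    if not 0 <= plane <= 16:
        raise ValueError(f"not a Unicode plane: {plane}")
    start = plane * PLANE_SIZE
    return start, start + PLANE_SIZE - 1


# Print the first and last code point of the same planes listed above.
for plane in (0, 1, 16):
    first, last = plane_range(plane)
    print(f"Plane {plane}: {first:06X} - {last:06X}")

print(plane_of(0x0041))   # Latin capital A
print(plane_of(0x1D11E))  # musical G clef
```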

Plane 0, also known as the Basic Multilingual Plane (BMP), contains code points for characters in most of the world's modern writing systems. This includes Latin alphabets, as well as Asian scripts such as CJK ideographs, Katakana and Hiragana.

Other planes contain less common characters. For example, plane 1, known as the Supplementary Multilingual Plane (SMP), contains characters from ancient scripts, hieroglyphics, and musical symbols.
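To make the split between plane 0 and plane 1 concrete, the sketch below looks up a few sample characters with Python's standard `unicodedata` module and reports which plane each falls in. The sample characters and the `describe` helper are chosen for this example only.

```python
import unicodedata


def describe(ch: str) -> tuple[int, str]:
    """Return (plane, official Unicode name) for a single character."""
    return ord(ch) >> 16, unicodedata.name(ch)


samples = [
    "\u0041",      # Latin letter
    "\u4E2D",      # CJK ideograph
    "\u30AB",      # Katakana
    "\u3042",      # Hiragana
    "\U00013000",  # Egyptian hieroglyph
    "\U0001D11E",  # musical symbol
]

for ch in samples:
    plane, name = describe(ch)
    print(f"U+{ord(ch):04X}  plane {plane}  {name}")
```

The first four samples land in plane 0 and the last two in plane 1.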

The Unicode standard currently defines a mapping for only about 10% of the possible code point values.
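One rough way to check this fraction yourself is to count the code points that have a general category other than unassigned, private use, or surrogate. The sketch below uses Python's `unicodedata` module, so the result reflects whatever Unicode version your Python build ships with, which may be newer than the version current when this post was written. Treating those three categories as "not defined" is an assumption made for this example.

```python
import sys
import unicodedata

# General categories treated here as "no character defined":
# Cn = unassigned, Co = private use, Cs = surrogate.
NOT_A_DEFINED_CHARACTER = {"Cn", "Co", "Cs"}


def count_defined_code_points() -> int:
    """Count code points that map to a defined character."""
    return sum(
        1
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) not in NOT_A_DEFINED_CHARACTER
    )


total = sys.maxunicode + 1  # 1,114,112 on Python 3.3 and later
defined = count_defined_code_points()
print(
    f"Unicode {unicodedata.unidata_version}: "
    f"{defined:,} of {total:,} code points defined ({defined / total:.1%})"
)
```

Later Unicode versions add characters, so expect a somewhat higher percentage on a recent Python.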

Filed under Basics Tagged with Basics, C#, Code Points, Unicode

About Sean
Software developer in the Twin Cities area, passionate about software development and sailing.

2,000 Things You Should Know About C#

Sean Sexton
