#995 – Unicode Code Points

Unicode maps characters into their corresponding code points, i.e. a numeric value that represents that character.  The full set of possible Unicode values ranges from 0 to 10FFFF (hex), for a total of 1,114,112 possible values.
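As a quick illustration, here is a minimal Python sketch of the character-to-code-point mapping and the size of the code space. The helper name `describe` is hypothetical, chosen only for this example.

```python
MAX_CODE_POINT = 0x10FFFF  # highest possible Unicode code point

# Code points run from 0 through 0x10FFFF inclusive
TOTAL_CODE_POINTS = MAX_CODE_POINT + 1


def describe(ch: str) -> str:
    """Return the code point of a single character in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


print(describe("A"))       # U+0041
print(describe("€"))       # U+20AC
print(TOTAL_CODE_POINTS)   # 1114112
```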

The full set of possible Unicode values is divided into 17 sets of 65,536 values, known as planes.  Each plane is identified numerically by the upper 5 bits of the 21-bit code point, i.e. the first two digits when the code point is written as six hex digits.  For example:

  • Plane 0: 000000 – 00FFFF
  • Plane 1: 010000 – 01FFFF
  • etc.
  • Plane 16: 100000 – 10FFFF
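The plane layout above can be sketched in a few lines of Python. Since each plane holds 65,536 (2^16) values, shifting a code point right by 16 bits yields its plane number. The function names `plane_of` and `plane_range` are hypothetical, made up for this example.

```python
from typing import Tuple


def plane_of(code_point: int) -> int:
    """Return the plane (0-16) that a code point belongs to."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a valid Unicode code point")
    # Dropping the low 16 bits leaves the plane number
    return code_point >> 16


def plane_range(plane: int) -> Tuple[int, int]:
    """Return the first and last code point of a plane."""
    if not 0 <= plane <= 16:
        raise ValueError("plane must be between 0 and 16")
    start = plane << 16
    return start, start + 0xFFFF


print(plane_of(ord("A")))    # 0
print(plane_of(0x1D11E))     # 1 (MUSICAL SYMBOL G CLEF)
print(plane_of(0x10FFFF))    # 16

first, last = plane_range(1)
print(f"{first:06X} - {last:06X}")   # 010000 - 01FFFF
```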

Plane 0, also known as the Basic Multilingual Plane (BMP), contains code points representing characters in most of the world’s writing systems in modern use.  This includes Latin alphabets, as well as Asian scripts such as the CJK ideographs, Katakana and Hiragana.

Other planes contain less common characters.  For example, plane 1, known as the Supplementary Multilingual Plane (SMP), contains ancient scripts, hieroglyphs, and musical symbols.

The Unicode standard has so far assigned characters to only a small fraction of the possible code point values, about 10% at the time of writing.
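One rough way to check this figure is to count the code points that Python's `unicodedata` module reports as assigned. This is only a sketch under stated assumptions: "assigned" is taken here to mean any general category other than Cn (unassigned), Co (private use) or Cs (surrogate), and the result depends on the Unicode version bundled with your Python build, so it may differ from the figure quoted above.

```python
import unicodedata

# Assumption: these general categories are treated as "no character mapped"
EXCLUDED_CATEGORIES = {"Cn", "Co", "Cs"}


def count_assigned() -> int:
    """Count code points that have a character assigned to them."""
    return sum(
        1
        for cp in range(0x110000)
        if unicodedata.category(chr(cp)) not in EXCLUDED_CATEGORIES
    )


assigned = count_assigned()
total = 0x110000

print("Unicode version:", unicodedata.unidata_version)
print(f"{assigned} of {total} code points assigned ({assigned / total:.1%})")
```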

About Sean
Software developer in the Twin Cities area, passionate about .NET technologies. Equally passionate about my own personal projects related to family history and preservation of family stories and photos.
