#994 – Unicode Basics

To store alphabetic or other written characters on a computer system, whether in memory or on disk, we need to encode each character as a numeric value that represents it.  These numeric values are, ultimately, just bit patterns, where each bit pattern represents some character.
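
As a rough illustration of that mapping, the sketch below turns each character of a short string into a number and then into a bit pattern. The sample string and the variable names are made up for this example, and Python's built-in `ord` happens to return the Unicode value of a character, which for these three characters fits in 8 bits.

```python
# Hypothetical sample text; any short ASCII string would work here.
text = "Hi!"

# For each character, record (character, numeric value, 8-bit pattern).
encoded = [(ch, ord(ch), format(ord(ch), "08b")) for ch in text]

for ch, value, bits in encoded:
    print(ch, value, bits)

# Prints:
# H 72 01001000
# i 105 01101001
# ! 33 00100001
```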

Unicode is a standard that specifies methods for encoding characters from all of the world's written languages.  It has room for more than 1 million unique characters.  Unicode is the default character set for HTML and XML, and it is used for storing character data on most modern operating systems (e.g. Windows, OS X, Unix).
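
The "more than 1 million" figure can be checked with a little arithmetic. The sketch below assumes the usual description of the Unicode range as 17 planes of 65,536 values each; the constant names are invented for this example, and not every value in the range is available for assignment to a character.

```python
import sys

# Assumed layout: 17 planes, each holding 0x10000 (65,536) values.
PLANES = 17
VALUES_PER_PLANE = 0x10000

total_values = PLANES * VALUES_PER_PLANE
print(total_values)  # 1114112

# Python 3 reports the largest value it supports as sys.maxunicode.
print(hex(sys.maxunicode))  # 0x10ffff
```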

The Unicode standard defines several character encodings.  The most common are:

  • UTF-8 – Uses a variable number of bytes per character, from 1 to 4.  ASCII characters, which include the unaccented English letters, use only 1 byte.
  • UTF-16 – Uses 2 bytes for the most commonly used characters (those in the Basic Multilingual Plane) and 4 bytes for all other characters.
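
To see the byte counts listed above in practice, here is a minimal sketch that encodes a few sample characters both ways and counts the resulting bytes. The sample characters and variable names are chosen for this example only. The `utf-16-le` codec is used so that the count does not include a byte order mark.

```python
# Sample characters (illustrative only): A, e-acute, euro sign, grinning face.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Map each character to (bytes in UTF-8, bytes in UTF-16).
byte_counts = {
    ch: (len(ch.encode("utf-8")), len(ch.encode("utf-16-le")))
    for ch in samples
}

for ch, (utf8_len, utf16_len) in byte_counts.items():
    print(f"U+{ord(ch):04X}  UTF-8: {utf8_len}  UTF-16: {utf16_len}")

# Prints:
# U+0041  UTF-8: 1  UTF-16: 2
# U+00E9  UTF-8: 2  UTF-16: 2
# U+20AC  UTF-8: 3  UTF-16: 2
# U+1F600  UTF-8: 4  UTF-16: 4
```

The last character lies outside the range that UTF-16 can represent in 2 bytes, so it takes 4 bytes in both encodings.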

About Sean
Software developer in the Twin Cities area, passionate about .NET technologies. Equally passionate about my own personal projects related to family history and preservation of family stories and photos.
