#99 – Use StringInfo to Get Specific Characters From A UTF32 String
September 24, 2010
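We saw that you cannot use the normal string indexer [] to get individual characters from a string that contains UTF32 (4-byte) characters. The indexer returns one 16-bit char at a time, and a 4-byte character occupies two of them. Instead, you need to use the System.Globalization.StringInfo class.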
In the example below, we first get a list of the starting indexes of each of the three characters in our string. We then extract each character separately, by index.
    // Requires: using System.Globalization;
    string s = "A𠈓C";
    int n = s.Length;    // 4, because the middle character takes two 16-bit chars (4 bytes)

    // Get locations of text elements
    int[] indexes = StringInfo.ParseCombiningCharacters(s);    // 0, 1 and 3

    // Retrieve single element
    string nextChar = StringInfo.GetNextTextElement(s, 0);    // A
    nextChar = StringInfo.GetNextTextElement(s, 1);    // 𠈓
    nextChar = StringInfo.GetNextTextElement(s, 3);    // C
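The two steps above (find the starting indexes, then pull out one element) can be wrapped into a single helper. What follows is only a minimal sketch: the class name TextElementDemo and the method TextElementAt are hypothetical names made up for illustration, not part of the .NET Framework. It assumes the caller passes a position that is in range.

```csharp
using System;
using System.Globalization;

public static class TextElementDemo
{
    // Hypothetical helper (not a framework method): returns the n-th
    // text element of s, counting from zero. Throws
    // IndexOutOfRangeException if n is out of range.
    public static string TextElementAt(string s, int n)
    {
        // Starting char index of every text element in s
        int[] indexes = StringInfo.ParseCombiningCharacters(s);

        // Read the whole text element that begins at that char index
        return StringInfo.GetNextTextElement(s, indexes[n]);
    }

    public static void Main()
    {
        string s = "A𠈓C";

        // LengthInTextElements counts text elements, not 16-bit chars
        int count = new StringInfo(s).LengthInTextElements;

        for (int i = 0; i < count; i++)
        {
            string element = TextElementAt(s, i);
            Console.WriteLine("{0}: {1} char(s)", i, element.Length);
        }
    }
}
```

The helper re-parses the string on every call, which is fine for a short string like this one. For a long string, you would likely call ParseCombiningCharacters once and reuse the resulting array.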