#1,116 – Iterating Through a String Using the foreach Statement

You can iterate through the individual characters in a string using the foreach statement.  The iteration variable is of type char and is set consecutively to each character within the string as the foreach statement executes.

            string secret = "Kint is Keyser Soze";

            StringBuilder sbObfuscate = new StringBuilder();

            foreach (char c in secret)
                sbObfuscate.Append((char)(c + 3));

            Console.WriteLine(sbObfuscate);
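
The same foreach pattern can undo the shift. The sketch below wraps the loop in a helper named Shift (a hypothetical name, not part of the original example) and applies it with offsets of 3 and -3.

```csharp
using System;
using System.Text;

public static class ShiftDemo
{
    // Hypothetical helper: shifts each character's code by the given offset.
    public static string Shift(string text, int offset)
    {
        StringBuilder sb = new StringBuilder(text.Length);

        // c is set to each char of the string, in order
        foreach (char c in text)
            sb.Append((char)(c + offset));

        return sb.ToString();
    }

    public static void Main()
    {
        string secret = "Kint is Keyser Soze";
        string obfuscated = Shift(secret, 3);
        string restored = Shift(obfuscated, -3);

        Console.WriteLine(obfuscated);
        Console.WriteLine(restored);   // same text as secret
    }
}
```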

#1,011 – TryParse Indicates Whether a Parse Operation Will Succeed

You can use the Parse method to convert a string that represents a number to its equivalent numerical representation.

            string numberString = "108";
            int number = int.Parse(numberString);

If the string does not represent a value of the associated numeric type, a FormatException is thrown.  If it represents a numeric value that is outside the range of the type, an OverflowException is thrown.

            int n1 = int.Parse("3.4");    // FormatException

You can avoid the exception by using the TryParse method.  If the parse operation succeeds, the parsed value is stored in the output parameter and TryParse returns true.  If the parse operation does not succeed, the output parameter is set to zero and TryParse returns false.  No exception is thrown.

            int n1;
            if (int.TryParse("3.4", out n1))
                Console.WriteLine("Parse worked--n1 contains number");
            else
                Console.WriteLine("Can't parse");
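
As a rough sketch of how this plays out with several inputs, the code below runs TryParse over a few strings and also wraps it in a fallback helper. The name ParseOrDefault and the sample inputs are illustrative assumptions.

```csharp
using System;

public static class TryParseDemo
{
    // Hypothetical helper: returns the parsed value, or fallback if parsing fails.
    public static int ParseOrDefault(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }

    public static void Main()
    {
        // A valid int, a non-integer, a value too large for int, and null
        string[] inputs = { "108", "3.4", "99999999999", null };

        foreach (string input in inputs)
        {
            int n;
            bool ok = int.TryParse(input, out n);

            // n is 0 whenever ok is false
            Console.WriteLine("{0} -> ok={1}, n={2}", input ?? "(null)", ok, n);
        }

        Console.WriteLine(ParseOrDefault("3.4", -1));
    }
}
```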

#1,010 – Checking to See Whether a String is Null or Empty

An empty string and a null string are two different things.  In some cases, you’ll want to check to see whether a string is null or to see whether it is non-null yet empty.

            string bobsNickname = "Bubba";
            string sallysNickname = "";
            string joesNickname = null;

            // Check for empty
            if (sallysNickname == string.Empty)
                Console.WriteLine("No nickname for Sally");

            // Check for null
            if (joesNickname == null)
                Console.WriteLine("Joes nick is null");

Note that if we’d checked sallysNickname for null, the result would have been false.  Similarly, checking joesNickname for equality with string.Empty would also return false.

You can check for either null or empty in a single statement, using the IsNullOrEmpty method.

            if (string.IsNullOrEmpty(sallysNickname))
                Console.WriteLine("No nick that we can use");
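
To compare the three checks side by side, here is a small sketch that runs each of them against a non-empty string, an empty string, and null. The class name is an assumption made for the example.

```csharp
using System;

public static class NullOrEmptyDemo
{
    public static void Main()
    {
        string[] nicknames = { "Bubba", "", null };

        foreach (string nick in nicknames)
        {
            // Each check answers a different question about the same variable
            Console.WriteLine("isNull={0}, isEmpty={1}, IsNullOrEmpty={2}",
                nick == null,
                nick == string.Empty,
                string.IsNullOrEmpty(nick));
        }
    }
}
```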

#1,009 – A String Can Be Null or Empty

A string variable can refer either to a string that contains characters or to an empty string.  Because string is a reference type, a variable of type string can also be assigned a value of null, indicating that the variable does not refer to any string.

            string bobsNickname = "Bubba";
            string sallysNickname = "";
            string joesNickname = null;

There is a difference between an empty string and a null string.  A variable that is set to an empty string refers to a location in memory containing a string that has no characters.  A string variable set to null refers to nothing.

You can choose to use empty strings in your code to mean one thing and nulls to mean something different.  In the code above, it might be the case that Sally explicitly does not have a nickname, whereas we just haven’t yet figured out what Joe’s nickname is.
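
One way to encode that convention is sketched below. The Describe helper and its messages are hypothetical; they simply treat null as "not yet known" and an empty string as "known to have none".

```csharp
using System;

public static class NicknameDemo
{
    // Hypothetical convention: null = not yet known, "" = explicitly none.
    public static string Describe(string nickname)
    {
        if (nickname == null)
            return "nickname unknown";
        if (nickname.Length == 0)
            return "has no nickname";
        return "goes by " + nickname;
    }

    public static void Main()
    {
        Console.WriteLine("Bob: " + Describe("Bubba"));
        Console.WriteLine("Sally: " + Describe(""));
        Console.WriteLine("Joe: " + Describe(null));
    }
}
```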

#1,008 – What Happens When You Forget That Strings Are Immutable

Strings in C# (the System.String type) are immutable.  Methods that act upon a string never change that instance of the string, but instead return a new string instance.

For example, to replace a portion of a string, you call the Replace method, assigning the result to the original string (or to a new string).

            quote = quote.Replace("Hell", "Minnesota");

If you forget that a string is immutable, you may forget to assign the result of this call to something.  The compiler won’t warn you about this.

            string quote = "Go to Heaven for the climate, Hell for the company.";
            Console.WriteLine(quote);

            // Does NOT change quote.  Rather, it creates
            // a new string, which we don't store anywhere
            quote.Replace("Hell", "Minnesota");

            Console.WriteLine(quote);

Contrary to what we might have expected, the string is not changed.
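
The same pitfall applies to other string methods, such as Trim and ToUpperInvariant. The sketch below (the Clean helper is a hypothetical name) shows discarded results next to the corrected form that assigns the result.

```csharp
using System;

public static class ImmutableDemo
{
    // Hypothetical helper: returns a trimmed, uppercased copy of the input.
    public static string Clean(string text)
    {
        return text.Trim().ToUpperInvariant();
    }

    public static void Main()
    {
        string name = "  keyser soze  ";

        // Both results are discarded, so name is unchanged
        name.Trim();
        name.ToUpperInvariant();
        Console.WriteLine("[" + name + "]");

        // Assigning the result is what makes the change visible
        name = Clean(name);
        Console.WriteLine("[" + name + "]");
    }
}
```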

#1,007 – Getting Length of String that Contains Surrogate Pairs

You can use the string.Length property to get the length of a string.  Length counts the number of 16-bit Char objects, so it matches the number of characters only when every code point in the string is no larger than U+FFFF.  This set of code points is known as the Basic Multilingual Plane (BMP).

Unicode code points outside of the BMP are represented in UTF-16 using a 4-byte surrogate pair (two Char objects), rather than a single 2-byte Char.

To correctly count the number of characters in a string that may contain code points higher than U+FFFF, you can use the StringInfo class (from System.Globalization).

            // 3 Latin (ASCII) characters
            string simple = "abc";

            // 3 character string where one character
            //  is a surrogate pair
            string containsSurrogatePair = "A𠈓C";

            // Length=3 (correct)
            Console.WriteLine(string.Format("Length 1 = {0}", simple.Length));

            // Length=4 (not quite correct)
            Console.WriteLine(string.Format("Length 2 = {0}", containsSurrogatePair.Length));

            // Better, reports Length=3
            StringInfo si = new StringInfo(containsSurrogatePair);
            Console.WriteLine(string.Format("Length 3 = {0}", si.LengthInTextElements));
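
If you only need to count code points, you can also walk the string yourself and treat each surrogate pair as one unit. This is a minimal sketch using char.IsSurrogatePair; the CountCodePoints helper is a hypothetical name. Note that it counts code points, not text elements, so the two counts can differ when a string contains combining characters.

```csharp
using System;

public static class CodePointDemo
{
    // Counts Unicode code points, treating each valid surrogate pair as one.
    public static int CountCodePoints(string text)
    {
        int count = 0;
        for (int i = 0; i < text.Length; i++)
        {
            // Skip the low surrogate when a pair starts at position i
            if (char.IsSurrogatePair(text, i))
                i++;
            count++;
        }
        return count;
    }

    public static void Main()
    {
        string containsSurrogatePair = "A𠈓C";

        Console.WriteLine(containsSurrogatePair.Length);
        Console.WriteLine(CountCodePoints(containsSurrogatePair));
    }
}
```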

#1,006 – Getting the Length of a String

You can use the string.Length property to get an integer representing the length of a string.  In most cases, the length is the number of characters.

            // 3 Latin (ASCII) characters
            string simple = "abc";
            // 2 other characters: U+0100, U+4E01
            string other = "Ā丁";

            Console.WriteLine(string.Format("Length 1 = {0}", simple.Length));
            Console.WriteLine(string.Format("Length 2 = {0}", other.Length));


It’s important to note that the Length property returns the number of individual Char objects that make up the string.  This can be different from the actual number of Unicode characters.
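
One case where the two counts differ is a letter followed by a combining mark. In the sketch below, the sample string is an assumption chosen for illustration: an "e" followed by U+0301 (combining acute accent), which displays as a single character but is stored as two Char objects.

```csharp
using System;
using System.Globalization;

public static class CombiningDemo
{
    public static void Main()
    {
        // 'e' followed by U+0301 COMBINING ACUTE ACCENT
        string combined = "e\u0301";

        // Number of Char objects
        Console.WriteLine(combined.Length);

        // Number of text elements (what a reader would call characters)
        StringInfo si = new StringInfo(combined);
        Console.WriteLine(si.LengthInTextElements);
    }
}
```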

#1,005 – Replacing a Substring with a New Substring

You can use the string.Replace method to find and replace a substring of a longer string with a new substring.  Replace is an instance method that acts upon a specified string and returns a new string.

In the example below, we replace every occurrence of “race” with “class”, returning the new string.

            string quote =
@"There's a race of men that don't fit in,
A race that can't sit still;";

            string newQuote = quote.Replace("race", "class");

            Console.WriteLine(string.Format("ORIGINAL:\n{0}", quote));
            Console.WriteLine(string.Format("NEW:\n{0}", newQuote));

You can use Replace to remove instances of a particular substring (replacing them with an empty string).

            string quote = "Four awesome score and seven awesome years ago";

            string newQuote = quote.Replace("awesome ", "");

            Console.WriteLine(string.Format("ORIGINAL:\n{0}", quote));
            Console.WriteLine(string.Format("NEW:\n{0}", newQuote));
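
Two details are worth keeping in mind: Replace(string, string) matches case-sensitively, and there is also an overload that replaces one char with another. The sample strings in this sketch are made up for the example.

```csharp
using System;

public static class ReplaceDemo
{
    public static void Main()
    {
        string quote = "Race to the race";

        // Only the lowercase "race" matches; "Race" is left alone
        string newQuote = quote.Replace("race", "class");
        Console.WriteLine(newQuote);   // Race to the class

        // The char overload replaces every occurrence of a single character
        string spaced = "a-b-c".Replace('-', ' ');
        Console.WriteLine(spaced);     // a b c
    }
}
```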

#1,004 – Converting a String to Uppercase or Lowercase

You can use the ToLower and ToUpper methods of the string type to convert a particular string to all lowercase or all uppercase.  In either case, a new string is returned.

            string peye = "Guy Noir";

            string peyeLower = peye.ToLower();
            string peyeUpper = peye.ToUpper();

            Console.WriteLine(peye);
            Console.WriteLine(peyeLower);
            Console.WriteLine(peyeUpper);
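
ToLower and ToUpper use the casing rules of the current culture. If the goal is a culture-independent result, for example to compare two strings while ignoring case, ToUpperInvariant or a comparison with StringComparison.OrdinalIgnoreCase may be a better fit. A minimal sketch, with the second string added for illustration:

```csharp
using System;

public static class CaseDemo
{
    public static void Main()
    {
        string peye = "Guy Noir";
        string shouted = "GUY NOIR";

        // Casing that does not depend on the current culture
        bool sameUpper = peye.ToUpperInvariant() == shouted.ToUpperInvariant();

        // Case-insensitive comparison without creating new strings
        bool sameIgnoringCase =
            string.Equals(peye, shouted, StringComparison.OrdinalIgnoreCase);

        Console.WriteLine(sameUpper);
        Console.WriteLine(sameIgnoringCase);
    }
}
```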

#1,003 – Accessing Underlying Bytes in a String for Different Encodings

Strings in .NET are stored in memory as Unicode character data, using the UTF-16 encoding (2 bytes per character, or 4 bytes for characters that require a surrogate pair).

If you want to get access to the underlying data for the string in memory, you can use one of the functions listed below, indicating what encoding to use for the Unicode data when converting it to a byte array.  If you use Encoding.Unicode, you’ll get the data exactly as it is stored in memory for the String type.

  • System.Text.Encoding.Unicode.GetBytes – UTF-16
  • System.Text.Encoding.UTF8.GetBytes – UTF-8

In the example below, notice the different byte sequences used to encode the CJK character.

            string ideograph = "𠈓";
            byte[] utf16 = Encoding.Unicode.GetBytes(ideograph);
            byte[] utf8 = Encoding.UTF8.GetBytes(ideograph);
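
To inspect the byte sequences, you can print each array as hex. The sketch below uses BitConverter.ToString for this and converts the UTF-8 bytes back to a string as a round-trip check; the class name is an assumption.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    public static void Main()
    {
        string ideograph = "𠈓";
        byte[] utf16 = Encoding.Unicode.GetBytes(ideograph);
        byte[] utf8 = Encoding.UTF8.GetBytes(ideograph);

        // A character outside the BMP takes 4 bytes in both encodings,
        // but the byte values differ
        Console.WriteLine("UTF-16: {0} bytes: {1}", utf16.Length, BitConverter.ToString(utf16));
        Console.WriteLine("UTF-8:  {0} bytes: {1}", utf8.Length, BitConverter.ToString(utf8));

        // Decoding with the matching encoding recovers the original string
        string roundTrip = Encoding.UTF8.GetString(utf8);
        Console.WriteLine(roundTrip == ideograph);
    }
}
```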
