#68 – String Equality

The default behavior for reference type equality dictates that two variables are equal only if they point to the same object.  However, the System.String class (string) overrides this.  Two string variables are equal if the two strings pointed to are equal (i.e. they have the same value).

 string s1 = "Popeye";
 string s2 = Console.ReadLine();   // Enter "Popeye" here

 bool b = (s1 == s2);    // True because contents of strings are equal

So, although s1 and s2 point to two different string objects in memory, the equality operator returns true because the values of the two strings are equal.
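The difference between value equality and reference equality can be seen in a small sketch. The class name StringEqualityDemo and the helper BuildAtRuntime are made up for this illustration; the helper stands in for Console.ReadLine() so that the example runs without user input.

```csharp
using System;

public static class StringEqualityDemo
{
    // Hypothetical helper: builds "Popeye" at run time, so the result is a
    // separate object from the string literal below.
    public static string BuildAtRuntime()
    {
        return new string(new[] { 'P', 'o', 'p', 'e', 'y', 'e' });
    }

    public static void Main()
    {
        string s1 = "Popeye";
        string s2 = BuildAtRuntime();

        Console.WriteLine(s1 == s2);                        // True: compares contents
        Console.WriteLine(s1.Equals(s2));                   // True: compares contents
        Console.WriteLine(object.ReferenceEquals(s1, s2));  // False: two different objects
    }
}
```

object.ReferenceEquals ignores the string override and reports whether both variables point to the same object.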

#67 – Default Behavior for Reference Type Equality

For reference types, the default equality check tests whether two references point to the exact same object, rather than whether the objects they point to are equivalent.

Suppose we have a Person type that has a constructor that takes Name and Age values.

 public class Person
 {
     public string Name { get; set; }
     public uint Age { get; set; }

     public Person(string name, uint age)
     {
         Name = name;
         Age = age;
     }
 }

Now suppose that we create two instances of the Person class with the same values for Name and Age.  The code fragment below shows what happens when we check to see if the resulting objects are equal.

 Person p1 = new Person("Sean", 46);
 Person p2 = new Person("Sean", 46);

 bool b = (p1 == p2);    // False, because p1 and p2 point to different objects
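As a rough sketch of the same idea, the example below repeats the Person class and adds a third variable, p3 (an assumption for this illustration), that is assigned from p1. Because p3 holds the same reference as p1, the two compare as equal.

```csharp
using System;

public class Person
{
    public string Name { get; set; }
    public uint Age { get; set; }

    public Person(string name, uint age)
    {
        Name = name;
        Age = age;
    }
}

public static class ReferenceEqualityDemo
{
    public static void Main()
    {
        Person p1 = new Person("Sean", 46);
        Person p2 = new Person("Sean", 46);
        Person p3 = p1;   // copies the reference, not the object

        Console.WriteLine(p1 == p2);       // False: two different objects
        Console.WriteLine(p1 == p3);       // True: same object
        Console.WriteLine(p1.Equals(p2));  // False: the inherited Equals also compares references

        p3.Age = 47;
        Console.WriteLine(p1.Age);         // 47: the change made through p3 is visible through p1
    }
}
```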

#66 – Including Quotation Marks in Strings

There are a couple of different ways to include quotation marks in string literals.

The first is to use an escape sequence:

 Console.WriteLine("I like \"coding\" in C#");

If you use a verbatim string literal, you can instead write each quotation mark twice:

 Console.WriteLine(@"I like ""coding"" in C#");

In either case, the output is:  I like "coding" in C#
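A quick way to convince yourself that the two forms are interchangeable is to compare them directly. This is only a sketch; the class and method names (QuoteDemo, Escaped, Verbatim) are invented for the example.

```csharp
using System;

public static class QuoteDemo
{
    // Regular literal: each quotation mark is written as \"
    public static string Escaped() { return "I like \"coding\" in C#"; }

    // Verbatim literal: each quotation mark is written twice
    public static string Verbatim() { return @"I like ""coding"" in C#"; }

    public static void Main()
    {
        Console.WriteLine(Escaped());
        Console.WriteLine(Escaped() == Verbatim());   // True: same characters either way
    }
}
```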

#65 – Verbatim String Literals

Because the backslash (\) is the first character in an escape sequence, you need to use the double-backslash sequence (\\) to embed actual backslashes in a string literal.

 string file = "C:\\MyDir\\Another Dir\\thefile.txt";

Because this can get a little hard to read, C# allows using the at sign (@) character to indicate a verbatim string literal, that is, a string literal in which escape sequences are not interpreted.

Using a verbatim string literal, we can write the earlier string without doubling the backslashes:

 string file = @"C:\MyDir\Another Dir\thefile.txt";

We can also use a verbatim string literal to split a string across multiple lines in the source code, rather than embedding the \n escape sequence.  The line breaks in the source file become part of the string:

string file = @"First line
Second line
Third line";
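The sketch below checks these points in code. The names VerbatimDemo, EscapedPath, VerbatimPath and MultiLine are made up for the illustration. The multi-line check splits on both "\r\n" and "\n", since the line ending stored in the string depends on how the source file was saved.

```csharp
using System;

public static class VerbatimDemo
{
    public static string EscapedPath() { return "C:\\MyDir\\Another Dir\\thefile.txt"; }
    public static string VerbatimPath() { return @"C:\MyDir\Another Dir\thefile.txt"; }

    public static string MultiLine()
    {
        return @"First line
Second line
Third line";
    }

    public static void Main()
    {
        // Both literals describe the same characters
        Console.WriteLine(EscapedPath() == VerbatimPath());   // True

        // In a verbatim literal, \n is just a backslash followed by the letter n
        Console.WriteLine(@"a\nb".Length);                    // 4

        // Line breaks come from the source file, so they may be \n or \r\n
        string[] lines = MultiLine().Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
        Console.WriteLine(lines.Length);                      // 3
    }
}
```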

#64 – Escape Sequences in String Literals

C# allows embedding special (often non-printable) characters into a string literal using an escape sequence.  An escape sequence begins with a backslash (\), followed by one or more characters that identify the special character.

Here’s an example of embedding several newline characters into a string, so that it’s printed on three different lines.

 Console.Write("First line\nSecond line\nThird line\n");     // 3 lines

Full list of escape sequences in C#:

  • \a  –  Bell (alert)
  • \b  –  Backspace
  • \f  –  Formfeed
  • \n  –  New line
  • \r  –  Carriage return
  • \t  –  Horizontal tab
  • \v  –  Vertical tab
  • \'  –  Single quote
  • \"  –  Double quote
  • \\  –  Backslash
  • \0  –  Null
  • \xh[h][h][h]  –  Unicode character in hex (one to four hex digits)
  • \uhhhh  –  Unicode character (exactly four hex digits)
  • \Uhhhhhhhh  –  Unicode code point (exactly eight hex digits); values above U+FFFF are stored as a surrogate pair
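The hexadecimal and Unicode escapes are the easiest to get wrong, so here is a small sketch of how they behave. The class and constant names are invented for the example. Note in particular that \x reads as many hex digits as it can (up to four), while \u always reads exactly four.

```csharp
using System;

public static class EscapeDemo
{
    public const string GreedyHex = "\x41BC";       // one character, U+41BC
    public const string FixedWidth = "\u0041BC";    // 'A' followed by "BC"
    public const string Smiley = "\U0001F600";      // code point above U+FFFF

    public static void Main()
    {
        Console.WriteLine("\x41" == "A");           // True
        Console.WriteLine("\u0041" == "A");         // True
        Console.WriteLine("\U00000041" == "A");     // True

        Console.WriteLine(GreedyHex.Length);        // 1
        Console.WriteLine(FixedWidth.Length);       // 3
        Console.WriteLine(Smiley.Length);           // 2: stored as a surrogate pair
        Console.WriteLine("\0".Length);             // 1: the null character is still a character
    }
}
```

Because of the variable length of \x, the \u form is usually the safer choice when the escape is followed by characters that happen to be valid hex digits.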

#63 – Use StringBuilder for More Efficient String Concatenation

Using the concatenation operator (+) for string concatenation is convenient, but can be inefficient because a new string is allocated for each concatenation.

Let’s say that we do a simple test, appending the string equivalents of the first 50,000 integers:

string s1 = "";
for (int i = 0; i < 50000; i++)
    s1 = s1 + i.ToString();

In one test environment, this loop took 30,430 milliseconds.

We can make this code more efficient by using the Append method of the StringBuilder class (in the System.Text namespace), which appends to an internal buffer instead of allocating a new string on each iteration:

 // Requires: using System.Text;
 StringBuilder sb = new StringBuilder();
 for (int i = 0; i < 50000; i++)
     sb.Append(i.ToString());
 string s2 = sb.ToString();    // Get the finished string

In the same test environment as before, this version takes 6 milliseconds.

StringBuilder is definitely more efficient, but likely worth using only when you plan on doing a large number of string concatenations.  See also Concatenating Strings Efficiently.
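If you want to repeat the measurement on your own machine, something along these lines should work. It is a sketch rather than a rigorous benchmark: the class and method names are invented here, Stopwatch is used for timing, and the numbers will differ from the ones quoted above depending on hardware and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ConcatTiming
{
    // Concatenates the first 'count' integers with the + operator
    public static string WithOperator(int count)
    {
        string s = "";
        for (int i = 0; i < count; i++)
            s = s + i.ToString();
        return s;
    }

    // Builds the same string with StringBuilder.Append
    public static string WithBuilder(int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++)
            sb.Append(i.ToString());
        return sb.ToString();
    }

    public static void Main()
    {
        const int count = 50000;

        Stopwatch sw = Stopwatch.StartNew();
        string a = WithOperator(count);
        sw.Stop();
        Console.WriteLine("operator +   : {0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        string b = WithBuilder(count);
        sw.Stop();
        Console.WriteLine("StringBuilder: {0} ms", sw.ElapsedMilliseconds);

        Console.WriteLine("Same result  : {0}", a == b);
    }
}
```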

#62 – String Concatenation

In C#, you can use the + (plus) operator to concatenate multiple strings together.

 string s1 = "C#";
 string s2 = "fun";
 string s3 = s1 + " is " + s2;   // "C# is fun"

You can use one or more concatenation operators in the same expression, and you can use an expression containing the concatenation operator anywhere that a string value is expected.

You can also use String.Concat or String.Format to concatenate strings.

 // Concat method
 string s4 = String.Concat(new object[] { "The ", 3, " musketeers" });    // The 3 musketeers
 string s5 = String.Concat("This", "That");    // ThisThat

 // Use String.Format to concatenate
 string s6 = String.Format("{0}{1}{2}", s1, " is ", s2);    // C# is fun
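The three approaches can be compared side by side. In this sketch the class ConcatDemo and its methods are hypothetical names used only for the illustration; each method builds the same sentence in a different way.

```csharp
using System;

public static class ConcatDemo
{
    public static string WithPlus(string a, string b)
    {
        return a + " is " + b;
    }

    public static string WithConcat(string a, string b)
    {
        return String.Concat(a, " is ", b);
    }

    public static string WithFormat(string a, string b)
    {
        return String.Format("{0} is {1}", a, b);
    }

    public static void Main()
    {
        Console.WriteLine(WithPlus("C#", "fun"));     // C# is fun
        Console.WriteLine(WithConcat("C#", "fun"));   // C# is fun
        Console.WriteLine(WithFormat("C#", "fun"));   // C# is fun

        // Non-string values are converted with ToString() before being joined
        Console.WriteLine(String.Concat(new object[] { "The ", 3, " musketeers" }));   // The 3 musketeers
        Console.WriteLine("The " + 3 + " musketeers");                                 // The 3 musketeers
    }
}
```

The + operator tends to read best for a handful of pieces, while String.Format keeps the shape of the final text visible when several values are inserted into it.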