String Immutability in .NET

What exactly is meant by strings being immutable? It means that once you instantiate a String type, you cannot change its value. This is not saying that the following code will fail:

procedure TWinForm.Button1_Click(sender: System.Object; e: System.EventArgs);
var
  s: System.String;
begin
  s := 'Classic Cars';
  s := s.ToUpper;
  s := s.ToLower;
end;

It is saying that each call in the preceding code produces a new string instance (a separate memory allocation), even though every result is referred to by the same variable. Consider the IL code generated for the preceding string assignments:

.maxstack 1
.locals (String V_0)
L_0000: ldstr "Classic Cars"
L_0005: stloc.0
L_0006: ldloc.0
L_0007: call String.ToUpper
L_000c: stloc.0
L_000d: ldloc.0
L_000e: call String.ToLower
L_0013: stloc.0
L_0014: ret

Look particularly at the two calls to String.ToUpper and String.ToLower. Each of these methods allocates memory for the string it returns rather than changing the string on which it is called. In effect, this is comparable to invoking the newobj IL instruction, which creates a type instance. The point is that you cannot change the value of an existing string. The code might appear as though a string variable has been modified; in reality, a new string has been allocated. When s is assigned the result from ToUpper, it refers to a new string instance in memory and no longer refers to the original "Classic Cars" string. A string that nothing refers to any longer, such as the uppercase intermediate once ToLower has been called, is available for the garbage collector to reclaim.
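The effect can be observed directly by keeping a second reference to the original string. The following is a minimal sketch rather than code from the original text; the program name ImmutableDemo and the variable name original are made up for illustration, and it assumes a Delphi for .NET console application.

```pascal
program ImmutableDemo;  // hypothetical program name
{$APPTYPE CONSOLE}
var
  original, s: String;
begin
  original := 'Classic Cars';
  s := original;   // both variables refer to the same instance
  s := s.ToUpper;  // s now refers to a newly allocated string

  Console.WriteLine(original);  // the original text is untouched
  Console.WriteLine(s);
  // Compares object identity, not string contents
  Console.WriteLine(System.Object.ReferenceEquals(original, s));
end.
```

Because ToUpper returns a different instance here, the last line would be expected to print False, while original still holds 'Classic Cars'.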

Operations such as the following show that calling methods on a string does not change the original string:

s := 'Xavier Pacheco';
Console.WriteLine(s.ToUpper);
Console.WriteLine(s);

The output would be

XAVIER PACHECO
Xavier Pacheco

Consider the following statements:

s := 'Chmh';
s := s.Insert(4, 'ro').Replace('h', 'a').ToUpper;

Figure 11.1 illustrates what's actually in memory.

FIGURE 11.1 Memory allocation with String operations.

First, memory is allocated for the string 'Chmh'. Next, the Insert() method causes another memory allocation for the string 'Chmhro'. The Replace() method causes yet another memory allocation to hold the string 'Camaro'. Finally, one last memory allocation is made to hold the string 'CAMARO', and the variable s is set to reference that final string. Four memory allocations were made in this operation.
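The four strings become visible if each intermediate result is kept in its own variable. This is a small sketch under the same assumptions as the statements above; the program name ChainDemo and the variable names s0 to s3 are illustrative and do not come from the original text.

```pascal
program ChainDemo;  // hypothetical program name
{$APPTYPE CONSOLE}
var
  s0, s1, s2, s3: String;
begin
  s0 := 'Chmh';                // first string
  s1 := s0.Insert(4, 'ro');    // second string: 'Chmhro'
  s2 := s1.Replace('h', 'a');  // third string: 'Camaro'
  s3 := s2.ToUpper;            // fourth string: 'CAMARO'

  // Every earlier string still holds its own value
  Console.WriteLine(s0);
  Console.WriteLine(s1);
  Console.WriteLine(s2);
  Console.WriteLine(s3);
end.
```

In the chained form, the second and third strings are never stored in a variable, so nothing refers to them once the statement completes.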

Despite all the memory allocations, the CLR handles String management quite efficiently. For the most part, you shouldn't have to worry about the allocations, as the garbage collector will take care of them. However, if you are doing a lot of string manipulation, perhaps in a loop that results in many allocations, you'll want to use the StringBuilder class, which is discussed momentarily. To illustrate the efficiency of the CLR, consider the following code:

var
  s1, s2: String;
begin
  s1 := 'delphi';
  s2 := 'delphi';
  Console.WriteLine(System.Object.ReferenceEquals(s1, s2));
  Console.ReadLine;
end.

The System.Object.ReferenceEquals() function returns True if two variables refer to the same object instance. True is returned here because the CLR stores each distinct literal string only once, so both variables refer to the same instance. This technique is called string interning.
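Interning happens automatically for literals; a string built at run time is a separate instance unless it is interned explicitly. The following minimal sketch is not from the original text. It assumes that System.String.Concat and System.String.Intern can be called from Delphi for .NET as ordinary static methods, and the program name InternDemo is made up for illustration.

```pascal
program InternDemo;  // hypothetical program name
{$APPTYPE CONSOLE}
var
  s1, s2: String;
begin
  s1 := 'delphi';                            // literal, interned by the CLR
  s2 := System.String.Concat('del', 'phi');  // same text, built at run time

  // Same contents, but a different instance from the literal
  Console.WriteLine(System.Object.ReferenceEquals(s1, s2));

  s2 := System.String.Intern(s2);            // fetch the pooled instance

  // After interning, both variables refer to the pooled string
  Console.WriteLine(System.Object.ReferenceEquals(s1, s2));
end.
```

On the CLR, the first comparison would be expected to print False and the second True.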

Another benefit of string immutability is that strings are thread safe. Because strings cannot be modified, there are no thread-synchronization issues to deal with.

Now that you understand the nature of .NET Strings, it's time to examine the various string operations.
