CAUBLESTONE INK

.net development and other geeky stuff

Understanding Value and Reference Types

Posted on March 10th, 2004


Supporting Downloads
valuereferencetypes.doc (52kb)
valuereferencetypes.zip (11kb)

Purpose

The purpose of this document is to explain how Value and Reference types are used within the .NET Framework. Because the two kinds of type behave differently, both when used as plain variables and when passed as parameters to a method, it is important to understand what is actually happening. We will cover how each type works within the framework when used by itself and when used as a parameter to a method.

Meet the Types

When dealing with the .NET Framework you will come across many types. Each one belongs to either the Value type or the Reference type category. To understand how the two differ we need to look at how memory is allocated for our application.

There are two areas of memory used to store this data, the stack and the heap. If you are a C++ programmer you are probably familiar with them. In the .NET Framework a Value type declared as a local variable is normally stored directly on the stack, whereas a Reference type is allocated on the heap while a pointer to that memory is placed on the stack. (A Value type that is a field of a Reference type lives on the heap inside that object, but the copy behavior described here is the same.)

Let’s break it down. When you create a Value type in your application, the framework allocates memory based on the size of the type to hold whatever data you need. For a local variable this memory is on the stack, and the data stays there until it is removed, which happens when the variable goes out of scope. Also, if you assign a Value type to another Value type, a copy of the data is made, so even though both variables hold the same value they are reading it from two separate areas of memory. Let’s look at an example:

VB.Net

Public Sub TestValueType()
  Dim x As Integer
  Dim y As Integer

  x = 5
  y = x + 5

  Console.WriteLine(x)
  Console.WriteLine(y)
End Sub

C#

public void TestValueType()
{
  int x, y;

  x = 5;
  y = x + 5;

  Console.WriteLine(x);
  Console.WriteLine(y);
}

In our example we created two variables, x and y, which we told the framework were Integers (Integer is a value type; see the list at the end of this document for common Value types). Next we assign the value 5 to x, which tells the framework to place a value of 5 on the stack and associate it with the variable x. Next we assign y the value of x plus 5. The framework reads the contents of x, adds 5, and stores the result in y’s own location on the stack. Notice that y was not pointed at x; it received a value computed from a copy of the contents of x. So our variable y is now 10. How did this affect the value of x? It did not. Since the variables we are using are value types, the data was copied rather than referenced. Later I will show how this can be very beneficial. The key point is that assigning one value type variable to another always copies the data; the two variables never share storage.
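The same copy behavior applies to value types you define yourself. The following is a minimal sketch, assuming a hypothetical Point structure and a ValueCopyDemo class that exist only for this illustration:

```csharp
using System;

// Hypothetical value type, defined only for this illustration.
public struct Point
{
    public int X;
    public int Y;
}

public static class ValueCopyDemo
{
    // Copies a Point, changes the copy, and returns the X of the original.
    public static int OriginalXAfterCopyChanges()
    {
        Point a = new Point();
        a.X = 5;
        a.Y = 5;

        Point b = a;   // both fields are copied into b's own storage
        b.X = 10;      // only the copy changes

        return a.X;
    }

    public static void Main()
    {
        Console.WriteLine(OriginalXAfterCopyChanges()); // prints 5
    }
}
```

Changing b leaves a untouched, because the assignment duplicated the data instead of sharing it.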

Reference types, however, are placed on the heap. The heap is used in a very different manner than the stack. It is a dynamic memory area where objects are created, used, and destroyed in any order; there is no fixed sequence for adding or removing elements. When working in the .NET Framework it is important to understand that the heap is managed by the framework’s Garbage Collection services. This means it is possible, and quite likely, that an object whose variables have gone out of scope is still in memory for a while, until the garbage collector reclaims it. Your code can no longer reach such an object once no references to it remain. A managed application can still appear to leak memory when references are kept alive longer than intended, because the garbage collector will not reclaim an object that is still reachable.

When you create a reference type, the framework allocates an area of memory on the heap (wherever it can) and stores our object’s data there. For our code to have access to this data, the framework also creates a location on the stack that holds a pointer to the address at which the object resides on the heap. Remember that when you copy a value type the data itself is copied, much like a photocopier makes exact copies that are nevertheless separate pages: you can highlight one page and throw another away, and whatever you do only affects the page you are dealing with. With a reference type, however, when you copy the variable you are copying the memory pointer, not the data itself. What does this mean? It means that whatever you do through one variable is seen through the other. Let’s look at an example:

VB.Net

Public Sub TestReferenceType()
  Dim f1 As New System.Windows.Forms.Form()
  Dim f2 As New System.Windows.Forms.Form()

  f1.Text = "This is Form1"

  Console.WriteLine(f1.Text)
  ' f2 now refers to the same Form as f1
  f2 = f1
  f2.Text = "This is Form2"

  Console.WriteLine(f1.Text)
End Sub

C#

public void TestReferenceType()
{
  System.Windows.Forms.Form f1, f2;

  f1 = new System.Windows.Forms.Form();
  f2 = new System.Windows.Forms.Form();

  f1.Text = "This is Form1";

  Console.WriteLine(f1.Text);
  f2 = f1; // f2 now refers to the same Form as f1
  f2.Text = "This is Form2";

  Console.WriteLine(f1.Text); // read through f1 to show the change is shared
}

What will the second WriteLine statement print? If you said This is Form2 then you are correct. Why did this happen? Earlier we stated that a reference type’s data exists on the heap and that a pointer to that memory exists on the stack. When we assigned f1 to f2, only the data that exists on the stack, the pointer, was copied. No new object was created, so you now have two variables in code that point to and use the same information. (This is sometimes loosely called a shallow copy, although strictly speaking a shallow copy creates a new object whose fields are copied.) Why would you want this, you might ask. Well, let’s say that you have an application that needs to modify an object by passing it to a method. Because the method receives a reference that points to the original, it can modify the object without using a lot of extra memory. This also helps the performance of your application, since you do not need to perform a complete copy and use twice as much space in memory.
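To make the difference between copying a reference and creating a separate object concrete, here is a minimal sketch. The Widget class is a hypothetical stand-in for Form so that the example runs without a UI library, and the method names are made up for illustration:

```csharp
using System;

// Hypothetical reference type standing in for Form.
public class Widget
{
    public string Text;
}

public static class ReferenceCopyDemo
{
    // Returns true when assignment leaves both variables on the same object.
    public static bool AssignmentSharesObject()
    {
        Widget first = new Widget();
        first.Text = "This is Form1";

        Widget second = first;           // copies the reference only
        second.Text = "This is Form2";   // changes the one shared object

        return object.ReferenceEquals(first, second)
            && first.Text == "This is Form2";
    }

    // Returns the original's text after a separately created object is changed.
    public static string TextAfterSeparateObjectChanges()
    {
        Widget first = new Widget();
        first.Text = "This is Form1";

        Widget other = new Widget();     // a second object on the heap
        other.Text = first.Text;         // copy the data by hand
        other.Text = "This is Form2";    // the original is unaffected

        return first.Text;
    }

    public static void Main()
    {
        Console.WriteLine(AssignmentSharesObject());          // prints True
        Console.WriteLine(TextAfterSeparateObjectChanges());  // prints This is Form1
    }
}
```

If you need two independent objects, you have to create the second one yourself; plain assignment never does that for a reference type.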

Using Types with Methods

We are now going to look at how types behave when passed to a method. As you may know, there are two ways to pass a parameter to a method: by value and by reference. If you remember how Value and Reference types differ when assigned, you can probably guess how the data is passed to methods. Those of you coming from prior versions of VB probably remember that by default all parameters were passed by reference. In the .NET Framework everything is passed by value by default, which in most cases is what we intend to begin with.

The difficulty now is remembering how the types behave when they are passed by value or by reference. Let’s first look at how to pass a Value type by value and by reference.

VB.Net

Public Sub PassingValues()
  Dim x As Integer
  Dim y As Integer

  x = 5
  y = x + 5

  PassValueTypes(x, y)

  Console.WriteLine(x)
  Console.WriteLine(y)
End Sub

Public Sub PassValueTypes(ByVal xval As Integer, ByRef yval As Integer)
  xval += 10
  yval += 20
End Sub

C#

public void PassingValues()
{
  int x,y;

  x = 5;
  y = x + 5;

  PassValueTypes(x,ref y);

  Console.WriteLine(x);
  Console.WriteLine(y);
}

public void PassValueTypes(int xval, ref int yval)
{
  xval += 10;
  yval += 20;
}

Now let’s find out what happened. Can you guess what is going to be written to the console? If you guessed 5 for x and 30 for y then you are correct. You might be asking how, since we added 10 to the x value as well. When you pass a Value type by value to a method, the framework creates a copy of the data, places it in a new location on the stack, and assigns it to the parameter defined in the method. When you pass a Value type by reference, something different takes place. The data is still located on the stack, but a pointer is created, again on the stack, that tells the framework where the data for the value in question is. This gives the method direct access to the data contained in the caller’s variable. The method can now alter that data, and since it is accessing it directly the caller will see those changes. Once the called method is finished, its two parameters are cleared off the stack: for the by value parameter the copied data is destroyed, while for the by reference parameter only the pointer is destroyed.

So, when you pass a Value type to a method by value, it is just like creating another Value type variable and assigning it the data, which creates a copy on the stack. However, when you pass a Value type by reference it behaves more like a Reference type, in that the framework creates a pointer instead of a copy of the data so that the method knows where the data is. By doing this the method can modify the value of the caller’s variable directly.
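A common place where this distinction matters is a swap routine. The sketch below is only an illustration; RefSwapDemo and its method names are made up for this example and do not come from the framework:

```csharp
using System;

public static class RefSwapDemo
{
    // Receives copies, so the caller's variables are left unchanged.
    public static void SwapByValue(int a, int b)
    {
        int temp = a;
        a = b;
        b = temp;
    }

    // Receives pointers to the caller's variables, so the swap is visible.
    public static void SwapByRef(ref int a, ref int b)
    {
        int temp = a;
        a = b;
        b = temp;
    }

    // Runs both swaps and reports x and y after each one.
    public static string Run()
    {
        int x = 5;
        int y = 10;

        SwapByValue(x, y);
        string afterByValue = x + "," + y;

        SwapByRef(ref x, ref y);
        string afterByRef = x + "," + y;

        return afterByValue + " " + afterByRef;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 5,10 10,5
    }
}
```

Only the ref version changes the caller's x and y, which is why C# also requires the ref keyword at the call site.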

Reference types, however, behave quite differently. Let’s look at some code.

VB.Net

Public Sub PassingReferences()
  Dim f1 As New System.Windows.Forms.Form()
  Dim f2 As New System.Windows.Forms.Form()

  f1.Text = "This is Form1"
  f2.Text = "This is Form2"

  PassReferenceTypes(f1, f2)

  Console.WriteLine(f1.Text)
  Console.WriteLine(f2.Text)
End Sub

Public Sub PassReferenceTypes(ByVal frm1 As System.Windows.Forms.Form, ByRef frm2 As System.Windows.Forms.Form)
  frm1.Text = "We have modified Form1"
  frm2.Text = "We have modified Form2"
End Sub

C#

public void PassingReferences()
{
  System.Windows.Forms.Form f1, f2;

  f1 = new System.Windows.Forms.Form();
  f2 = new System.Windows.Forms.Form();

  f1.Text = "This is Form1";
  f2.Text = "This is Form2";

  PassReferenceTypes(f1,ref f2);

  Console.WriteLine(f1.Text);
  Console.WriteLine(f2.Text);
}

public void PassReferenceTypes(System.Windows.Forms.Form frm1, ref System.Windows.Forms.Form frm2)
{
  frm1.Text = "We have modified Form1";
  frm2.Text = "We have modified Form2";
}

Can you guess what is going to happen here? If you guessed that both lines written to the console will contain the text set in the PassReferenceTypes function then you are correct. Let’s see why.

As you know, when you create a Reference type the data for that type is created on the heap with a pointer placed on the stack. When you copy a Reference type variable, its pointer is copied into the new variable. The data on the heap is not replicated, so both variables point to the same data. Now here is the gotcha: when a method changes the members of a Reference type, passing it by value has the same visible effect as passing it by reference. Why is this, you might ask. Well, when you pass a variable by value what happens? The framework makes a copy of what is on the stack and sends it to the method. What does a Reference type have on the stack? A pointer to the heap. So if we pass a Reference type by value, the framework copies its stack value, which in this case is the pointer, and the method ends up working with the same object as the caller.

The two ways of passing differ only when the method assigns a new object to the parameter itself. By value, the caller’s variable still points to the original object; by reference, the caller’s variable is changed to point to the new one. This is very important to understand, because when you pass a reference type to any method, whether by value or by reference, the method can always reach and modify the data that is managed on the heap. Thus, you need to be careful when passing a reference type and be sure of what you’re intending when doing it.
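One case where passing a Reference type by value and by reference does not behave the same is when the method assigns a new object to its parameter. The following is a minimal sketch; the Gadget class and the ReassignDemo methods are hypothetical names used only for this illustration:

```csharp
using System;

// Hypothetical reference type, defined only for this illustration.
public class Gadget
{
    public string Text;
}

public static class ReassignDemo
{
    // The parameter is a copy of the caller's pointer, so pointing it
    // at a new object does not change the caller's variable.
    public static void ReplaceByValue(Gadget gadget)
    {
        gadget = new Gadget();
        gadget.Text = "Replaced";
    }

    // The parameter refers to the caller's variable itself, so the
    // caller ends up pointing at the new object.
    public static void ReplaceByRef(ref Gadget gadget)
    {
        gadget = new Gadget();
        gadget.Text = "Replaced";
    }

    public static string TextAfterReplaceByValue()
    {
        Gadget g = new Gadget();
        g.Text = "Original";
        ReplaceByValue(g);
        return g.Text;
    }

    public static string TextAfterReplaceByRef()
    {
        Gadget g = new Gadget();
        g.Text = "Original";
        ReplaceByRef(ref g);
        return g.Text;
    }

    public static void Main()
    {
        Console.WriteLine(TextAfterReplaceByValue()); // prints Original
        Console.WriteLine(TextAfterReplaceByRef());   // prints Replaced
    }
}
```

Changing a member through the parameter is visible to the caller either way; replacing the object is visible only with ref (ByRef in VB.Net).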

Value Type(s)

Below is a list of the common Value types that are defined in the .NET Framework. This list was pulled from the MSDN Library. Note that string (String in VB.Net) is not on it: strings are Reference types, although because they are immutable they often behave much like values.

C#
enum, struct, bool, byte, sbyte, char, decimal, double, float, int, uint, long, ulong, short, and ushort.

VB.Net
Enum, Structure, Boolean, Byte, Char, Date, Decimal, Double, Integer, Long, Short, and Single.
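If you are ever unsure which category a type falls into, you can ask the runtime through the Type.IsValueType property. This is a small sketch; the TypeCategoryDemo class and its Describe helper are made up for this example:

```csharp
using System;

public static class TypeCategoryDemo
{
    // Returns the type's name followed by the category the runtime reports.
    public static string Describe(Type type)
    {
        return type.Name + ": " + (type.IsValueType ? "value type" : "reference type");
    }

    public static void Main()
    {
        Console.WriteLine(Describe(typeof(int)));       // prints Int32: value type
        Console.WriteLine(Describe(typeof(DateTime)));  // prints DateTime: value type
        Console.WriteLine(Describe(typeof(DayOfWeek))); // prints DayOfWeek: value type
        Console.WriteLine(Describe(typeof(string)));    // prints String: reference type
        Console.WriteLine(Describe(typeof(object)));    // prints Object: reference type
    }
}
```

Note that string reports as a reference type even though, being immutable, it is often used as if it were a value.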

Conclusion

I hope that this document helps alleviate some of the confusion around Value and Reference types, and how they are used and managed within the .NET Framework. It is also my hope that it answers more questions than it raises. So, when venturing into the world of .NET, be careful of the types that you use.