Parameter passing in C#
Posted by
Jon Skeet
on
Thursday, February 02, 2006 (EST)
Many
people have become fairly confused about how parameters are passed in
C#, particularly with regard to reference types. This page should help
to clear up some of that confusion
This article has been viewed:
2,706
times
Technology:
C# Language.
|
Contents
Many people have become fairly
confused about how parameters are passed in C#, particularly with
regard to reference types. This page should help to clear up some of
that confusion. If you have any suggestions for how it can be made
clearer, please mail me. [^]
Microsoft also has a good page [^] about this topic (which I believe uses exactly the same terminology as this page - let me know if they appear to disagree).
Top
In .NET (and therefore C#) there are two main sorts of type: reference types and value types.
They act differently, and a lot of confusion about parameter passing is
really down to people not properly understanding the difference between
them. Here's a quick explanation:
A reference type is a type which has as its value a reference to the appropriate data rather than the data itself. For instance, consider the following code:
StringBuilder sb = new StringBuilder();
(I have used StringBuilder
as a random example of a reference type - there's nothing special about it.) Here, we declare a variable sb
, create a new StringBuilder
object, and assign to sb
a reference to the object. The value of sb
is not
the object itself, it's the reference. Assignment involving reference
types is simple - the value which is assigned is the value of the
expression/variable - i.e. the reference. This is demonstrated further
in this example:
StringBuilder first = new StringBuilder();
StringBuilder second = first;
Here we declare a variable first
, create a new StringBuilder
object, and assign to first
a reference to the object. We then assign to second
the value of first
.
This means that they both refer to the same object. They are still,
however, independent variables themselves. Changing the value of first
will not change the value of second
- although while their values are still references to the same object, any changes made to the object through the first
variable will be visible through the second
variable. Here's a demonstration of that:
StringBuilder first = new StringBuilder();
StringBuilder second = first;
first.Append ("hello");
first = null;
Console.WriteLine (second);
hello
Here, we declare a variable first
, create a new StringBuilder
object, and assign to first
a reference to the object. We then assign to second
the value of first
. We then call the Append
method on this object via the reference held in the first
variable. After this, we set the first
variable to null
(a value which doesn't refer to any object). Finally, we print out the results of calling the ToString
method on the StringBuilder object via the reference held in the second
variable. hello
is displayed, demonstrating that even though the value of first
has changed, the data within the object it used to refer to hasn't - and second
still refers to that object.
Class types, interface types, delegate types and array types are all reference types.
Top
While reference types have a layer of indirection between the variable and the real data, value types
don't. Variables of a value type directly contain the data. Assignment
of a value type involves the actual data being copied. Take a simple
struct, for example:
public struct IntHolder
{
public int i;
}
Wherever there is a variable of type IntHolder
,
the value of that variable contains all the data - in this case, the
single integer value. An assignment copies the value, as demonstrated
here:
IntHolder first = new IntHolder();
first.i=5;
IntHolder second = first;
first.i=6;
Console.WriteLine (second.i);
Output:
5
Here, second.i
has the value 5, because that's the value first.i
has when the assignment second=first
occurs - the values in second
are independent of the values in first
apart from when the assignment takes place.
Simple types (such as float
, int
, char
), enum types and struct types are all value types.
Note that many types (such as string
appear in some ways to be value types, but in fact are reference types. These are known as immutable types. This means that once an instance has been constructed, it can't be changed. This allows a reference type to act similarly
to a value type in some ways - in particular, if you hold a reference
to an immutable object, you can feel comfortable in returning it from a
method or passing it to another method, safe in the knowledge that it
won't be changed behind your back. This is why, for instance, the string.Replace
doesn't change the string it is called on, but returns a new instance
with the new string data in - if the original string were changed, any
other variables holding a reference to the string would see the change,
which is very rarely what is desired.
Constrast this with a mutable (changeable) type such as ArrayList
- if a method returns the ArrayList
reference stored in an instance variable, the calling code could then
add items to the list without the instance having any say about it,
which is usually a problem. Having said that immutable reference types
act like value types, they are not value types, and shouldn't be thought of as actually being value types.
For more information about value types, reference types, and where the data for each is stored in memory, please see my other article about the subject [^].
Top
What would you expect to see from the code above if the declaration of the IntHolder
type was as a class instead of a struct? If you don't understand why the output would be 6
, please re-read both preambles and mail me [^] if it's still not clear - if you don't get it, it's my fault, not yours, and I need to improve this page. If you do understand it, parameter passing becomes very easy to understand - read on.
Top
There are four different kinds of parameters in C#: value parameters (the default), reference parameters (which use the ref
modifier), output parameters (which use the out
modifier), and parameter arrays (which use the params
modifier). You can use any of them with both value and reference types.
When you hear the words "reference" or "value" used (or use them
yourself) you should be very clear in your own mind whether you mean
that a parameter is a reference or value parameter, or whether you mean
that the type involved is a reference or value type. If you can keep
the two ideas separated, they're very simple.
Top
By default, parameters are value
parameters. This means that a new storage location is created for the
variable in the function member declaration, and it starts off with the
value that you specify in the function member invocation. If you change
that value, that doesn't alter any variables involved in the
invocation. For instance, if we have:
void Foo (StringBuilder x)
{
x = null;
}
...
StringBuilder y = new StringBuilder();
y.Append ("hello");
Foo (y);
Console.WriteLine (y==null);
Output:
False
The value of y
isn't changed just because x
is set to null
.
Remember though that the value of a reference type variable is the
reference - if two reference type variables refer to the same object,
then changes to the data in that object will be seen via both variables. For example:
void Foo (StringBuilder x)
{
x.Append (" world");
}
...
StringBuilder y = new StringBuilder();
y.Append ("hello");
Foo (y);
Console.WriteLine (y);
Output:
hello world
After calling Foo
, the StringBuilder object that y
refers to contains "hello world", as in Foo
the data " world" was appended to that object via the reference held in x
.
Now consider what happens when value
types are passed by value. As I said before, the value of a value type
variable is the data itself. Using the previous definition of the
struct IntHolder
, let's write some code similar to the above:
void Foo (IntHolder x)
{
x.i=10;
}
...
IntHolder y = new IntHolder();
y.i=5;
Foo (y);
Console.WriteLine (y.i);
Output:
5
When Foo
is called, x
starts off as a struct with value i=5
. Its i
value is then changed to 10. Foo
knows nothing about the variable y
, and after the method completes, the value in y
will be exactly the same as it was before (i.e. 5).
As we did earlier, check that you understand what would happen if IntHolder
was declared as a class instead of a struct. You should understand why y.i
would be 10 after calling Foo
in that case.
Top
Reference parameters don't pass the
values of the variables used in the function member invocation - they
use the variables themselves. Rather than creating a new storage
location for the variable in the function member declaration, the same
storage location is used, so the value of the variable in the function
member and the value of the reference parameter will always be the
same. Reference parameters need the ref
modifier as part
of both the declaration and the invocation - that means it's always
clear when you're passing something by reference. Let's look at our
previous examples, just changing the parameter to be a reference
parameter:
void Foo (ref StringBuilder x)
{
x = null;
}
...
StringBuilder y = new StringBuilder();
y.Append ("hello");
Foo (ref y);
Console.WriteLine (y==null);
Output:
True
Here, because a reference to y
is passed rather than its value, changes to the value of parameter x
are immediately reflected in y
. In the above example, y
ends up being null
. Compare this with the result of the same code without the ref
modifiers.
Now consider the struct code we had earlier, but using reference parameters:
void Foo (ref IntHolder x)
{
x.i=10;
}
...
IntHolder y = new IntHolder();
y.i=5;
Foo (ref y);
Console.WriteLine (y.i);
Output:
10
The two variables are sharing a storage location, so changes to x
are also visible through y
, so y.i
has the value 10 at the end of this code.
Top
You
may have noticed that the last example, passing a struct by reference,
had the same effect in this code as passing a class by value. This
doesn't mean that they're the same thing, however. Consider the
following code:
void Foo (??? IntHolder x)
{
x = new IntHolder();
}
...
IntHolder y = new IntHolder();
y.i=5;
Foo (??? y);
In the case where IntHolder
is a struct (i.e. a value type) and the parameter is a reference parameter (i.e. replace ???
with ref
above), y
ends up being a new IntHolder
value - i.e. y.i
is 0. In the case where IntHolder
is a class (i.e. a reference type) and the parameter is a value parameter (i.e. remove ???
above), the value of y
isn't changed - it's a reference to the same object it was before the function member call. This
difference is absolutely crucial to understanding parameter passing in
C#, and is why I believe it is highly confusing to say that objects are
passed by reference by default instead of the correct statement that
object references are passed by value by default.
Top
Like reference parameters, output
parameters don't create a new storage location, but use the storage
location of the variable specified on the invocation. Output parameters
need the out
modifier as part of both the declaration and
the invocation - that means it's always clear when you're passing
something as an output parameter.
Output parameters are very similar to reference parameters. The only differences are:
Here is some example code showing this, with an int
parameter (int
is a value type, but if you understood reference parameters properly,
you should be able to see what the behaviour for reference types is):
void Foo (out int x)
{
// Can't read x here - it's considered unassigned
// Assignment - this must happen before the method can complete normally
x = 10;
// The value of x can now be read:
int a = x;
}
...
// Declare a variable but don't assign a value to it
int y;
// Pass it in as an output parameter, even though its value is unassigned
Foo (out y);
// It's now assigned a value, so we can write it out:
Console.WriteLine (y);
Output:
10
Top
Parameter arrays allow a variable number
of arguments to be passed into a function member. The definition of the
parameter has to include the params
modifier, but the use
of the parameter has no such keyword. A parameter array has to come at
the end of the list of parameters, and must be a single-dimensional
array. When using the function member, any number of parameters
(including none) may appear in the invocation, so long as the
parameters are each compatible with the type of the parameter array.
Alternatively, a single array may be passed, in which case the
parameter acts just as a normal value parameter. For example:
void ShowNumbers (params int[] numbers)
{
foreach (int x in numbers)
{
Console.Write (x+" ");
}
Console.WriteLine();
}
...
int[] x = {1, 2, 3};
ShowNumbers (x);
ShowNumbers (4, 5);
Output:
1 2 3
4 5
In the first invocation, the variable x
is passed by value, as it's just an array. In the second invocation, a
new array of ints is created containing the two values specified, and a
reference to this array is passed.
Top
Some informal definitions and summaries of terms:
- Function member
- A function member is a method,
property, event, indexer, user-defined operator, instance constructor,
static constructor, or destructor.
- Output parameter
- A parameter very similar to a reference parameter, but with different definite assignment rules.
- Reference parameter (pass-by-reference semantics)
- A parameter which shares the
storage location of the variable used in the function member
invocation. As they share the same storage location, they always have
the same value (so changing the parameter value changes the invocation
variable value).
- Reference type
- Type where the value of a variable/expression of that type is a reference to an object rather than the object itself.
- Storage location
- A portion of memory holding the value of a variable.
- Value parameter (the default semantics, which are pass-by-value)
- A value parameter that has its
own storage location, and thus its own value. The initial value is the
value of the expression used in the function member invocation.
- Value type
- Type where the value of a variable/expression of that type is the object data itself.
- Variable
- Name associated with a storage
location and type. (Usually a single variable is associated with a
storage location. The exceptions are for reference and output
parameters.)
Top