Visitor Design Pattern: revisited for .NET
This is a review of the Visitor Design Pattern, in the light of .NET and C#, as well as how it can be used and expanded in this platform.
Classes and collections of classes are often used in OO programming. The Visitor pattern allows the developer to define new operations on the elements without changing their interface. We analyze the pattern as originally presented in the GoF book [1].
GoF Review
Let's first have a look at the design pattern in general.
Motivation
Consider the following structure:
This is a very common object structure. Customers, Orders and Items represent collection objects, and Customer, Order and Item are the leaf components. Suppose we want to perform a common operation on each element, such as BuildReport. Of course, each element would generate different information for its own report. We would have to add a BuildReport method to each element in the tree and then traverse the tree, calling that method on each element.
Wait, you might be thinking: why not centralize the BuildReport method in another class, make it general enough to work with any collection type, and just iterate over the elements and build the report there? Well, that would require a lengthy method with a big if..else block to output different information for each element type:
public string BuildReport(ICollection collection)
{
    StringWriter report = new StringWriter();
    foreach (object item in collection)
    {
        // One branch per element type: this is what grows out of control.
        if (item is Customer)
        {
            Customer customer = item as Customer;
            report.WriteLine(customer.FirstName + ", " + customer.LastName);
            report.WriteLine(BuildReport(customer.Orders));
        }
        else if (item is Order)
        {
            Order order = item as Order;
            report.WriteLine(order.Date + " - " + order.Amount);
            report.WriteLine(BuildReport(order.Items));
        }
        else if (item is Item)
        {
            Item orderitem = item as Item;
            report.WriteLine(orderitem.Code + " - " + orderitem.Quantity);
        }
    }
    return report.ToString();
}
And we are building a very simple report with a quite simple hierarchy. If this were the system's main reporting class, we would have to be prepared to add an else if block for each type of object in the system. And what if the passed collection contains collections in turn? Things get intricate really fast and we end up with spaghetti code.
Adding the BuildReport method to the element itself simplifies this task, and puts the logic in a place where it is easy to find and maintain. So we decide to add the method to every class in the hierarchy.
Now suppose that we want to add another method to them, such as DumpToFile, which would save the object's information to disk. We would have to add the new method to all the classes in the tree, again.
The conclusion: whenever we want to perform new operations on an existing hierarchy of objects, we need to change all the elements' classes, and recompile everything.
Solution
The solution is to separate the methods from the elements and add them to a separate interface, which is called the Visitor interface. This interface contains a method to process each type of element in the hierarchy, and is implemented by all the visitors, e.g. a BuildReportVisitor, a DumpToFileVisitor, etc.:
public interface IVisitor
{
    void VisitCustomer(Customer customer);
    void VisitOrder(Order order);
    void VisitItem(Item item);
}
Each method name includes the element type; in a language with method overloading, all of them could be named Visit alone. Whether to add Visit methods for the collection elements (Customers, Orders and Items) is a design decision, not a requirement.
In the Node hierarchy (that is, the original object structure), we replace all the methods with a single one, which is abstracted in an interface to be implemented in all the nodes:
public interface IVisitable
{
    void Accept(IVisitor visitor);
}
For an element such as Customer, the method implementation would be:
public virtual void Accept(IVisitor visitor)
{
    visitor.VisitCustomer(this);
    _orders.Accept(visitor);
}
Note that Customer calls Accept on its Orders object in turn. For a composite element (such as Customers):
public virtual void Accept(IVisitor visitor)
{
    visitor.VisitCustomers(this);
    foreach (IVisitable visitable in this._children)
        visitable.Accept(visitor);
}
Note that each element calls the VisitXXX method that matches its own type. If we were using method overloading there would be no need for that.
Now the classes interaction is:
Note that the visitor is responsible for accumulating results and returning them to the calling application. Now the code for building reports (in this example) is centralized in a single class. We can add behavior to the classes in the hierarchy by creating additional visitors to pass to them.
This behavior is called the Double-Dispatch pattern, because the method finally executed depends both on the caller's type (the visitor) and on the element's type (the visitable element). See [3].
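To make the interaction concrete, here is a minimal sketch of a report-building visitor. The element classes are hypothetical, stripped-down stand-ins (only customers and orders, with plain fields), and the Report property is an assumption of this sketch, not part of the original design.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public interface IVisitor
{
    void VisitCustomer(Customer customer);
    void VisitOrder(Order order);
}

public interface IVisitable
{
    void Accept(IVisitor visitor);
}

// Hypothetical, simplified elements: only the members the report needs.
public class Order : IVisitable
{
    public string Date = "";
    public decimal Amount;

    // Second dispatch: the element picks the visitor method for its own type.
    public void Accept(IVisitor visitor) { visitor.VisitOrder(this); }
}

public class Customer : IVisitable
{
    public string FirstName = "";
    public string LastName = "";
    public List<Order> Orders = new List<Order>();

    public void Accept(IVisitor visitor)
    {
        visitor.VisitCustomer(this);
        foreach (Order order in Orders)
            order.Accept(visitor);
    }
}

// The visitor accumulates the report text as the elements call back into it.
public class BuildReportVisitor : IVisitor
{
    private readonly StringWriter _report = new StringWriter();

    public void VisitCustomer(Customer customer)
    {
        _report.WriteLine(customer.FirstName + ", " + customer.LastName);
    }

    public void VisitOrder(Order order)
    {
        _report.WriteLine(order.Date + " - " + order.Amount);
    }

    public string Report { get { return _report.ToString(); } }
}
```

The caller creates a BuildReportVisitor, passes it to customer.Accept, and then reads Report. A DumpToFileVisitor would plug into the same Accept methods without touching Customer or Order.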
Consequences
- Adding new operations is easy
- Related operations are centralized in a single class
- Adding a new element is difficult, as it requires changes in all the existing visitors
- State can be accumulated by the visitor
- Visitors must implement the complete interface even if they don't use a given concrete element (e.g. if BuildReportVisitor doesn't build report information for Orders).
For more information, see the GoF book [1].
Our .NET Implementation
We based our implementation on Jeremy Blosser's proposal [2].
As we have seen, the visitor class can be implemented using two approaches:
- Use one method for each product type:
we use a method name according to the product type, for example, VisitCustomer, VisitOrder, VisitItem, etc. As a consequence, the IVisitor interface, to be implemented by all the visitors, needs to have one method definition for each product type. This makes it difficult to extend the product tree, as changes to it require changes in the interface and in all the visitor classes.
- Use method overloading:
we use only one method name, and let the method overloading mechanism pick the right method to use. We still have the problem that all the visitors must implement a method for each product type, to avoid runtime exceptions, even if they don't need the method at all.
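As an illustration of the second approach, the following sketch declares one Visit overload per product type in the interface. The Customer and Order classes and the CustomerCounter visitor are hypothetical names chosen for this example; note the empty overload the visitor is forced to write.

```csharp
using System;

public interface IVisitor
{
    void Visit(Customer customer);
    void Visit(Order order);
}

public interface IVisitable
{
    void Accept(IVisitor visitor);
}

public class Customer : IVisitable
{
    // 'this' is statically a Customer, so the compiler binds to Visit(Customer).
    public void Accept(IVisitor visitor) { visitor.Visit(this); }
}

public class Order : IVisitable
{
    // Same call text, but here it binds to Visit(Order).
    public void Accept(IVisitor visitor) { visitor.Visit(this); }
}

// Hypothetical visitor that only cares about customers.
public class CustomerCounter : IVisitor
{
    public int Count;

    public void Visit(Customer customer) { Count++; }

    // Required by the interface even though this visitor ignores orders.
    public void Visit(Order order) { }
}
```

The overload is chosen at compile time inside each Accept method, which is why every product type still needs its own entry in the interface.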
Extending the pattern with Reflection
We will use the second approach (method overloading) and simultaneously provide an implementation that lifts its limitation on descendant classes, that is, the requirement of implementing a method for each product type even when it is not used.
Basically, we are going to provide a single method in the IVisitor interface:
public interface IVisitor
{
    void Visit(object visitable);
}
A concrete visitor then adds a typed overload for each product it cares about, for example:
public virtual void Visit(Customer customer)
{
    // Process the customer element.
}
However, the product only knows the visitor as an IVisitor. When it calls the visitor's Visit method passing itself as a parameter, it is actually calling the method that receives an object as parameter, not the overload typed for its own class, because overloads are resolved at compile time against the interface.
The effect is that the method we have just seen is never called, and the call is always dispatched to the IVisitor.Visit method implementation.
To avoid this, we will provide an abstract implementation of the IVisitor interface which uses reflection to overcome this limitation. Here is the complete class code, which we will analyze in detail:
using System;
using System.Reflection;

public abstract class Visitor : IVisitor
{
    private MethodInfo _lastmethod = null;
    private object _lastvisitable = null;

    public void Visit(object visitable)
    {
        try
        {
            // Look for a public Visit overload typed exactly for the visitable.
            MethodInfo method = this.GetType().GetMethod("Visit",
                BindingFlags.ExactBinding | BindingFlags.Public | BindingFlags.Instance,
                Type.DefaultBinder, new Type[] { visitable.GetType() }, new ParameterModifier[0]);
            if (method != null)
            {
                // Avoid StackOverflow exceptions by executing only if the method and visitable
                // are different from the last parameters used.
                if (method != _lastmethod || visitable != _lastvisitable)
                {
                    _lastmethod = method;
                    _lastvisitable = visitable;
                    method.Invoke(this, new object[] { visitable });
                }
            }
        }
        catch (Exception ex)
        {
            // Invoke wraps exceptions thrown by the target; surface the original one.
            if (ex.InnerException != null)
                throw ex.InnerException;
            // Plain 'throw' keeps the original stack trace, unlike 'throw ex'.
            throw;
        }
    }
}
The first step is to obtain the MethodInfo instance:
MethodInfo method = this.GetType().GetMethod("Visit",
    BindingFlags.ExactBinding | BindingFlags.Public | BindingFlags.Instance,
    Type.DefaultBinder, new Type[] { visitable.GetType() }, new ParameterModifier[0]);
We call GetType on the current object, which returns the Type object corresponding to the visitor being used. The GetMethod method receives the following parameters:
- name: this is the method name, "Visit".
- bindingAttr: this is a flag of type BindingFlags which specifies options for the member search. The Public and Instance flags must be used together, and ExactBinding is used to avoid reentrancy on this method. This may happen when a specific product is not implemented in the visitor, and thus we could get a match against the IVisitor.Visit implementation, which is the method we are executing. This could result in a stack overflow.
- binder: this is the binder used to find the method. Passing null in this parameter has the same effect as passing Type.DefaultBinder, which is the default value.
- types: an array of Type objects representing the number, order, and type of the parameters for the method to get.
- modifiers: this parameter adds information about the types parameter, but isn't used by the DefaultBinder, so we pass a zero-length array.
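The effect of ExactBinding described in the list above can be checked in isolation. The sketch below is an assumption-laden illustration: SampleVisitor, LookupDemo and Find are hypothetical names, and the lookup simply mirrors the GetMethod call from the article with the flags passed in by the caller.

```csharp
using System;
using System.Reflection;

public class Customer { }
public class Order { }

// Hypothetical visitor with one typed overload plus the catch-all.
public class SampleVisitor
{
    public void Visit(object visitable) { }
    public void Visit(Customer customer) { }
}

public static class LookupDemo
{
    // Mirrors the article's GetMethod call, with the binding flags supplied by the caller.
    public static MethodInfo Find(Type argumentType, BindingFlags flags)
    {
        return typeof(SampleVisitor).GetMethod("Visit", flags,
            Type.DefaultBinder, new Type[] { argumentType }, new ParameterModifier[0]);
    }
}
```

With ExactBinding set, looking up Visit for typeof(Order) should yield null, because no overload declares an Order parameter. Without the flag, the default binder is free to fall back to Visit(object), which is exactly the reentrant match the article wants to avoid.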
If a matching method exists, GetMethod returns its MethodInfo instance. The next lines deal with a special case: if a concrete visitor calls base.Visit in a method which is not an override (of course this is a developer's mistake), the compiler will not complain, because the base class has a method which satisfies the call: the IVisitor.Visit implementation. But this results in the GetMethod call matching the same typed method that issued the call, causing a stack overflow. As this is a very subtle mistake of the concrete visitor developer, we decided to prevent it from happening, even though the right solution is to remove the wrong call to the base class method.
This is achieved by saving a reference to the last method called and the last visitable object used. If the current ones are the same, we just skip the method call. Beware that a change in either of the two means we have a new method call:
if (method != _lastmethod || visitable != _lastvisitable)
Finally, we call Invoke on the MethodInfo instance, passing the current object and the array of parameters (the visitable object):
method.Invoke(this, new object[] { visitable });
Any exception thrown by the invoked method is wrapped in a TargetInvocationException. The underlying exception thrown by the method is placed inside the InnerException property. We check for its presence and rethrow it if appropriate.
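To show what the reflective base class buys us, here is a small usage sketch. The Visitor class is a condensed copy of the one above (exception unwrapping omitted for brevity); Customer, Order and CustomerNamesVisitor are hypothetical types invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public interface IVisitor
{
    void Visit(object visitable);
}

// Condensed copy of the article's base class (exception unwrapping omitted).
public abstract class Visitor : IVisitor
{
    private MethodInfo _lastmethod = null;
    private object _lastvisitable = null;

    public void Visit(object visitable)
    {
        MethodInfo method = this.GetType().GetMethod("Visit",
            BindingFlags.ExactBinding | BindingFlags.Public | BindingFlags.Instance,
            Type.DefaultBinder, new Type[] { visitable.GetType() }, new ParameterModifier[0]);
        if (method != null && (method != _lastmethod || visitable != _lastvisitable))
        {
            _lastmethod = method;
            _lastvisitable = visitable;
            method.Invoke(this, new object[] { visitable });
        }
    }
}

// Hypothetical elements; each one just hands itself to the visitor.
public class Customer
{
    public string LastName = "";
    public void Accept(IVisitor visitor) { visitor.Visit(this); }
}

public class Order
{
    public void Accept(IVisitor visitor) { visitor.Visit(this); }
}

// Only customers matter to this visitor, so it declares a single overload.
public class CustomerNamesVisitor : Visitor
{
    public readonly List<string> Names = new List<string>();

    public virtual void Visit(Customer customer)
    {
        Names.Add(customer.LastName);
    }
}
```

Passing an Order to this visitor does nothing, since no Visit(Order) overload exists and the lookup returns null. One side effect of the guard is worth keeping in mind: the two saved fields are never reset, so visiting the same object twice in a row with the same visitor instance skips the second call.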
Samples
NMatrix XGoF (SourceForge site) uses this visitor implementation to build a tree based on an XSD schema. The new hierarchy, which resembles the original schema, is ready to be visited by multiple components which perform automatic code generation based on the element being visited. For example, there are ClassBuilder, CollectionBuilder, PropertyBuilder and MethodBuilder visitors. Each of them is responsible for generating code for a specific part of the output classes.
Bibliography:
[1] - Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley.
[2] - Reflect on the Visitor design pattern.
[3] - A Pattern Language to Visitors.