Critical Development

Language design, framework development, UI design, robotics and more.

The Visitor Design Pattern in C# 3.0

Posted by Dan Vanderboom on April 9, 2008

I use many common design patterns on a regular basis (composite, MVC/MVP, adapter, strategy, factory, chain of responsibility, etc.), but I’ve never come across a situation where I felt Visitor in its classic Gang of Four (GoF) definition made sense.  I had read about it, but the need to define interfaces not only for the Visitor classes (that’s not so bad) but also for the elements being visited makes it seem overly complex, and therefore tainted for me.  What if you don’t own the source code to the elements, and don’t want to inherit from existing types (assuming they’re not sealed) just to implement an IVisitorElement interface?

I wanted a less intrusive way of visiting any set of objects, without making any special demands on or assumptions about their types, and I suspected that new features in C# 3.0 would provide a way to make it elegant and terse.  What’s needed in essence is to visit each object in a collection with a common function or object, and to perform some action, transform the object in some way, and/or calculate some end result (usually an aggregation).  Can we do that without having to implement special interfaces or disrupting the code model in place?

For the sake of completeness and to serve as a baseline for other implementations, I’ll show you what the classic Visitor pattern looks like.

[Figure: UML diagram of the Visitor design pattern]

Here is the code that corresponds to this diagram.

using System;
using System.Collections.Generic;

interface IEmployeeVisitor
{
    void Visit(Employee employee);
    void Visit(Manager manager);
}

interface IVisitorElement
{
    void Accept(IEmployeeVisitor Visitor);
}

class EmployeeCollection : List<Employee>
{
    public void Accept(IEmployeeVisitor Visitor)
    {
        foreach (Employee employee in this)
        {
            employee.Accept(Visitor);
        }
    }
}

class Employee : IVisitorElement
{
    public decimal Income;
    public Employee(decimal income) { Income = income; }

    public virtual void Accept(IEmployeeVisitor Visitor)
    {
        Visitor.Visit(this);
    }
}

class Manager : Employee, IVisitorElement
{
    public decimal Bonus;
    public Manager(decimal income, decimal bonus) : base(income) { Bonus = bonus; }

    public override void Accept(IEmployeeVisitor Visitor)
    {
        Visitor.Visit(this);
    }
}

class SumIncomeVisitor : IEmployeeVisitor
{
    public decimal TotalIncome = 0;
    public void Visit(Employee employee) { TotalIncome += employee.Income; }
    public void Visit(Manager manager) { TotalIncome += manager.Income + manager.Bonus; }
}

class Program
{
    static void Main()
    {
        EmployeeCollection employees = new EmployeeCollection();
        employees.Add(new Employee(100000));
        employees.Add(new Employee(125000));
        employees.Add(new Manager(210000, 35000));

        SumIncomeVisitor visitor = new SumIncomeVisitor();
        employees.Accept(visitor);
        decimal result = visitor.TotalIncome;

        Console.WriteLine(result);
        Console.ReadLine();
    }
}

 
The first major disadvantage is the amount of plumbing that must be in place, and the two-way dependency it creates between visitors and the objects to be visited.  Though specific types aren’t hard-coded, the conceptual two-way dependency implied by the interfaces’ knowledge of each other requires forethought and special accommodations on both sides from the beginning.  Management of dependencies is always important; how well we do it determines how gracefully applications absorb complexity as they grow.  So whenever possible I ensure that dependencies run in one direction.  This creates natural segmentation and layering, and ensures that components can be pulled apart from each other rather than congealing into something like a tangled ball of Christmas tree lights.

Instead of passing a collection of Employee objects to some calculating Visitor, we tell the Employee to accept a Visitor object, and the Employee then just turns around and calls the Visitor.  That by itself seems rather indirect and convoluted.  Visiting a single element isn’t very exciting.  Nothing very interesting happens until you have a whole bunch of things to work with.  So in order to visit a collection, a custom collection type is defined with an Accept method that in turn calls Accept on each Employee.  This custom collection is yet another type we’re required to write when otherwise a List<Employee> or something similar would suffice.  And what happens when your data structure is something other than a basic list?  What if you have a tree of objects you’d like to visit?  Would you then have to implement a tree data structure that is visitor friendly?  How many aggregation types do you want to reinvent with visitation specifically in mind?

The rest of it isn’t so bad.  The SumIncomeVisitor class contains both the processing logic and the state for any calculations needed by that Visitor.  One of these is instantiated (another extra step), passed to the collection’s Accept method, and therefore executed against all employees in the collection.  After all objects are visited, the SumIncomeVisitor object contains the final result.  This all works, but seems pretty clunky.  Perhaps the pattern is more interesting if IVisitorElement classes provide more sophisticated Accept implementations.  I can’t think of any examples off-hand, but I’ll be thinking about and looking for them.

The code above is just shy of 80 lines long.  Can we accomplish exactly the same goal with less code, more simply and clearly?

using System;
using System.Collections.Generic;

class Employee
{
    public decimal Income;
    public Employee(decimal income) { Income = income; }
}

class Manager : Employee
{
    public decimal Bonus;
    public Manager(decimal income, decimal bonus) : base(income) { Bonus = bonus; }
}

class Program
{
    static void Main()
    {
        List<Employee> employees = new List<Employee>();
        employees.Add(new Employee(100000));
        employees.Add(new Employee(125000));
        employees.Add(new Manager(210000, 35000));

        decimal TotalIncome = 0;
        employees.ForEach(e => SumEmployeeIncome(e, ref TotalIncome));

        Console.WriteLine(TotalIncome);
        Console.ReadLine();
    }

    static void SumEmployeeIncome(Employee employee, ref decimal TotalIncome)
    {
        TotalIncome += employee.Income;

        if (employee is Manager)
            TotalIncome += (employee as Manager).Bonus;
    }
}

 
In this code, you’ll notice a few simplifications:
  1. There are no IVisitorElement or IEmployeeVisitor interfaces.
  2. Employee and Manager types exist without any knowledge of or explicit support for being visited.
  3. No custom collection is required, so a basic List<Employee> is used.

In order to make this work, we need the same basic things that we needed before: visiting/processing logic, and a place to store state for that processing.  In the second approach, the state is stored in the TotalIncome variable within the Main method, where the calculation is being requested, and the processing logic is kept in another method of the same class.  I could have declared TotalIncome as a class variable, but I prefer to keep any “scratch pad” data used in a calculation in as narrow a scope as possible.  In the classic Visitor pattern, the data is encapsulated with the processing logic.  By calling a method with a secondary ref parameter, I can declare TotalIncome within the Main method and avoid cluttering the class definition with data that’s only relevant to one method’s logic.  This is a lighter-weight, more in-line approach than defining separate types and having to instantiate a Visitor object (Visitor Object vs. Visitor Method).

The actual mechanism for visiting every object is the ForEach method.  The List<T> class includes a very useful ForEach method that allows you to pass in an Action<T> delegate to execute a method for each element.  ForEach can’t take a method with our second ref parameter; it can only accept an Action<T> delegate.  The lambda expression e => SumEmployeeIncome(e, ref TotalIncome) creates an anonymous method that does in fact match Action<T>.  The parameter e is of type Employee because the employees collection is List<Employee>, which means the Employee type is inferred for Action<Employee>.  The anonymous method represented by the lambda then calls SumEmployeeIncome, passing the Employee e object through as well as the TotalIncome state to be transformed on successive calls for each Employee.
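To make the inference step easier to see, the lambda can be assigned to a named Action<Employee> variable before it is handed to ForEach.  This is only a sketch of the same technique: the LambdaDemo class, the Total method, and the variable name visit are made up for illustration and are not part of the original listing.

```csharp
using System;
using System.Collections.Generic;

class Employee
{
    public decimal Income;
    public Employee(decimal income) { Income = income; }
}

class Manager : Employee
{
    public decimal Bonus;
    public Manager(decimal income, decimal bonus) : base(income) { Bonus = bonus; }
}

// Hypothetical wrapper class, used here only to host the example.
static class LambdaDemo
{
    public static void SumEmployeeIncome(Employee employee, ref decimal totalIncome)
    {
        totalIncome += employee.Income;

        Manager manager = employee as Manager;
        if (manager != null)
            totalIncome += manager.Bonus;
    }

    public static decimal Total(List<Employee> employees)
    {
        decimal totalIncome = 0;

        // The lambda captures the local totalIncome and adapts the
        // two-parameter method to the one-parameter Action<Employee> shape.
        Action<Employee> visit = e => SumEmployeeIncome(e, ref totalIncome);

        employees.ForEach(visit);
        return totalIncome;
    }
}

class Program
{
    static void Main()
    {
        List<Employee> employees = new List<Employee>();
        employees.Add(new Employee(100000));
        employees.Add(new Employee(125000));
        employees.Add(new Manager(210000, 35000));

        Console.WriteLine(LambdaDemo.Total(employees)); // prints 470000
    }
}
```

This works because totalIncome is a local variable.  A lambda cannot capture a ref or out parameter of its enclosing method, so the running total has to be declared in the method that creates the lambda.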

Finally, SumEmployeeIncome acts as the Visitor.  Different logic can be performed for different types where inheritance is involved, as it is with this sample, by testing for types using the is operator.  This is in contrast to the dual Visit methods taking Employee and Manager types respectively.  Actually, the classic Visitor pattern could have used the same approach in this regard.
 
Where more complex state is needed for processing, a new Visitor-state type could be created to support the processing, and by using an object for this purpose, it wouldn’t be necessary to declare or pass the parameter by reference.  Another option would simply be to declare multiple ref parameters.
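As a rough sketch of the state-object option, the following uses a small class to carry several running values at once.  The IncomeTotals type, its fields, and the Accumulate method are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;

class Employee
{
    public decimal Income;
    public Employee(decimal income) { Income = income; }
}

class Manager : Employee
{
    public decimal Bonus;
    public Manager(decimal income, decimal bonus) : base(income) { Bonus = bonus; }
}

// Hypothetical state holder; the name and fields are illustrative only.
class IncomeTotals
{
    public decimal Salaries;
    public decimal Bonuses;
    public int ManagerCount;
}

static class StateObjectDemo
{
    // No ref parameter is needed: IncomeTotals is a reference type,
    // so every call mutates the same instance.
    public static void Accumulate(Employee employee, IncomeTotals totals)
    {
        totals.Salaries += employee.Income;

        Manager manager = employee as Manager;
        if (manager != null)
        {
            totals.Bonuses += manager.Bonus;
            totals.ManagerCount++;
        }
    }

    public static IncomeTotals Run(List<Employee> employees)
    {
        IncomeTotals totals = new IncomeTotals();
        employees.ForEach(e => Accumulate(e, totals));
        return totals;
    }
}

class Program
{
    static void Main()
    {
        List<Employee> employees = new List<Employee>();
        employees.Add(new Employee(100000));
        employees.Add(new Employee(125000));
        employees.Add(new Manager(210000, 35000));

        IncomeTotals totals = StateObjectDemo.Run(employees);
        Console.WriteLine(totals.Salaries);     // prints 435000
        Console.WriteLine(totals.Bonuses);      // prints 35000
        Console.WriteLine(totals.ManagerCount); // prints 1
    }
}
```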
 

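The List<T>.ForEach method is awfully nice, but what if you’re working with another data structure, such as an array, an ArrayList, a LinkedList<T>, or even a custom Tree<T>?  Defining a simple extension method on IEnumerable<T> provides this tool for any collection that implements that interface.  (A non-generic collection such as ArrayList only implements IEnumerable, so it needs a call to Cast<T>() or OfType<T>() first.)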

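// Extension methods must be declared in a non-generic, non-nested static class.
static class EnumerableExtensions
{
    public static void ForEach<T>(this IEnumerable<T> collection, Action<T> action)
    {
        // Nothing to do without an action; check once rather than per item.
        if (action == null)
            return;

        foreach (T item in collection)
            action(item);
    }
}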

That’s better.  Now if only that extension method had been defined in the first place, the specific one in the List<T> class wouldn’t be necessary.
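Here is a minimal sketch of that extension method applied to several kinds of collections.  The TreeNode<T> type and its DepthFirst iterator are hypothetical (the .NET Framework has no built-in Tree<T>), and the ForEachDemo class exists only to host the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

static class EnumerableExtensions
{
    public static void ForEach<T>(this IEnumerable<T> collection, Action<T> action)
    {
        if (action == null)
            return;

        foreach (T item in collection)
            action(item);
    }
}

// Hypothetical minimal tree type; not part of the .NET Framework.
class TreeNode<T>
{
    public T Value;
    public List<TreeNode<T>> Children = new List<TreeNode<T>>();
    public TreeNode(T value) { Value = value; }

    // Pre-order traversal exposed as a lazy sequence, so the tree needs
    // no Accept method or any other knowledge of being visited.
    public IEnumerable<T> DepthFirst()
    {
        yield return Value;
        foreach (TreeNode<T> child in Children)
            foreach (T value in child.DepthFirst())
                yield return value;
    }
}

static class ForEachDemo
{
    // The same visiting logic works for anything that is an IEnumerable<int>.
    public static int Sum(IEnumerable<int> numbers)
    {
        int total = 0;
        numbers.ForEach(n => total += n);
        return total;
    }
}

class Program
{
    static void Main()
    {
        int[] array = { 1, 2, 3 };
        LinkedList<int> linked = new LinkedList<int>(new int[] { 4, 5 });

        ArrayList legacy = new ArrayList();
        legacy.Add(6);
        legacy.Add(7);

        TreeNode<int> root = new TreeNode<int>(1);
        TreeNode<int> left = new TreeNode<int>(2);
        left.Children.Add(new TreeNode<int>(4));
        root.Children.Add(left);
        root.Children.Add(new TreeNode<int>(3));

        Console.WriteLine(ForEachDemo.Sum(array));              // prints 6
        Console.WriteLine(ForEachDemo.Sum(linked));             // prints 9
        Console.WriteLine(ForEachDemo.Sum(legacy.Cast<int>())); // prints 13
        Console.WriteLine(ForEachDemo.Sum(root.DepthFirst()));  // prints 10
    }
}
```

The non-generic ArrayList goes through Cast<int>() because it only implements IEnumerable, not IEnumerable<T>.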

There’s an even more succinct way to accomplish the specific example above, using the Sum extension method on IEnumerable<T>.

TotalIncome = employees.Sum(e => (e is Manager) ? (e as Manager).Bonus + e.Income : e.Income);

I don’t mind writing or reading code like this, and as more functional programming constructs are merged into C#, I think it’s important to flex these mental muscles and become familiar with the syntax, but one might argue that this is a little more cryptic than the preceding example.  If the calculation were any more complicated, it would make sense to use a statement lambda with curly braces instead of the shorter expression lambda shown above.  Here’s one way it could be written as a statement lambda:

TotalIncome = employees.Sum(e =>
    {
        decimal result = e.Income;

        if (e is Manager)
            result += (e as Manager).Bonus;

        return result;
    });

You can see that there is more opportunity here to perform other actions and participate in more complex calculations.  This approach is even lighter-weight than the second approach suggested above using a separate named method and external state passed by reference.  The approach you take should depend on the needs and constraints of the situation.  Lighter-weight approaches are good for ad-hoc processing, whereas the heavier approaches make more sense if the visiting logic needs to be reused in other places.
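Another way to keep the type test out of the lambda, offered here only as an alternative sketch, is to let the OfType<T> operator from System.Linq do the filtering.  The SumDemo class and its TotalIncome method are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Employee
{
    public decimal Income;
    public Employee(decimal income) { Income = income; }
}

class Manager : Employee
{
    public decimal Bonus;
    public Manager(decimal income, decimal bonus) : base(income) { Bonus = bonus; }
}

// Hypothetical wrapper class, used here only to host the example.
static class SumDemo
{
    public static decimal TotalIncome(IEnumerable<Employee> employees)
    {
        // Every employee contributes Income; OfType<Manager>() narrows the
        // sequence to managers, whose Bonus is added on top.
        return employees.Sum(e => e.Income)
             + employees.OfType<Manager>().Sum(m => m.Bonus);
    }
}

class Program
{
    static void Main()
    {
        List<Employee> employees = new List<Employee>();
        employees.Add(new Employee(100000));
        employees.Add(new Employee(125000));
        employees.Add(new Manager(210000, 35000));

        Console.WriteLine(SumDemo.TotalIncome(employees)); // prints 470000
    }
}
```

The trade-off is that the collection is enumerated twice, which is harmless for an in-memory list but worth keeping in mind for sequences that are expensive to produce.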

If we don’t need to share state across visitations of objects in a collection, we could simply use extension methods, which is the simplest option of all.  After all, the original intent of the Visitor pattern was to allow us to add functionality to a class without modifying the original element’s type.  According to dofactory.com:

“Represent an operation to be performed on the elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates.”

Extension methods “add” functionality to existing classes, or at least create a compelling illusion that they do.  Just reference an assembly and import the right namespace to add the operations.  It is possible to share state, but only by doing something like passing in some shared state object to each call.
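A minimal sketch of both ideas, assuming the Employee and Manager classes from earlier: TotalCompensation adds a stateless operation, and AddTo shows state being shared by passing an object into each call.  The names EmployeeExtensions, TotalCompensation, AddTo, and PayrollSummary are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

class Employee
{
    public decimal Income;
    public Employee(decimal income) { Income = income; }
}

class Manager : Employee
{
    public decimal Bonus;
    public Manager(decimal income, decimal bonus) : base(income) { Bonus = bonus; }
}

// Hypothetical shared-state object passed to each call.
class PayrollSummary
{
    public int Headcount;
    public decimal Total;
}

static class EmployeeExtensions
{
    // A stateless operation "added" to Employee without touching its source.
    public static decimal TotalCompensation(this Employee employee)
    {
        Manager manager = employee as Manager;
        return manager != null
            ? manager.Income + manager.Bonus
            : employee.Income;
    }

    // Sharing state across visits means handing the state in explicitly.
    public static void AddTo(this Employee employee, PayrollSummary summary)
    {
        summary.Headcount++;
        summary.Total += employee.TotalCompensation();
    }
}

class Program
{
    static void Main()
    {
        List<Employee> employees = new List<Employee>();
        employees.Add(new Employee(100000));
        employees.Add(new Employee(125000));
        employees.Add(new Manager(210000, 35000));

        PayrollSummary summary = new PayrollSummary();
        foreach (Employee employee in employees)
            employee.AddTo(summary);

        Console.WriteLine(summary.Headcount); // prints 3
        Console.WriteLine(summary.Total);     // prints 470000
    }
}
```

The type test lives inside TotalCompensation because extension methods are bound at compile time.  A separate overload taking a Manager would not be selected for a Manager stored in an Employee-typed variable.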

If C# had the ability to define a variable that would be static to a defined closure in a method (which it doesn’t), we could use extension methods all the time without any drawbacks.  (I miss the static local variable feature of Turbo Pascal.)
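The closest approximation I know of is a local variable captured by a lambda, which survives for as long as the returned delegate does.  The sketch below is not a true static local, since each call to the factory creates a fresh variable, and the names RunningTotalDemo and MakeRunningTotal are hypothetical.

```csharp
using System;

class Employee
{
    public decimal Income;
    public Employee(decimal income) { Income = income; }
}

// Hypothetical wrapper class, used here only to host the example.
static class RunningTotalDemo
{
    public static Func<Employee, decimal> MakeRunningTotal()
    {
        // Captured by the lambda below, so it outlives this method call
        // and is private to the delegate that is returned.
        decimal total = 0;

        return e =>
        {
            total += e.Income;
            return total;
        };
    }
}

class Program
{
    static void Main()
    {
        Func<Employee, decimal> addIncome = RunningTotalDemo.MakeRunningTotal();

        Console.WriteLine(addIncome(new Employee(100000))); // prints 100000
        Console.WriteLine(addIncome(new Employee(125000))); // prints 225000

        // A second delegate gets its own, independent total.
        Func<Employee, decimal> other = RunningTotalDemo.MakeRunningTotal();
        Console.WriteLine(other(new Employee(50000)));      // prints 50000
    }
}
```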

Conclusion

With the use of lambda expressions and extension methods, we’ve been able to cut the amount of code for the Visitor pattern by more than half and found that there was no need to specially prepare the data model classes to support visitation.  While the classic Visitor pattern may have more potential in complex and custom Accept scenarios, in general the need to visit elements of a collection can be better accomplished with the judicious use of available language features in C# than by blindly following classic design patterns without consideration for how relevant they really are.

While I certainly encourage developers to become familiar with common patterns, I also encourage them to think carefully about the code they’re writing, and to ask themselves if it’s as clear and simple as it can be.  As software systems grow in size–sometimes becoming victims of their own success–small inefficiencies and muddied designs can snowball into unmanageability.  Apply a simple solution first, and then add complexity only when necessary.

Doubt and ask why constantly.  Be educated and familiar with the literature, but don’t dogmatically accept everything you read: think for yourself and hone your skills at every opportunity.  While the goals and forces in software tend to remain constant over time, the forms that made sense years ago may become unnecessary with today’s tools.


10 Responses to “The Visitor Design Pattern in C# 3.0”

  1. Ian said

    I was under the impression that it’s code that switches on type, e.g.


    if (employee is Manager)
        TotalIncome += (employee as Manager).Bonus;

    …that the visitor pattern is designed to avoid.

    • Dan Vanderboom said

      If type is relevant to the algorithm, as it is here, then it must be specified somewhere regardless of the pattern. In the case of the standard visitor pattern, you’d have to create a special visitor class to calculate things differently. But why create whole new types instead of just encoding this simply in small, conditional statements?

      • wcoenen said

        Because the compiler will assist you whenever you add a new visitable. The idea is that you have many different algorithms which “visit” a bunch of different domain objects and need to do different things depending on their type. If you add a new type of domain object, then you don’t want to manually find and inspect each of these algorithms to make sure they can handle the new type.

        You can add default behavior for unknown types, but that’s not always going to be correct. You can throw exceptions for unknown types, but you’ll only find out about these at run-time.

        With the visitor pattern, you simply add a new method to IVisitor interface to visit the new type, and the compiler will force you to update all visitor implementations so that they handle the new type correctly. Compiler errors are trivial to find (the compiler does it for you) and much easier to fix than run-time bugs.

  2. Ian said

    Yes I think you may have a point. There’s surely not that much difference between these two…


    class VisitorClass
    {
        public void Visit(Employee employee) { ... }
        public void Visit(Manager manager) { ... }
    }

    void VisitorMethod(Employee employee)
    {
        // Manager is tested first, since every Manager is also an Employee.
        if (employee is Manager) { ... }
        else if (employee is Employee) { ... }
    }

    If you add a new type, the class needs a new method and the method needs a new conditional block. The method, as you say of course, also avoids all the boiler plate code.

  3. Ian said

    Perhaps there’s a nice way of doing this where you could have a VisitorClass (to avoid the messy conditional) and then mark each method with an attribute that identified the type it was to be called with.

    Maybe a VisitorClass base class could initialize a look up table based on the methods and their attributes? This is really off the top of my head though.

  4. Christo said

    You can use a productivity tool to help reduce the effort to produce the boiler plate code needed for the visitor pattern. I’ve created a template which adds all the files I typically need.

    I find that I usually create and modify my models relatively infrequently, but add functionality more frequently. The time spent adding the boiler plate is offset by the advantage I have, when in the future, I need to modify the model structure and the feeling I get knowing that the compiler pointed out all the work I needed to do.

    I agree that it is difficult to distribute the models across assemblies, since the visitor interface is tightly coupled to the model elements. That is the biggest disadvantage of the visitor pattern in my opinion.

    I tend to place the burden of visiting an array or enumeration of IVisitorElement in the visitor interface.

    interface IEmployeeVisitor
    {
        void Dispatch(IEnumerable<IVisitorElement> elements);
        void Dispatch(params IVisitorElement[] elements);

        void Visit(Employee employee);
        void Visit(Manager manager);
    }

    I pair this with a helper base class from which I extend my visitors.

    abstract class EmployeeVisitorBase : IEmployeeVisitor
    {
        public void Dispatch(IEnumerable<IVisitorElement> elements)
        {
            foreach (var element in elements) { element.Accept(this); }
        }

        public void Dispatch(params IVisitorElement[] elements)
        {
            foreach (var element in elements) { element.Accept(this); }
        }

        // Concrete visitors supply the per-type logic.
        public abstract void Visit(Employee employee);
        public abstract void Visit(Manager manager);
    }

  5. accacc said

    This is a simpler way:

    class Program
    {
        public abstract class EmployeeBase
        {
            public decimal Income;
            public EmployeeBase(decimal income)
            {
                Income = income;
            }

            public abstract decimal Calculate();
        }

        public class Employee : EmployeeBase
        {
            public Employee(decimal income)
                : base(income)
            {
            }

            public override decimal Calculate()
            {
                return Income;
            }
        }

        class Manager : EmployeeBase
        {
            public decimal Bonus;
            public Manager(decimal income, decimal bonus)
                : base(income)
            {
                Bonus = bonus;
            }

            public override decimal Calculate()
            {
                return Bonus + Income;
            }
        }

        static void Main(string[] args)
        {
            List<EmployeeBase> employees = new List<EmployeeBase>();
            employees.Add(new Employee(100000));
            employees.Add(new Employee(125000));
            employees.Add(new Manager(210000, 35000));
            decimal total = 0;
            foreach (EmployeeBase item in employees)
            {
                total += item.Calculate();
            }

            Console.WriteLine(total);
            Console.ReadLine();
        }
    }

  6. Bjorn Coltof said

    This is really a bad example of a ‘solution’ to the visitor problem without getting the coupling between the interfaces. The presented solution at the end doesn’t need to differentiate between a Manager or an Employee as it can work with the base class interface. Visitor is a pattern you use when you CAN’T use the base interface to accomplish what you want.

    • Dan Vanderboom said

      You could, however, easily test the type and branch your logic accordingly. And I’m not saying Visitor isn’t ever useful anywhere (it’s very useful for expression and language parsing), just that it’s very heavily overused, adding complex dependencies when it’s often easy to solve the same problem without those dependencies.

  7. […] http://weblogs.asp.net/cazzu/articles/25488.aspx https://dvanderboom.wordpress.com/2008/04/09/the-visitor-pattern-in-c-30/ http://aviadezra.blogspot.com/2008/12/design-patterns-visitor-vs-strategy.html […]
