Critical Development

Language design, framework development, UI design, robotics and more.

Archive for the ‘Object Oriented Design’ Category

Language Design: Complexity, Extensibility, and Intention

Posted by Dan Vanderboom on June 14, 2010

Introduction

The object-oriented approach to software is great, and that greatness draws from the power of extensibility.  That we can create our own types, our own abstractions, has opened up worlds of possibilities.  System design is largely focused on this element of development: observing and repeating object-oriented patterns, analyzing their qualities, and adding to our mental toolbox the ones that serve us best.  We also focus on collecting libraries and controls because they encapsulate the patterns we need.

This article explores computer languages as a human-machine interface, the purpose and efficacy of languages, complexity of syntactic structure, and the connection between human and computer languages.  The Archetype project is an ongoing effort to incorporate these ideas into language design.  In the same way that some furniture is designed ergonomically, Archetype is an attempt to design a powerful programming language with an ergonomic focus; in other words, with the human element always in mind.

Programming Language as Human-Machine Interface

A programming language is the interface between the human mind and executable code.  The point isn’t to turn human programmers into pure mathematical or machine thinkers, but to leverage the talent people are born with for manipulating abstract symbols in language.  There is an elite class of computer language experts who have trained themselves to think in terms of purely functional approaches, low-level assembly instructions, or regular, monotonous expression structures (and this is necessary for researchers pushing themselves to understand ever more), but for the everyday developer, a more practical approach is required.

Archetype is a series of experiments to build the perfect bridge between the human mind and synthetic computation.  As such, it is based as much as possible on a small core of extensible syntax and maintains a uniformity of expression within each facet of syntax that the human mind can easily keep separate.  At the same time, it honors syntactic variety and is being designed to shift us closer to a balance where all of the elements, blocks, clauses and operation types in a language can be extended or modified equally.  These represent the two most important design tenets of Archetype: the intuitive, natural connection to the human mind, and the maximization of its expressive power.

These forces often seem at odds with each other—at first glance seemingly impossible to resolve—and yet experience has shown that the languages we use are limited in ways we’re often surprised by, indicating that processes such as analogical extension are at work in our minds but not fully leveraged by those languages.

Syntactic Complexity & Extensibility

Most of a programming language’s syntax is highly static, and just a few areas (such as types, members, and sometimes operators) can be extended.  Lisp is the most famous example of a highly extensible language, with support for macros that allow the developer to manipulate code as if it were data and to extend the language itself (encoding state machines directly in syntax, for example).  The highly regular, parenthesized syntax is very simple to parse and therefore to extend… so long as you don’t deviate from the parenthesized form.  Lisp thus gets away with powerful extensibility at the cost of artificially limiting its structural syntax.

In Lisp we write (+ 4 5) to add two numbers, or (foo 1 2) to call a function with two parameters.  Very uniform.  In C we write 4 + 5 because the infix operator is what we grew up seeing in school, and we vary the syntax for calling the function foo(1, 2) to provide visual cues to the viewer’s brain that the function is qualitatively something different from a basic math operation, and that its name is somehow different from its parameters.

Think about syntax features as visual manifestations of the abstract logical concepts that provide the foundation for all algorithmic expression.  A rich set of fundamental operations can be obscured by a monotony of syntax or confused by a poorly chosen syntactic style.  Archetype involves a lot of research in finding the best features across many existing languages, and exploring the limits, benefits, problems, and other details of each feature and syntactic representation of it.

Syntactic complexity provides greater flexibility, and wider channels with which to convey intent.  This is why people color code file folders and add graphic icons to public signage.  More cues enable faster recognition.  It’s possible to push complexity too far, of course, but we often underestimate what our minds are capable of when augmented by a system of external cues which is carefully designed and supported by good tools.

Imagine if your natural spoken language followed such simple and regular rules as Lisp: although everyone would learn to read and write easily, conversation would be monotonous.  Extend this to semantics, for example with a constructed spoken language like Lojban which is logically pure and provably unambiguous, and it becomes obvious that our human minds aren’t well suited to communicating this way.

Now consider a language like C with its 15 levels of operator precedence, which were designed to match programmers’ expectations (although the authors admitted to getting some of this “wrong”, which further proves the point).  This language has given rise to very popular derivatives (C++, C#, Java), all of which are easily learned despite their syntactic complexity.

Natural languages and old world cities have grown with civilization organically, creating winding roads and wonderful linguistic variation.  These complicated structures have been etched into our collective unconscious, stirring within us and giving rise to awareness, thought, and creativity.  Although computers are excellent at processing regular, predictable patterns, it’s the complex interplay of external forces and inner voices that we’re most comfortable with.

Risk, Challenge & Opportunity

There are always trade-offs.  By focusing almost all extensibility in one or two small parts of a language, semantic analysis and code improvement optimizations are easier to develop and faster to execute.  Making other syntactical constructs extensible, if one isn’t careful, can create complexity that quickly spirals out of control, resulting in unverifiable, unpredictable and unsafe logic.

The way this is being managed in Archetype so far isn’t to allow any piece of the syntax tree to be modified, but rather to design regions of syntax with extensibility points built in.  Outputting C# code as an intermediary (for now) places much of the burden of ensuring safety on the C# compiler.  It’s also possible to mitigate the cost of more computationally expensive semantic analysis and code generation by taking advantage of both multicore and cloud-based processing.  What helps keep things in check is that potential extensibility points are being considered in the context of specific code scenarios and desired outcomes, based on over 25 years of real-world experience, not a disconnected sense of language purity or design ideals.

Creating a language that caters to the irregular texture of thought, while supporting a system of extensions that are both useful and safe, is not a trivial undertaking, but at the same time holds the greatest potential.  The more that computers can accommodate people instead of forcing people to make the effort to cater to machines, the better.  At least to the extent that it enables us to specify our designs unambiguously, which is somewhat unnatural for the human mind and will always require some training.

Summary

So much of the code we write is driven by a set of rituals that, while they achieve their purpose, often beg to be abstracted further away.  Even when good object models exist, they often require intricate or tedious participation to apply (see INotifyPropertyChanged).  Having the ability to incorporate the most common and solid of those patterns into language syntax (or extensions which appear to modify the language) is the ultimate mechanism for abstraction, and goes furthest in minimizing development effort.  By obviating the need to write convoluted yet routine boilerplate code, Archetype aims to filter out the noise and bring one’s intent more clearly into focus.

Posted in Archetype Language, Composability, Design Patterns, Language Extensions, Language Innovation, Linguistics, Metaprogramming, Object Oriented Design, Software Architecture | 2 Comments »

Strongly-Typed, Dynamic Linq Order Operator

Posted by Dan Vanderboom on August 20, 2009

A Community Solution

I love social technologies like Stack Overflow, where people can collaborate loosely to share knowledge and help get things done.  Stack Overflow does on a large scale what developer blogs like mine have been doing on a smaller scale: creating a community around the sharing of ideas and methods.

Every once in a while, I get some great feedback that includes a fix for one of my bugs, a performance tweak I didn’t realize was possible, or an extension to some library I left unfinished.

This morning I ran into a question about my dynamic Linq sort, solved and answered by “Ch00k”, allowing one to get compile-time checking of identifier names.  Well done!

(It’s too bad Stack Overflow doesn’t promote the use of real names for professional developers to better market themselves with skill and reputation.)

My original article (with source code) is here.  All I added to the library was this:

public static IOrderedEnumerable<T> Order<T>(this IEnumerable<T> Source, 
    Expression<Func<T, object>> Selector, SortDirection SortDirection)
{
    // value-type members get boxed, which wraps the member access in a Convert node; unwrap it first
    Expression body = Selector.Body;
    if (body is UnaryExpression)
        body = ((UnaryExpression)body).Operand;

    return Order(Source, ((MemberExpression)body).Member.Name, SortDirection);
}

To test it, I used this code:

IEnumerable<Customer> Customers = new Customer[] { new Customer("Dan", "Vanderboom"), new Customer("Steve", "Vanderboom"), 
    new Customer("Tracey", "Vanderboom"), new Customer("Sarah", "Barkelew") };

Customers = Customers.Order(c => c.LastName, SortDirection.Ascending);
Customers = Customers.Order(c => c.FirstName, SortDirection.Descending);

foreach (var cust in Customers)
{
    Console.WriteLine(cust.FirstName + " " + cust.LastName);
}

Now I can refactor these data model classes with a tool and all my dynamic sorting queries will stay in sync!

Posted in Collaboration, Design Patterns, Dynamic Programming, Language Extensions, LINQ, Object Oriented Design, Open Source, Social Networking | Tagged: , , , | 3 Comments »

A KeyedList Implementation: Syntax Subtleties for an Intuitive API

Posted by Dan Vanderboom on November 28, 2008

[This article is a follow-up on the theme of data structures started in my article on a Non-Binary Tree Data Structure.] 

Clear API Design

When designing object models, there are often times when a Dictionary is the best choice for rapid lookup and access of items.  However, when attempting to make those object models as intuitive and simple as possible, sometimes the fact that a dictionary is being used is an implementation detail and needn’t be exposed externally.

Take this model, for example:

class Database
{
    public List<Table> Tables;

    public Database()
    {
        Tables = new List<Table>();
    }
}

class Table
{
    public string TableName;

    public Table(string TableName)
    {
        this.TableName = TableName;
    }
}

This is nice and simple.  I can loop through the Tables collection like this:

Database db = new Database();

foreach (var table in db.Tables)
{
    Console.WriteLine(table.TableName);
}

So it's too bad that the consumer end of this has to change when we use a dictionary.  By making a change to our Database type...

class Database
{
    public Dictionary<string, Table> Tables;

    public Database()
    {
        Tables = new Dictionary<string, Table>();
    }
}

... we now have to specify the "Values" collection to get an IEnumerable<T> of the correct type.

foreach (var table in db.Tables.Values)
{
    Console.WriteLine(table.TableName);
}

The consumers of our API may not care about iterating through key-value pairs, but now they have to remember to use this Values property or face the wrath of red squiggles and compiler errors when they forget.  Of course, there’s a pattern we could use to hide this dictionary inner goo from the outside.

class Database
{
    public IEnumerable<Table> Tables { get { return _Tables.Values; } }
    private Dictionary<string, Table> _Tables { get; set; }

    public Database()
    {
        _Tables = new Dictionary<string, Table>();
    }
}

Now from the outside, we can foreach over db.Tables, but inside we can use the Dictionary for fast access to elements by key.

The Need for a Dictionary-List Hybrid

This is an either-or approach: that is, it assumes that the API consumer is better off with an IEnumerable collection and won’t have any need for keyed access to data (or even adding data, in this case).  How can we have the best of both worlds, with the ability to write this kind of code?

Database db = new Database();

foreach (var table in db.Tables)
{
    Console.WriteLine(table.TableName);
}

Table t = db.Tables["Customers"];

This is a hybrid of a Dictionary and a List.  (Don’t confuse this with a HybridDictionary, which is purely a dictionary with runtime-adapting storage strategies.)  It provides an IEnumerable<T> enumerator (instead of IEnumerable<KeyValuePair<K, T>>), as well as an indexer for convenient lookup by key.

There’s another aspect of working with dictionaries that has always bugged me:

db.Tables.Add("Vendors", new Table("Vendors"));

This is repetitive, plus it says the same thing twice.  What if I misspell my key in one of these two places?  What I’d really like is to tell my collection which property of the Table class to use, and have it fill in the key for me.  How can I do that?  Well, I know I can select a property value concisely (in a compiler-checked and refactoring-friendly way) with a lambda expression.  So perhaps I can supply that expression in the collection’s constructor.  I decided to call my new collection KeyedList<K, T>, which inherits from Dictionary so I don’t have to do all the heavy lifting.  Here’s how construction looks:

Tables = new KeyedList<string, Table>(t => t.TableName);

Now I can add Table objects to my collection, and the collection will use my lambda expression to fill in the key for me.

Tables.Add(new Table("Vendors"));

How does this work, exactly?  Here's a first cut at our KeyedList class:

public class KeyedList<K, T> : Dictionary<K, T>, IEnumerable<T>
{
    private Func<T, K> LinkedValue;

    public KeyedList()
    {
        LinkedValue = null;
    }

    public KeyedList(Func<T, K> LinkedValue)
    {
        this.LinkedValue = LinkedValue;
    }

    public void Add(T item)
    {
        if (LinkedValue == null)
            throw new InvalidOperationException("Can't call KeyedList<K, T>.Add(T) " +
                "unless a LinkedValue function has been assigned");

        Add(LinkedValue(item), item);
    }

    public new IEnumerator<T> GetEnumerator()
    {
        foreach (var item in Values)
        {
            yield return item;
        }
        yield break;
    }
}

This is still pretty simple, but I can think of one thing that it’s missing (aside from a more complete IList<T> implementation).  With a collection class like this, with tightly-integrated knowledge about the relationship between the key property of an item and the key in the Dictionary, what happens when we change that key property in the item?  Suddenly it doesn’t match the dictionary key, and we have to remember to update this in an explicit separate step in our code whenever this happens.  It seems that this is a great opportunity to forget something and introduce a bug into our code.  How could our KeyedList class track and update this for us?

Unfortunately, there’s no perfect solution.  “Data binding” in .NET is weak in my opinion, and requires implementation of INotifyPropertyChanged in our classes to participate; and even then, we only get notification of the property name that changed (supplied as a string), and have no idea what the old value was unless we store that somewhere ourselves.  Automatically injecting all classes with data binding code isn’t practical, of course, even using AOP (since many BCL classes, for example, reside in signed assemblies).  Hopefully a future CLR will be able to perform some tricks, such as intelligently and dynamically modifying those classes in which other classes’ data binding code declares interest, so we can have effortless and universal data binding.

Now back to reality.  I want to mention that although my code typically works just as well in Compact Framework as it does in Full Framework, I’m going off the reservation here.  I’m going to be using expression trees, which are not supported in Compact Framework at all.

Expressions

The Expression class (in System.Linq.Expressions) is really neat.  With it, you can wrap a delegate type to create an expression tree, which you can explore and modify, and at some point even compile into a function which you can invoke.  The best part is that lambda expressions can be assigned to Expression types in the same way that they can be assigned to normal delegates.

Func<int> func = () => 5;
Expression<Func<int>> expr = () => 5;

The first line defines a function that returns an int, and a function is supplied as a lambda that returns the constant 5.  The second line defines an expression tree of a function that returns an int.  This extra level of indirection allows us to take a step back and look at the structure of the function itself in a precompiled state.  The structure is a tree, which can be arbitrarily complex.  You can think of this as a way of modeling the expression in a data structure.  While func can be executed immediately, expr requires that we compile it by calling the Compile method (which generates IL for the method and returns Func<int>).

int FuncResult1 = func.Invoke();
int FuncResult2 = func();

int ExprResult1 = expr.Compile().Invoke();
int ExprResult2 = expr.Compile()();

The first two lines are equivalent, as are the last two.  I just wanted to point out here, with the two ways of calling the functions, how they are in fact the same, even though the last line looks funky.

Synchronizing Item & Dictionary Keys

So why do we need expressions?  Because we need to know the name of the property we’ve supplied in our KeyedList constructor.  You can’t extract that information out of a function (supplied as a lambda expression or otherwise).  But expressions contain all the metadata we need.  Note that for this synchronization to work, it requires that the items in our collection implement INotifyPropertyChanged.

class Table : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    private string _TableName;
    public string TableName
    {
        get { return _TableName; }
        set
        {
            _TableName = value;

            if (PropertyChanged != null)
                PropertyChanged(this, new PropertyChangedEventArgs("TableName"));
        }
    }

    public Table(string TableName)
    {
        this.TableName = TableName;
    }
}

This is tedious work, and though there are some patterns and code snippets I use to ease the burden a little, it’s still a lot of work to go through to implement such a primitive ability as data binding.

In order to get at the expression metadata, we’ll have to update our constructor to ask for an expression:

public KeyedList(Expression<Func<T, K>> LinkedValueExpression)
{
    this.LinkedValueExpression = LinkedValueExpression;
}

We’ll also need to define a field to store this, and a property will help to compile it for us.

private Func<T, K> LinkedValue;

private Expression<Func<T, K>> _LinkedValueExpression;
public Expression<Func<T, K>> LinkedValueExpression
{
    get { return _LinkedValueExpression; }
    set
    {
        _LinkedValueExpression = value;
        LinkedValue = (value == null) ? null : _LinkedValueExpression.Compile();
    }
}

Now that the groundwork has been set, we can hook into the PropertyChanged event if it’s implemented, which we do by shadowing the Add method.

public new void Add(K key, T item)
{
    base.Add(key, item);

    if (typeof(INotifyPropertyChanged).IsAssignableFrom(typeof(T)))
        (this[key] as INotifyPropertyChanged).PropertyChanged += new PropertyChangedEventHandler(KeyList_PropertyChanged);
}

One caveat about this approach: our shadowing method Add will unfortunately not be called if accessed through a variable of the base class.  That is, if you assign a KeyedList object to a Dictionary object, and call Add from that Dictionary variable, the Dictionary.Add method will be called and not KeyedList.Add, so synchronization of keys will not work properly in that case.  It’s extremely rare that you’d do such a thing, but I want to point it out regardless.  As inheritor of a base class, I would prefer the derived class be in fuller control of these behaviors, but we work with what we have.  I’ll actually take advantage of this later on in a helper method.
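
To make the caveat concrete, here’s a quick sketch (the base-class variable is contrived purely to illustrate the method hiding):

var tables = new KeyedList<string, Table>(t => t.TableName);
Dictionary<string, Table> dict = tables;    // same object, viewed through the base class

tables.Add("Customers", new Table("Customers"));    // KeyedList.Add runs: PropertyChanged gets hooked
dict.Add("Vendors", new Table("Vendors"));          // Dictionary.Add runs: no hook, so no key synchronization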

Finally, the tricky part.  We need to examine our lambda’s expression tree and extract the property or field name from it.  We’ll compare that to the property name reported to us as changed.  The comparison is actually done between two MemberInfo variables, which is possible because reflection ensures that only one MemberInfo object will exist for each member.  The MemberExpression object, which inherits from Expression, possesses a Member property, and the other we get from typeof(T).GetMember.  Here’s what that looks like:

private void KeyList_PropertyChanged(object sender, PropertyChangedEventArgs e)
{
    var lambda = LinkedValueExpression as LambdaExpression;
    if (lambda == null)
        return;

    var expr = lambda.Body as MemberExpression;
    if (expr == null)
        return;

    MemberInfo[] members = typeof(T).GetMember(e.PropertyName, MemberTypes.Property | MemberTypes.Field, BindingFlags.Instance | BindingFlags.Public);
    if (members.Length == 0)
        throw new ApplicationException("Field or property " + e.PropertyName + " not found in type " + typeof(T).FullName);

    MemberInfo mi = members[0];
    if (mi == expr.Member)
    {
        // we don't know what the old key was, so we have to find the object in the dictionary
        // then remove it and re-add it
        foreach (var kvp in KeyValuePairs)
        {
            if ((typeof(T).IsValueType && kvp.Value.Equals(sender)) 
                || (!typeof(T).IsValueType && (kvp.Value as object) == (sender as object)))
            {
                T item = this[kvp.Key];
                Remove(kvp.Key);
                Add(item);
                return;
            }
        }
    }
}

This code makes an important assumption; namely, that a lambda expression will be used, which will contain a single field or property access.  It does not support composite or calculated keys, such as (t.SchemaName + “.” + t.TableName), though it’s possible.  I’m currently working on a method that recursively explores an Expression tree and checks for member access anywhere in the tree, to support scenarios like this.  For now, and for the purpose of this article, we’ll stick to the simple case of single member access.

I found that having access to the list of KeyValuePairs was actually useful in my code, and to keep the PropertyChanged handler concise, I added a new KeyValuePairs property to expose the base Dictionary’s enumerator, which you can find in the complete listing of the KeyedList class toward the end of this article.  I now have two iterators; and the way I’ve flipped it around, the default iterator of the base class has become a secondary, named iterator of KeyedList.

Here is a test program to demonstrate the functionality and flexibility of the KeyedList class.

class Program
{
    static void Main(string[] args)
    {
        // use a lambda expression to select the member of Table to use as the dictionary key
        var Tables = new KeyedList<string, Table>(t => t.TableName);
        
        // add a Table object to our KeyedList like an old-fashioned Dictionary
        Tables.Add("Customers", new Table("Customers"));

        // add a Table object to our KeyedList without explicitly specifying a key
        Tables.Add(new Table("Vendors"));

        // prove that a change in an item's key property updates the corresponding dictionary key
        Table Vendors = Tables["Vendors"];
        Vendors.TableName = "VENDORS";
        Console.WriteLine("Is 'Vendors' a valid dictionary key? " + Tables.ContainsKey("Vendors"));
        Console.WriteLine("Is 'VENDORS' a valid dictionary key? " + Tables.ContainsKey("VENDORS"));

        // prove that the IEnumerable<T> iterator works
        // note that we don't have to loop through Tables.Values
        foreach (var table in Tables)
        {
            Console.WriteLine(table.TableName);
        }

        Console.ReadKey();
    }
}

Complete Source for KeyedList

For your convenience, here is the complete listing of KeyedList.

public class KeyedList<K, T> : Dictionary<K, T>, IEnumerable<T>
{
    private Func<T, K> LinkedValue;

    private Expression<Func<T, K>> _LinkedValueExpression;
    public Expression<Func<T, K>> LinkedValueExpression
    {
        get { return _LinkedValueExpression; }
        set
        {
            _LinkedValueExpression = value;
            LinkedValue = (value == null) ? null : _LinkedValueExpression.Compile();
        }
    }

    public KeyedList()
    {
        LinkedValueExpression = null;
    }

    public KeyedList(Expression<Func<T, K>> LinkedValueExpression)
    {
        this.LinkedValueExpression = LinkedValueExpression;
    }

    public void Add(T item)
    {
        if (LinkedValue == null)
            throw new InvalidOperationException("Can't call KeyedList<K, T>.Add(T) " +
                "unless a LinkedValue function has been assigned");

        Add(LinkedValue(item), item);
    }

    public new void Add(K key, T item)
    {
        base.Add(key, item);

        if (typeof(INotifyPropertyChanged).IsAssignableFrom(typeof(T)))
            (this[key] as INotifyPropertyChanged).PropertyChanged += new PropertyChangedEventHandler(KeyList_PropertyChanged);
    }

    private void KeyList_PropertyChanged(object sender, PropertyChangedEventArgs e)
    {
        var lambda = LinkedValueExpression as LambdaExpression;
        if (lambda == null)
            return;

        var expr = lambda.Body as MemberExpression;
        if (expr == null)
            return;

        MemberInfo[] members = typeof(T).GetMember(e.PropertyName, 
            MemberTypes.Property | MemberTypes.Field, 
            BindingFlags.Instance | BindingFlags.Public);
        if (members.Length == 0)
            throw new ApplicationException("Field or property " + e.PropertyName + " not found in type " + typeof(T).FullName);

        MemberInfo mi = members[0];
        if (mi == expr.Member)
        {
            // we don't know what the old key was, so we have to find the object in the dictionary
            // then remove it and re-add it
            foreach (var kvp in KeyValuePairs)
            {
                if ((typeof(T).IsValueType && kvp.Value.Equals(sender)) 
                    || (!typeof(T).IsValueType && (kvp.Value as object) == (sender as object)))
                {
                    T item = this[kvp.Key];
                    Remove(kvp.Key);
                    Add(item);
                    return;
                }
            }
        }
    }

    public new IEnumerator<T> GetEnumerator()
    {
        foreach (var item in Values)
        {
            yield return item;
        }
        yield break;
    }

    public IEnumerable<KeyValuePair<K, T>> KeyValuePairs
    {
        get
        {
            // because GetEnumerator is shadowed (to provide the more intuitive IEnumerable<T>), 
            // and foreach'ing over "base" isn't allowed,
            // we use a Dictionary variable pointing to "this"
            // so we can use its IEnumerable<KeyValuePair<K, T>>
            Dictionary<K, T> dict = this;
            foreach (var kvp in dict)
            {
                yield return kvp;
            }
            yield break;
        }
    }
}

Conclusion

As with the non-binary Tree data structure I created in this article, I prefer to work with more intelligent object containers that establish tighter integration between the container itself and the elements it contains.  I believe this reduces mental friction, both for author and consumer of components (or the author and consumer roles when they are the same person), and allows a single generic data structure to be used where custom collection classes are normally defined.  Additionally, by using data structures that expose more flexible surface areas, we can often reap the benefits of having powerful lookup features without locking ourselves out of the simple List facades that serve so well in our APIs.

The Achilles Heel of this solution is a weak and un-guaranteed data binding infrastructure combined with lack of support for Expressions in Compact Framework.  In other words, items have to play along, and it’s not platform universal.

Clearly, more work needs to be done.  Event handlers need to be unhooked when items are removed, optimizations could be done to speed up key synchronization, and more complex expressions could be supported without too much trouble.  Ultimately what I’d like to see is a core collection class with the ability to add and access multiple indexes (such as with a database), instead of presuming that a Dictionary with its solitary key is all we can or should use.  The hashed key of a Dictionary seems like a great adornment to an existing collection class, rather than a hard-coded stand-alone structure.  But I think this is a good start toward addressing some of the fundamental shortcomings of these existing approaches, and hopefully demonstrating the value of more intelligent collection and container classes.

Other Implementations

I was at a loss for a while as to what to call this collection class; I considered DictionaryList (and ListDictionary), as well as HashedList, before arriving at KeyedList.  To my amazement, I found several other implementations of the same kind of data structure with the same name, so it must be a good name.  The implementations here and here are more complete than mine, but neither auto-assigns keys with a key-selection function nor updates dictionary keys using data binding, which ultimately is what I’m emphasizing here.  Hint: it wouldn’t be tough to combine what I have here with either of them.

Posted in Algorithms, Data Binding, Data Structures, Design Patterns, Object Oriented Design, Reflection | 5 Comments »

Functional Programming as Intensity of Expression

Posted by Dan Vanderboom on September 20, 2008

On my long drive home last night, I was thinking about the .NET Rocks episode with Ted Neward and Amanda Laucher on F# and functional programming.  Though they’re writing a book on F# together, it seems even they have a hard time clearly articulating what functional programming is all about, and where it’s all headed in terms of mainstream commercial use… aside from scientific and data transformation algorithms, that is (as with the canonical logging example when people explain AOP).

I think the basic error is in thinking of Functional as a style of programming.  Saying that so-called imperative languages are non-functional is ridiculous: not in the sense that they “don’t work”, but in the sense that they’re supposedly built on Objects “instead of” Functions.

This isn’t much different from the chicken-and-egg problem.  Though the chicken-and-egg conundrum has a simple (but unobvious) answer, it doesn’t really matter whether the root of program logic is a type or a function.  If I write a C# program with a Program class, the Main static function gets called.  Some action is the beginning of a program, so one might argue that functions should be the root-most logical construct.  However, you’d then have to deal with functions containing types as well as types containing functions, and as types can get very large (especially with deep inheritance relationships), you’d have to account for functions being huge, spanning multiple code files, and so on.  There’s also the issue of types being organizational containers for functions (and other members).  Just as we use namespaces to organize our types, so we use types to organize functions.  This doesn’t prevent us from starting execution with a function or thinking of the program’s purpose functionally; it just means that we organize it inside a logical container that we think of as a “thing”.

Does this limit us from thinking of business processes as functional units?  Ted Neward suggests that we’ve been trained to look for the objects in a system, and base our whole design process on that. But this isn’t our only option for how to think about design, even in our so-called imperative languages.  If we’re thinking about it wrong, we can and should change the process; we don’t need to blame our design deficiencies on the trivial fact of which programming construct is the root one.  In fact, there’s no reason we should use any one design principle to the exclusion of others.  Looking for the things in the system is and will remain a valuable approach for discovering and defining database schemas and object models.  The very fact that “functional languages” aren’t perceived as especially useful for stateful components isn’t a fault of a style of programming, but is rather a natural consequence of functions being an incomplete aspect of a general purpose programming language.  Functional is a subset of expressive capability.

Where “functional languages” have demonstrated real value is not in considering functions as root-level constructs (this may ultimately be a mistake), but rather in increasing the flexibility of a language to be much more expressive when defining functions.  Making functions first-class citizens that can be passed as parameters, returned as function values, and stitched together with metaprogramming techniques, is a huge step in the right direction.  The use of simple constructs such as operators to match patterns, reverse the evaluation of functions and the flow of values with piping, and perform complex set- and list-based operations, all increase the expressive intensity and density of the functions in a language.  This can only add to the richness of our existing object models.
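
To make this concrete, here’s a small example of functions as first-class values in C# (the Compose helper is an illustrative name, not a BCL method):

static Func<A, C> Compose<A, B, C>(Func<A, B> f, Func<B, C> g)
{
    // returns a brand new function that pipes the output of f into g
    return x => g(f(x));
}

Func<int, int> square = x => x * x;
Func<int, int> increment = x => x + 1;

// stitch two functions together into a new one
Func<int, int> squareThenIncrement = Compose(square, increment);
Console.WriteLine(squareThenIncrement(4));    // (4 * 4) + 1 = 17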

Sticking objects together in extensible and arbitrarily complex structures is routine for us, but now we’re seeing a trend toward the same kind of composability in functions.  Of course, even this isn’t new, per se; the environmental forces that demand this power just haven’t become significant enough to require that level of power in mainstream languages, because technology evolution (like evolution in general) tends to work by adapting solutions that are “good enough”.

It’s common to hear how F# is successfully incorporating “both functional and imperative” styles into one language, and this is important because what we need is not so much the transition to a functional style, as I’ve mentioned already, but a growth of greater functional expressiveness and power in existing, successful, object-oriented languages.

So let our best and favorite languages grow, and add greater expressive powers to them, not only for defining functions, but also in declaring data structures, compile-time constraints and guarantees, and anything else that will help to raise the level of abstraction and therefore the productivity with which we can naturally express and fulfill our business needs.

Ultimately, “functional programming” is not a revolutionary idea, but rather an evolutionary step forward.  Even though its impact is great, there’s no need to start from scratch, to throw out our old models.  Incompatibility between functional and imperative is an illusion perpetuated by an unclear understanding of their relationship and each aspect’s purpose.

Posted in Design Patterns, Functional Programming, Object Oriented Design, Problem Modeling, Software Architecture | 4 Comments »

Observations on the Evolution of Software Development

Posted by Dan Vanderboom on September 18, 2008

Neoteny in the Growth of Software Flexibility and Power

Neoteny is a biological phenomenon of an organism’s development observed across multiple generations of a species.  According to Wikipedia, neoteny is “the retention, by adults in a species, of traits previously seen only in juveniles”, and accounts for many evolutionary shifts, including the human brain’s ability to remain elastic and malleable later in life than those of our distant ancestors.

So how does this relate to software?  Software is a great deal like an organic species.  The species emerged (not long ago), incubated in a more or less fragile state for a number of decades, and continues to evolve today.  Each software application or system built is a new member of the species, and over the generations they have become more robust, intelligent, and useful.  We’ve even formed a symbiotic relationship with software.

Consider the fact that software running on computers was at one time compiled to machine language code for a specific processor.  With the invention of platform-independent instruction sets and their associated runtimes performing just-in-time compilation (Java’s JVM and .NET Framework’s CLR), we’ve delayed the actual production of machine language code until it’s actually needed on the target machine.  The compiler produces a slightly more abstract representation of the program logic, and an extra translation step at installation or runtime is needed to complete the process to make the software usable.

With the growing popularity of dynamic languages such as Lisp, Python, and the .NET Framework’s upcoming release of its Dynamic Language Runtime (DLR), we’re taking another step of neoteny.  Instead of a compiler generating instruction byte codes, a “compiler for any dynamic language implemented on top of the DLR has to generate DLR abstract trees, and hand it over to the DLR libraries” (per Wikipedia).  These abstract syntax trees (AST), normally an intermediate artifact created deep within the bowels of a traditional compiler (and eventually discarded), are now persisted as compiler output.

Traits previously seen only in juveniles… now retained by adults.  Not too much of a metaphorical stretch!  The question is: how far can we go?  And I think the answer depends on the ability of hardware to support the additional “just in time” processing that needs to occur, executing more of the compiler’s tail-end tasks within the execution runtime itself, providing programming languages with greater flexibility and power until the compilation stages we currently execute at design-time almost entirely disappear (to be replaced, perhaps, by new pre-processing tasks.)

I remember my Turbo Pascal compiler running on a 33 MHz processor with 1 MB of RAM, and now my cell phone runs at 620 MHz (with a graphics accelerator) and has gigabytes of memory and storage.  And yet with the state of things today, the inclusion of language-specific compilers within the runtime is still quite infeasible.  In the .NET Framework, there are too many potential languages that people might attempt to include in such a runtime: C#, F#, VB, Boo, IronPython, etc.  Trying to cram all of those compilers into a universal runtime that would fit (and perform well) on a cell phone or other mobile device isn’t yet feasible, which is why we have technologies like System.Reflection.Emit (on the full .NET Framework) and Mono.Cecil (which works on Compact Framework as well).  These work at the platform-independent CIL level, and so can interpret and generate programs generically, interact with each other’s components, and so on.  One metaprogramming mechanism can therefore be reused across all .NET languages, and this metalinguistic programming trend is being discussed on the C# and other language design teams.

I’ve just started using Mono.Cecil, chosen because it is cross-platform friendly (and open source).  The API isn’t very intuitive, but because the source is available, and because extension methods can go a long way to making it more accessible, it’s a great option.  The documentation is sparse, and assembly generation has some performance issues, but it’s a work-in-progress with tremendous potential.  If you’re doing any kind of static analysis or have any need to dynamically generate and consume types and assemblies (to get around language limitations, for example), I’d encourage you to check it out.  A comparison of Mono.Cecil to System.Reflection can be found here.  Another library called LinFu, which performs lots of mind-bending magic and actually uses Mono.Cecil, is also worth exploring.
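
To give a flavor of it, here’s a minimal static-analysis sketch that walks an assembly’s types and methods.  The assembly name is hypothetical, and the entry point shown (AssemblyDefinition.ReadAssembly) comes from later Mono.Cecil releases, so adjust for the version you’re using:

using System;
using Mono.Cecil;

class AssemblyWalker
{
    static void Main()
    {
        // load the assembly for inspection without loading it into the runtime
        AssemblyDefinition assembly = AssemblyDefinition.ReadAssembly("MyLibrary.dll");

        foreach (TypeDefinition type in assembly.MainModule.Types)
        {
            Console.WriteLine(type.FullName);

            foreach (MethodDefinition method in type.Methods)
                Console.WriteLine("    " + method.Name);
        }
    }
}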

VB10 will supposedly be moving to the DLR to become a truly dynamic language, which, considering VB’s history of support for late binding, makes a lot of sense.  With a dynamic language person on the C# 4.0 team (Jim Hugunin from IronPython), one wonders if C# won’t eventually go the same route, while keeping its strongly-typed feel and IDE feedback mechanisms.  You might laugh at the idea of C# supporting late binding (dynamic lookup), but this is being planned regardless of whether the language is considered static or dynamic.
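
As a sketch of what dynamic lookup means in practice (this is the dynamic keyword as it eventually shipped in C# 4.0; GetCalculator is a hypothetical factory that might return an IronPython or COM object):

dynamic calc = GetCalculator();    // static type checking is suspended for this variable
int sum = calc.Add(2, 3);          // member lookup is deferred until runtime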

As the DLR evolves, performance optimizations are being discovered and implemented that may close the gap between pre-compiled and dynamically interpreted languages.  Combine this with manageable concurrent execution, and the advantages we normally attribute to static languages may soon disappear altogether.

The Precipitous Growth of Software System Complexity

We’re truly on the cusp of a precipitous period of growth for software complexity, as an exploding array of devices and diverse platforms around the world connect in an ever-more immersive Internet.  Taking full advantage of parallel and distributed computing environments by solving the challenges of concurrency and coordination, as well as following the trend toward increased integration among software components, is pushing software complexity into new orders of magnitude.  The strategies we come up with for organizing these systems will have to take several key factors into consideration, and we will have to raise the level of abstraction to a point that may be hard for us to imagine with our existing tools and languages.

One aspect that’s clear is the rise of declarative or intention-based syntax, whether represented as XML, Domain-Specific Languages (DSLs), attribute decoration, or a suite of new visual modeling editors.  This is in part a consequence of raising the abstraction level, as lower-level libraries are entrusted to solve common problems and take advantage of common opportunities.

Another is the use of Inversion of Control (IoC) containers and dependency injection in component based architectures, thereby standardizing the lifecycle of the application and its components, and providing a common environment or ecosystem for all of its components, as well as introducing a common protocol for component location, creation, access, and disposal.  This level of consistency is valuable for sharing a common understanding of how to troubleshoot software components.  The more predictable a component’s interaction with the rest of the system, the easier it is to debug and modify; conversely, the more unique it and its communication system are, the more disparity there is among components, and the harder they are to understand and modify without introducing errors.  If software is a species and applications are individuals, then components are the cells of a system.

Even the introduction of functional programming languages into the mainstream over the past couple years is due, in part, to the ability of those languages to provide more declarative support, more syntactic flexibility, and new ways of dealing with concurrency and coordination issues (such as immutable values) and light-weight, ad hoc data structures (tuples).

Balancing the Forces of Coupling, Cohesion, and Modularity

On a fundamental level, the more that components are independent, the less coupled and the more modular and flexible they are.  But the more they can communicate with and are allowed to benefit from each other, the more interdependent they become.  This adds to cohesiveness and synergy, but also stronger coupling to a community of abstractions.

A composition of services has layers and segments of interdependence, and while there are dependencies, these should be dependencies on abstractions (interfaces and not implementations).  Since there will be at least one implementation of each service, and the extensibility exists to build others as needed, dependency is only a liability when the means for fulfilling it are not extensible.  Both sides of a contract need to be fulfilled regardless; service-oriented or component-based designs merely provide a mechanism for each side to implement and fulfill its part of the contract, and ideally the system also provides a discovery mechanism for the service provider to publish its availability for other components to discover and consume it.
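
A minimal sketch of what depending on abstractions rather than implementations looks like in C# (all type names here are hypothetical):

public interface IMessageChannel
{
    void Send(string message);
}

public class EmailChannel : IMessageChannel
{
    public void Send(string message) { /* SMTP details here */ }
}

public class OrderNotifier
{
    private readonly IMessageChannel _channel;

    // the dependency is declared against the interface, so any implementation fulfills it
    public OrderNotifier(IMessageChannel channel)
    {
        _channel = channel;
    }

    public void NotifyShipped(int orderId)
    {
        _channel.Send("Order " + orderId + " has shipped.");
    }
}

// the composition root (or an IoC container) supplies the implementation
var notifier = new OrderNotifier(new EmailChannel());

An IoC container automates that last wiring step, discovering and supplying the EmailChannel (or any substitute) wherever an IMessageChannel is required.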

If you think about software components as a hierarchy or tree of services, with services of one layer depending on more root services, it’s easy to see how this simplifies the perpetual task of adding new and revising existing functionality.  You’re essentially editing an outline, and you have opportunities to move services around, reorganize dependencies easily, and have many of the details of the software’s complexity absorbed into this easy-to-use outline structure (and its supporting infrastructure).  Systems of arbitrary complexity become feasible, and then relatively routine.  There’s a somewhat steep learning curve to get to this point, but once you’ve crossed it, your opportunities extend endlessly for no additional mental cost.  At least not in terms of how to compose your system out of individual parts.

Absorbing Complexity into Frameworks

The final thing I want to mention is that a rise in overall complexity doesn’t mean that the job of software developers necessarily has to become more difficult than it is currently.  With the proper design of components that abstract away the complexity into reusable frameworks with intuitive interfaces, developers at the business logic level don’t need to be aware of the inner complexity, in the same way that software developers are largely absolved of the responsibility of thinking about the processor’s inner workings.  As we build our technology stack higher and higher, like the famed Tower of Babel, we must make sure that it’s organized and structured in a way to support that upward growth and the load imposed upon it… so it doesn’t come crashing down.

The requirements for building components tomorrow will not be the same as they were yesterday.  As illustrated in this account of the effort involved in a feature change at Microsoft, in the future, we will also want to consider issues such as tool-assisted refactorability (and patterns that frustrate this, such as “magic strings”), and due to an explosion of component libraries, discoverability of types, members, and their use.

A processor can handle any complexity of instruction and data flow.  The trick is in organizing all of this in a way that other developers can understand and work with.

Posted in Compact Framework, Component Based Engineering, Concurrency, Design Patterns, Development Environment, Distributed Architecture, Functional Programming, Mobile Devices, Object Oriented Design, Problem Modeling, Reflection, Service Oriented Architecture, Software Architecture, Visual Studio | 1 Comment »

Introduction to Port-Based Asynchronous Messaging

Posted by Dan Vanderboom on April 21, 2008

…using terminology and examples from the Concurrency & Coordination Runtime in C#


Port-Based Messaging

Port-based messaging has been in practice for a long time.  In the realm of low-level electronics, components have always been operating in parallel, hardware interface ports are designed around standards, and messages are posted to those ports, queuing up somewhere until the receiver is ready to process them.  This pattern has worked extremely well in multiple domains for a long time, and its key characteristic is the decoupling of data flow from control flow.

In sequential programs, one statement runs after another, and the behavior is predictable and repeatable each time the program runs. Concurrency is difficult because you have to consider all possible interleavings of multiple simultaneous tasks, so the overall behavior is nondeterministic. Depending on the relative timings of concurrent tasks, you could get different results each time if you’re not careful to set the appropriate locks on shared resources.  Port-based messaging architectures isolate islands of state across different execution contexts, and connect them with safe, atomic messages delivered through pre-defined ports.

Figure 1: Basic port-based asynchronous messaging

The posting of a message to a port, as shown in Figure 1, is followed by some handler method receiving and processing that message.  What’s not evident in the diagram, however, is that while data flows into the port, the posting itself is a non-blocking call.  The sender continues on doing other work, taking the time only to queue up a message somewhere.

Queuing is important because, even with large thread pools, we can’t guarantee that a receiver will be listening at the very moment a message arrives.  Letting messages queue up means the receiver doesn’t have to block on a thread to wait.  Instead, the data waits and a thread checks messages on the port when it can.

Figure 2: Port showing message queue

What is a Port?

So what exactly is a port?  A port is a communication end point, but not in the sense of “a web service on a physical server”.  Think much more fine grained than that, even more fine-grained than methods.  With sequential programming, we commonly use try-catch blocks and handle both the exceptional and non-exceptional results of operations within a single method.  In port-based programming, you post a message to a port, which results in some handler method running on the receiving end, and different results can be sent back on different callback ports depending on the type of message.  Instead of calling a method that returns back to you when it ends, port-based programming is about always moving forward, and unwinding a call stack has very little meaning here.
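
Here’s a minimal sketch of that shape using the CCR types from the Microsoft.Ccr.Core namespace (Dispatcher and DispatcherQueue are explained in a later section):

var dispatcher = new Dispatcher(0, "demo threads");              // 0 = one thread per core
var taskQueue = new DispatcherQueue("demo queue", dispatcher);
var requests = new Port<string>();

// register a persistent receiver: the handler runs once for every message posted
Arbiter.Activate(taskQueue,
    Arbiter.Receive(true, requests, message => Console.WriteLine("Handling " + message)));

requests.Post("hello");    // non-blocking: the sender moves on immediately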

Figure 3: Sequence diagram of port-based messaging

We can see in the sequence diagram above (Figure 3) a collection of services that communicate with and depend on each other.  Starting from the top, the left-most service posts a message to port 1, and then goes on to do other work or surrenders its thread back to the dispatcher for other tasks that are waiting to run.  A registered method on port 1 runs, and the logic there needs another service to complete its task, so it posts a message on port 2, and also continues processing without waiting.  The path of solid blue arrow lines traces the message path for normal execution.  If anything goes wrong, an exception can be posted to a different callback port, shown with a red outline in the diagram.

This diagram shows one possible composition of services and data flow.  Port sets, which are simply a collection of related ports, are shown as callback receivers in pairs, but they can consist of any number of ports with any mixture of message types, depending on the needs of the system being coordinated.  In this example, if anything goes wrong in the handler methods at ports 2, 5, or 6, an exception message will be routed to port 6, where another handler method can compensate for or report on the error.  Also note that while during startup this system may have to process data at port 4 before the logic at ports 5, 7, and 8 can run… once it gets going, there could be activity operating at many ports concurrently (not just one port per service).
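
In CCR code, that success/failure pairing of callback ports is typically expressed as a PortSet; here’s a sketch (reusing the taskQueue from the example above):

var result = new PortSet<string, Exception>();    // a success port and an exception port, as one unit

// run whichever handler matches the first message to arrive
Arbiter.Activate(taskQueue, Arbiter.Choice(result,
    value => Console.WriteLine("Success: " + value),
    ex => Console.WriteLine("Failure: " + ex.Message)));

result.Post("done");    // posting an Exception instead would route to the failure handler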

Arbiters, Dispatchers, and DispatcherQueues

Now it’s time to peel away some of the layers of simplification presented so far.  (It may help to have a beer or glass of wine at this point.)

An arbiter is a rule (or set of rules) about when and how to process messages for a specific port (or set of ports).  (It is helpful to think of arbiter as a data flow or message flow coordinator.)  Should messages be pulled off the queue as soon as they arrive?  Should the software wait until 5 messages have arrived before processing them all as a group?  Should messages be checked according to a timer firing every 20 seconds?  Should logic be run only when two ports have messages waiting (a join)?  What logic should be run when one of these conditions occurs?  Can method handlers on three specific ports run concurrently until a message arrives on a fourth port, whose handler must run exclusively, and when done the other three can run again (interleave)?  These are just a few of the many coordination patterns that can be expressed with different types of arbiters (and hierarchically nested arbiters, which are ingenious).
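
As a sketch of how one of these patterns is declared, here’s a join expressed in CCR code (again assuming the taskQueue from earlier):

var names = new Port<string>();
var scores = new Port<int>();

// join: the handler runs only once a message is waiting on BOTH ports
Arbiter.Activate(taskQueue,
    Arbiter.JoinedReceive(false, names, scores,
        (name, score) => Console.WriteLine(name + " scored " + score)));

names.Post("Dan");
scores.Post(42);    // now both messages exist, so the joined handler fires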

Figure 4: Arbiters, Dispatchers, and DispatcherQueues

Figure 4 illustrates that an arbiter is associated with a port to monitor and a method to execute under the right conditions.  The logic of the arbiter, depending on its type and configuration, determines whether to handle the message.  It gets its thread from a thread dispatcher, which contains a thread pool.  (Not the same as System.Threading.ThreadPool, though, as there can only be one of those per process.)

The next diagram (Figure 5) could represent a join coordination.  An arbiter waits for messages on two ports, and depending on how it’s defined, it may process messages from one port repeatedly, but as soon as it receives a message on the second port (it may be an exception port, for example), the whole arbiter might tear itself down so that no more handling on those ports will occur.  As you are probably starting to see, composition and attachment of arbiters are key to controlling specific patterns of coordination in arbitrarily powerful and flexible ways.

Figure 5: Multiple DispatcherQueues and complex Arbiters

In the Concurrency & Coordination Runtime, we can attach and detach these arbiters during runtime; we don’t have to define them statically at compile time.  There has been some criticism of this approach because dynamic arbitration rules are much more difficult to verify formally with analysis, and are therefore difficult to optimize compilation and thread management for, but the advantages of this flexibility are enormous and the performance (see this paper by Chrysanthakopoulos and Singh) has been very impressive compared to conventional multithreading approaches.  Ultimately, it’s not about whether we can guarantee 100% that nothing will go wrong using only the mathematical models currently in our repertoire, but whether we can be productive with these techniques to release software that meets acceptable quality standards across a broad range of application domains.

I don’t think we’re going to find a better set of strategies to work with anytime soon, and when we’ve pushed this technology hard enough, the tactics will be fine tuned and we can start absorbing some of these coordination concepts into languages themselves (without sacrificing the dynamism that a library of composable parts provides).  People are going to attempt concurrent programming whether it’s safe or not, and using a library such as the CCR significantly reduces the risk of ad hoc multi-threading code.

Cooperative Multitasking

When mainstream operating systems like Windows took their first steps to support multi-tasking, cooperative versus preemptive multi-tasking was a common topic.  The idea of an operating system depending on applications to surrender control in a well-behaved way was generally and rightfully considered a bad idea.  Any kind of error or instability in software could easily bring down the entire operating system, and forcing a quickly growing community of software vendors to share nicely wasn’t a realistic option.  Being preemptive meant that the OS could forcefully stop an application from running after giving it a small, measured slice of time, and then switch the thread to a new context where another application could run for another time slice.  Regardless of how poorly applications ran, as benevolent dictator, the OS could ensure a fair scheduling of processor time.

The solution encapsulated in the Concurrency & Coordination Runtime is, on the other hand, a cooperative multi-tasking strategy.  However, because it operates within the local scope of an individual process, isolated from other processes in the same OS, its risk of destabilizing the system is nonexistent.  This deep level of cooperation, in fact, is what gives the CCR its great performance.  When used correctly, threads don’t sit around waiting on some resource or for long-running operations to complete; instead, control is freely surrendered back to the thread pool, where it is quickly assigned to a new task.  George Chrysanthakopoulos (in this video) and his colleagues have brilliantly put that correct use within our reach in the CCR library.

Finally, by surrendering threads freely instead of holding onto them, a continuous flow of threads through the different activities of the system is maintained, and there is therefore always an abundance of them to handle new messages waiting on ports.  Existing threads are not wasted.  As the Tao Te Ching says:

If you want to accord with the Tao,
just do your job, then let go.

Control & Data Flow: Sequential vs. Concurrent

In sequential programs, stacks are used to unwind method calls and provide return values (return messages), and threads follow the data flow; whereas in port-based programming, threads are managed by one or more thread dispatchers that maximize the use of each thread by keeping it in a pool and sharing it with many other (potentially unrelated) tasks.  Data flows orthogonally to control flow, according to its own coordination strategy.  This task-thread agnosticism (the opposite of thread affinity) is similar to the statelessness of a web server such as IIS: one or more threads from a large pool are injected into the tasks of processing, rendering, and serving up huge numbers of web pages, after which those threads are recycled back into the thread pool for execution of other tasks, making for a highly concurrent and scalable service platform.

So herein lies the trick: in order to break this coupling between data flow and control flow, a different means is needed to compose the two coordination strategies.  In C# and other popular imperative programming languages, methods implicitly pass thread control along with their data arguments (the message), and the use of the stack for method calls imposes constraints on control flow, so making the CCR work involves some interesting patterns.

That’s why port-based programming is hard to get your head around.  It’s such a large shift from common sequential logic and it requires some additional planning (and good visualizations).  It’s obviously important to have a good set of tools for expressing that coordination, a simple set of conceptual primitives that allows us to compose arbitrarily-complex coordination patterns.  These primitives, including Message, Port, PortSet, Dispatcher (thread pool), and others provide the building blocks that we can use to define these relationships.  Once we define what we want to happen with these building blocks, the CCR can make it all happen.

This level of coordination is a step beyond the strategies used by most concurrent applications and frameworks in software today, primarily because there hasn’t been a pressing need for it until recently: processors had been growing phenomenally in speed for many years.  Now, however, we’re told that processor speed has plateaued, and that we have to scale out rather than up, spreading the processing load across multiple cores.  We are very fortunate that the work being done by researchers in fields like robotics can be applied to other service-oriented architectures; it is a testament to the widespread use of the .NET Framework and the fantastic efforts of some very bright individuals.

Where to Find Microsoft Robotics Studio

Robotics Studio is a free download and can be found here, and while it is full of other good stuff, it’s worth checking out for the Concurrency and Coordination Runtime alone.

Posted in Compact Framework, Concurrency, Data Structures, Design Patterns, Microsoft Robotics Studio, Object Oriented Design, Service Oriented Architecture, Software Architecture | Leave a Comment »

Using Extension Methods to Manipulate Control Bitmaps in Compact Framework

Posted by Dan Vanderboom on April 11, 2008

I’m loving extension methods.  All of the methods that I wish BCL classes had, I can now add.  While I consider it highly unfortunate that we can’t yet add extension properties, events, or static members of any kind, still it’s a great amount of power in terms of making functionality discoverable in ways not possible before.

During the implementation of my Compact Framework application’s MVC framework, I wanted to support displaying views modally.  However, because I use a screen stack of UserControls that are all hosted in a single master Form object, I lose out on this built-in functionality, and so found myself needing to create this behavior myself.  One of the difficulties is displaying a view that may not cover every portion of the views beneath it; if the user clicks on one of the views “underneath”, that view gets activated, and if the press lands on a control, that control will handle the event (such as Button.Click).

My solution to the problem is simple.  Take a snapshot of the master form and everything on it, create a PictureBox control that covers the whole form, bring it to front, and set its image to the snapshot bitmap.  Doing this provides the illusion that the user is still looking at the same form full of controls, and yet if they touch any part of the screen, they’ll be touching a PictureBox that simply ignores them.  The application is then free to open a new view UserControl on top of that.  When the modal view is finally closed, the MVC infrastructure code tears down the PictureBox, and the real interface once again becomes available for interaction.
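Here is roughly what that sequence looks like in code, using the GetSnapshot and Darken extension methods developed below (the _masterForm field and the 40 percent figure are just for illustration):

// freeze the current UI behind a snapshot overlay
PictureBox overlay = new PictureBox();
overlay.Bounds = _masterForm.ClientRectangle;
overlay.Image = _masterForm.GetSnapshot().Darken(40);
_masterForm.Controls.Add(overlay);
overlay.BringToFront();

// ... display the modal view UserControl on top of the overlay ...

// when the modal view closes, tear the overlay down
_masterForm.Controls.Remove(overlay);
overlay.Dispose();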

Screenshots before and after screen capture and darkening

In addition, I wanted the ability to emphasize the modal view, so you can see from the picture above that I decided to accomplish this by de-emphasizing the background bitmap.  By darkening the snapshot, the pop-up modal view really does seem to pop out.  The only problem with bitmap manipulation using the Compact Framework library is that it’s extremely slow, but I get around this by using some unsafe code to manipulate the memory region where the bitmap image gets mapped.  (If you’re unfamiliar with the unsafe keyword, don’t worry: this code actually is safe to use.)

Here is the full source code for taking a snapshot of a form (or any control), as well as adjusting the brightness.

using System;
using System.Linq;
using System.Collections.Generic;
using System.Text;
using System.Drawing;
using System.Drawing.Imaging;
using System.Windows.Forms;
using System.Runtime.InteropServices;

public static class ControlBitmapExtensions
{
    [DllImport("coredll.dll")]
    private static extern bool BitBlt(IntPtr hdc, int nXDest, int nYDest, int nWidth, int nHeight,
        IntPtr hdcSrc, int nXSrc, int nYSrc, int dwRop);

    public struct PixelData
    {
        public byte Blue;
        public byte Green;
        public byte Red;
    }

    public static Bitmap GetSnapshot(this Control Control)
    {
        Rectangle rect = new Rectangle(0, 0, Control.Width, Control.Height - 1);
        Graphics g = Control.CreateGraphics();
        Bitmap Snapshot = new Bitmap(rect.Width, rect.Height);
        Graphics gSnapshot = Graphics.FromImage(Snapshot);

        // 0xCC0020 is the SRCCOPY raster operation
        IntPtr hdcDest = gSnapshot.GetHdc();
        IntPtr hdcSrc = g.GetHdc();
        BitBlt(hdcDest, 0, 0, rect.Width, rect.Height, hdcSrc, rect.Left, rect.Top, 0xCC0020);

        gSnapshot.ReleaseHdc(hdcDest);
        g.ReleaseHdc(hdcSrc);
        gSnapshot.Dispose();
        g.Dispose();

        return Snapshot;
    }

    public static unsafe Bitmap AdjustBrightness(this Bitmap Bitmap, decimal Percent)
    {
        Percent /= 100;
        Bitmap Snapshot = (Bitmap)Bitmap.Clone();
        Rectangle rect = new Rectangle(0, 0, Bitmap.Width, Bitmap.Height);

        BitmapData BitmapBase = Snapshot.LockBits(rect, ImageLockMode.ReadWrite, PixelFormat.Format24bppRgb);
        byte* BitmapBaseByte = (byte*)BitmapBase.Scan0.ToPointer();

        // the number of bytes in each row of a bitmap is allocated (internally) to be
        // equally divisible by 4, so use the stride to find the start of each row
        int RowByteWidth = BitmapBase.Stride;

        for (int y = 0; y < rect.Height; y++)
        {
            byte* Row = BitmapBaseByte + (y * RowByteWidth);

            // the every-third-pixel performance trick discussed below:
            // change x += 3 to x++ to process every pixel
            for (int x = 0; x < rect.Width; x += 3)
            {
                PixelData* p = (PixelData*)(Row + (x * 3));

                p->Red = (byte)Math.Round(Math.Min(p->Red * Percent, (decimal)255));
                p->Green = (byte)Math.Round(Math.Min(p->Green * Percent, (decimal)255));
                p->Blue = (byte)Math.Round(Math.Min(p->Blue * Percent, (decimal)255));
            }
        }

        Snapshot.UnlockBits(BitmapBase);
        return Snapshot;
    }

    public static Bitmap Brighten(this Bitmap Bitmap, decimal PercentChange)
    {
        return AdjustBrightness(Bitmap, 100 + PercentChange);
    }

    public static Bitmap Darken(this Bitmap Bitmap, decimal PercentChange)
    {
        return AdjustBrightness(Bitmap, 100 - PercentChange);
    }
}

 

Because Control is extended by GetSnapshot, and Bitmap is extended by AdjustBrightness, Brighten, and Darken, I can write very clear and simple code like this on the consuming side:

Bitmap bitmap = MyForm.GetSnapshot().Darken(40);

…and voila!  I have a snapshot.  Note that because Darken extends Bitmap, it can now be used with any Bitmap.  As we read this code from left to right, we’re observing a pipeline of transformations: MyForm is the source data, GetSnapshot is the first step, Darken is the next change, and with more extension methods on Bitmap we could continue to process the image in a way that is very natural to think about and construct.

I do have to admit that I cheated a little, though.  Even with the direct memory manipulation with pointers, it still didn’t perform very well on the Symbol and DAP devices I tested on.  So instead of adjusting the brightness of every pixel, I only adjust every third pixel.  They’re close enough together that you can’t really tell the difference; however, the closer to 100 percent you darken or brighten an image, the more apparent the trick will be, since two thirds of the pixels won’t be participating.  So it’s good for subtle effects, but I wouldn’t count on it for all scenarios.

This every-third-pixel dirty trick happens in the inner loop, where you see x += 3 (the step is measured in pixels, each of which is 3 bytes wide), so go ahead and experiment with it.  Change it to x++ to process every pixel, or make the step larger for even less work; just don’t make it too large, or the untouched columns will start to show up as stripes!

Posted in Algorithms, Compact Framework, Object Oriented Design, Problem Modeling, User Interface Design, Windows Forms, Windows Mobile | 5 Comments »

The Visitor Design Pattern in C# 3.0

Posted by Dan Vanderboom on April 9, 2008

I use many common design patterns on a regular basis–composite, MVC/MVP, adapter, strategy, factory, chain of responsibility, etc.–but I’ve never come across a situation where I felt Visitor in the classic definition (GoF) made sense.  I had read about it, but the necessity of defining interfaces for not only the Visitor classes (that’s not so bad) but also the elements being visited makes it seem overly complex and therefore tainted for me.  What if you don’t own the source code to the elements, and don’t want to inherit from the existing types (assuming they’re not sealed) just to implement an IVisitorElement interface?

I wanted a less intrusive way of visiting any set of objects, without making any special demands on or assumptions about their types, and I suspected that new features in C# 3.0 would provide a way to make it elegant and terse.  What’s needed in essence is to visit each object in a collection with a common function or object, and to perform some action, transform the object in some way, and/or calculate some end result (usually an aggregation).  Can we do that without having to implement special interfaces or disrupting the code model in place?

For the sake of completeness and to serve as a baseline for other implementations, I’ll show you what the classic Visitor pattern looks like.

UML for Visitor Design Pattern

Here is the code that corresponds to this diagram.

using System;
using System.Collections.Generic;

interface IEmployeeVisitor
{
    void Visit(Employee employee);
    void Visit(Manager manager);
}

interface IVisitorElement
{
    void Accept(IEmployeeVisitor Visitor);
}

class EmployeeCollection : List<Employee>
{
    public void Accept(IEmployeeVisitor Visitor)
    {
        foreach (Employee employee in this)
        {
            employee.Accept(Visitor);
        }
    }
}

class Employee : IVisitorElement
{
    public decimal Income;
    public Employee(decimal income) { Income = income; }

    public virtual void Accept(IEmployeeVisitor Visitor)
    {
        Visitor.Visit(this);
    }
}

class Manager : Employee, IVisitorElement
{
    public decimal Bonus;
    public Manager(decimal income, decimal bonus) : base(income) { Bonus = bonus; }

    public override void Accept(IEmployeeVisitor Visitor)
    {
        Visitor.Visit(this);
    }
}

class SumIncomeVisitor : IEmployeeVisitor
{
    public decimal TotalIncome = 0;
    public void Visit(Employee employee) { TotalIncome += employee.Income; }
    public void Visit(Manager manager) { TotalIncome += manager.Income + manager.Bonus; }
}

class Program
{
    static void Main()
    {
        EmployeeCollection employees = new EmployeeCollection();
        employees.Add(new Employee(100000));
        employees.Add(new Employee(125000));
        employees.Add(new Manager(210000, 35000));

        SumIncomeVisitor visitor = new SumIncomeVisitor();
        employees.Accept(visitor);
        decimal result = visitor.TotalIncome;

        Console.WriteLine(result);
        Console.ReadLine();
    }
}

 
The first major disadvantage is the amount of plumbing that must be in place, and the two-way dependencies created, between visitors and the objects to be visited.  Though specific types aren’t hard-coded, the conceptual two-way dependency implied by the interfaces’ knowledge of each other requires forethought and special accommodations on both sides from the beginning.  Management of dependencies is always important; how well we manage them determines how complex applications become as they grow.  So whenever possible I ensure that dependencies run in one direction.  This creates natural segmentation and layering, and ensures that components can be pulled apart from each other rather than congealing into something like a tangled ball of Christmas tree lights.

Instead of passing a collection of Employee objects to some calculating Visitor, we tell each Employee to accept a Visitor object, which then just turns around and calls back into the Visitor.  That by itself seems rather indirect and convoluted.  Visiting a single element isn’t very exciting; nothing very interesting happens until you have a whole bunch of things to work with.  So in order to visit a collection, a custom collection type is defined with an Accept method that in turn calls Accept on each Employee.  This custom collection is yet another type we’re required to write when otherwise a List<Employee> or something similar would suffice.  And what happens when your data structure is something other than a basic list?  What if you have a tree of objects you’d like to visit?  Would you then have to implement a tree data structure that is visitor friendly?  How many aggregation types do you want to reinvent with visitation specifically in mind?

The rest of it isn’t so bad.  The SumIncomeVisitor class contains both the processing logic and state for any calculations needed by that Visitor.  One of these is instantiated (another extra step), passed to the collection’s Accept method, and therefore executed against all employees in the collection.  After all objects are visited, the SumIncomeVisitor object contains the final result.  This all works, but seems pretty klunky.  Perhaps the pattern is more interesting if IVisitorElement classes provide more sophisticated Accept implementations.  I can’t think of any examples off-hand but I’ll be thinking about and looking for these.

The code above is just shy of 80 lines long.  Can we accomplish exactly the same goal with less code, more simply and clearly?

using System;
using System.Collections.Generic;

class Employee
{
    public decimal Income;
    public Employee(decimal income) { Income = income; }
}

class Manager : Employee
{
    public decimal Bonus;
    public Manager(decimal income, decimal bonus) : base(income) { Bonus = bonus; }
}

class Program
{
    static void Main()
    {
        List<Employee> employees = new List<Employee>();
        employees.Add(new Employee(100000));
        employees.Add(new Employee(125000));
        employees.Add(new Manager(210000, 35000));

        decimal TotalIncome = 0;
        employees.ForEach(e => SumEmployeeIncome(e, ref TotalIncome));

        Console.WriteLine(TotalIncome);
        Console.ReadLine();
    }

    static void SumEmployeeIncome(Employee employee, ref decimal TotalIncome)
    {
        TotalIncome += employee.Income;

        if (employee is Manager)
            TotalIncome += (employee as Manager).Bonus;
    }
}

 
In this code, you’ll notice a few simplifications:
  1. There are no IVisitorElement or IEmployeeVisitor interfaces.
  2. Employee and Manager types exist without any knowledge of or explicit support for being visited.
  3. No custom collection is required, so a basic List<Employee> is used.

In order to make this work, we need the same basic things that we needed before: visiting/processing logic, and a place to store state for that processing.  In the second approach, the state is stored in the TotalIncome variable within the Main method, where the calculation is being requested, and the processing logic is kept in another method of the same class.  I could have declared TotalIncome as a class variable, but I’d really like any “scratch pad” data used in a calculation to have as narrow a scope as possible.  In the classic Visitor pattern, the data is encapsulated with the processing logic.  By calling a method with a secondary ref parameter, I can declare TotalIncome within the Main method and avoid cluttering the class definition with data that’s only relevant to one method’s logic.  This is a lighter-weight, more in-line approach than defining separate types and having to instantiate a Visitor object (a Visitor Method rather than a Visitor Object).

The actual mechanism for visiting every object is the ForEach method.  The List<T> class includes a very useful ForEach method that allows you to pass in an Action<T> delegate to execute a method for each element.  ForEach can’t take a method with our second ref parameter; it can only accept an Action<T> delegate.  The lambda expression e => SumEmployeeIncome(e, ref TotalIncome) creates an anonymous method that does in fact match Action<T>.  The parameter e is of type Employee because the employees collection is List<Employee>, which means the Employee type is inferred for Action<Employee>.  The anonymous method represented by the lambda then calls SumEmployeeIncome, passing the Employee e object through as well as the TotalIncome state to be transformed on successive calls for each Employee.

Finally, SumEmployeeIncome acts as the Visitor.  Different logic can be performed for different types where inheritance is involved, as it is with this sample, by testing for types using the is operator.  This is in contrast to the dual Visit methods taking Employee and Manager types respectively.  Actually, the classic Visitor pattern could have used the same approach in this regard.
 
Where more complex state is needed for processing, a new Visitor-state type could be created to support the processing, and by using an object for this purpose, it wouldn’t be necessary to declare or pass the parameter by reference.  Another option would simply be to declare multiple ref parameters.
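For example, a small state class keeps the method signature clean when several values are being accumulated at once (IncomeStats and GatherIncomeStats are hypothetical names of my own):

class IncomeStats
{
    public decimal Total;
    public decimal Highest;
    public int Count;
}

static void GatherIncomeStats(Employee employee, IncomeStats stats)
{
    decimal income = employee.Income;

    if (employee is Manager)
        income += (employee as Manager).Bonus;

    stats.Total += income;
    stats.Highest = Math.Max(stats.Highest, income);
    stats.Count++;
}

// usage: because IncomeStats is a reference type, no ref keyword is needed
IncomeStats stats = new IncomeStats();
employees.ForEach(e => GatherIncomeStats(e, stats));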
 

The List.ForEach method is awfully nice, but what if you’re working with another data structure, such as an array, an ArrayList, a LinkedList<T>, or even a Tree<T>?  Defining a simple extension method can provide this tool regardless of what kind of collection you’re working with.

public static class EnumerableExtensions
{
    public static void ForEach<T>(this IEnumerable<T> collection, Action<T> action)
    {
        if (action == null)
            return;

        foreach (T item in collection)
        {
            action(item);
        }
    }
}

That’s better.  Now if only that extension method had been defined in the first place, the specific one in the List<T> class wouldn’t be necessary.

There’s an even more succinct way to accomplish the specific example above, using the Sum extension method on IEnumerable<T>.

TotalIncome = employees.Sum(e => (e is Manager) ? (e as Manager).Bonus + e.Income : e.Income);

I don’t mind writing or reading code like this, and as more functional programming constructs are merged into C#, I think it’s important to flex these mental muscles and become familiar with the syntax, but one might argue that this is a little more cryptic than the preceding example.  If the calculation was any more complicated, it would make sense to use a statement lambda with curly braces instead of the shorter expression lambda shown above.  Here’s one way it could be written as a statement lambda:

TotalIncome = employees.Sum(e =>
    {
        decimal result = e.Income;

        if (e is Manager)
            result += (e as Manager).Bonus;

        return result;
    });

You can see that there is more opportunity here to perform other actions and participate in more complex calculations.  This approach is even lighter-weight than the second approach suggested above using a separate named method and external state passed by reference.  The approach you take should depend on the needs and constraints of the situation.  Lighter-weight approaches are good for ad-hoc processing, whereas the heavier approaches make more sense if the visiting logic needs to be reused in other places.

If we don’t need to share state across visitations of objects in a collection, we could simply use extension methods, which is the simplest option of all.  After all, the original intent of the Visitor pattern was to allow us to add functionality to a class without modifying the original element’s type.  According to dofactory.com:

Represent an operation to be performed on the elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates.

Extension methods “add” functionality to existing classes, or at least create a compelling illusion that they do.  Just reference an assembly and import the right namespace to add the operations.  It is possible to share state, but only by doing something like passing in some shared state object to each call.
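Here is a sketch of what that looks like (the TotalCompensation name is mine, not part of the earlier examples):

public static class EmployeeExtensions
{
    // stateless "visit" logic attached to Employee from the outside
    public static decimal TotalCompensation(this Employee employee)
    {
        decimal income = employee.Income;

        if (employee is Manager)
            income += (employee as Manager).Bonus;

        return income;
    }
}

// usage:
// decimal TotalIncome = employees.Sum(e => e.TotalCompensation());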

If C# had the ability to define a variable that is static to a single method (which it doesn’t), we could use extension methods all the time without any drawbacks.  (I miss the static local variable feature of Turbo Pascal.)
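Closures do get us part of the way there at the call site, though: a lambda can capture and mutate a local variable, and that variable persists across invocations of the delegate, with no ref parameter required.  A quick sketch:

decimal TotalIncome = 0;

// the lambda captures TotalIncome from the enclosing method,
// so each invocation accumulates into the same variable
employees.ForEach(e =>
    {
        TotalIncome += e.Income;

        if (e is Manager)
            TotalIncome += (e as Manager).Bonus;
    });

Console.WriteLine(TotalIncome);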

Conclusion

With the use of lambda expressions and extension methods, we’ve been able to cut the amount of code for the Visitor pattern by more than half and found that there was no need to specially prepare the data model classes to support visitation.  While the classic Visitor pattern may have more potential in complex and custom Accept scenarios, in general the need to visit elements of a collection can be better accomplished with the judicious use of available language features in C# than by blindly following classic design patterns without consideration for how relevant they really are.

While I certainly encourage developers to become familiar with common patterns, I also encourage them to think carefully about the code they’re writing, and to ask themselves if it’s as clear and simple as it can be.  As software systems grow in size–sometimes becoming victims of their own success–small inefficiencies and muddied designs can snowball into unmanageability.  Apply a simple solution first, and then add complexity only when necessary.

Doubt and ask why constantly.  Be educated and familiar with the literature, but don’t dogmatically accept everything you read: think for yourself and hone your skills at every opportunity.  While the goals and forces in software tend to remain constant over time, the forms that made sense years ago may become unnecessary with today’s tools.

Posted in Algorithms, Data Structures, Design Patterns, Object Oriented Design, Problem Modeling, Software Architecture | 10 Comments »

Tree Data Structure Source Code Posted

Posted by Dan Vanderboom on April 3, 2008

[Updated 8/14/2014] The source code for this library can be found here at GitHub. Also check out my blog post announcing it.

For those of you who have been waiting for the source code to the Tree<T> article I wrote, I’m happy to announce that it’s finally available.  I had originally intended to supply some non-trivial examples of its use, but with my super busy schedule at work and otherwise, I’ve had to reduce the scope just to get it out there.  One of my examples is large enough to warrant one or more large articles, so I also didn’t want to just toss them in there without sufficient explanation as to how it all works.

While I work on getting those ready, check out Damon Payne’s article on using a non-binary tree for modeling dependencies among concurrently-running tasks with the Task Parallel Library.  This is a great use of the tree data structure.  It would be interesting to see how that would work for tasks with cross-branch dependencies.  Does the tree become a graph?  Would iteration become a garbage-collection-like traversal?  How would it need to respond to insertion of tasks, or modification of dependencies, during execution?

My non-binary tree article, which has been updated with a short section just before the conclusion, can be found here.  At the very top of the article is a link to the source code in the form of a Visual Studio 2008 solution.  For those of you with VS2005, you should be able to easily extract the code and create a VS2005 project out of it.  I specifically targeted .NET 2.0 instead of 3.5, and originally tested it against Compact Framework.

I’ll also be doing further development of this library, so if you have any cool feature requests, let me know.

Posted in Algorithms, Concurrency, Data Structures, My Software, Object Oriented Design, Problem Modeling, Software Architecture | 6 Comments »

Tree<T>: Implementing a Non-Binary Tree in C#

Posted by Dan Vanderboom on March 15, 2008

[Updated 8/14/2014] The source code for this library can be found here at GitHub. Also check out my blog post announcing it.

[This is the first article in a series of intelligent data structures, which is continued here with KeyedList.]

I’ve always thought it was odd that the .NET Framework never shipped with a Tree or Tree class in its collection namespaces.  Most of the other classic data structures are there: List, Dictionary, Stack, Queue, and so on.  Where then is Tree?  I have no idea, but finding myself in need of one, I decided to build one, and in doing so realized that it was a little trickier than I first imagined it would be.  Certainly it isn’t difficult to get some kind of tree working, even without support for generics, but building one that is truly intuitive to use was a real challenge.  In this article I will share what I came up with, and will illustrate the thought process behind each of the design decisions that I made as a study in object-oriented design and the use of some cool C# language features like generics constraints.

First, what is a tree data structure and what is it used for?  A tree is, in layman’s terms, a collection of nodes (items) that are connected to each other in such a way that no cycles (circular relationships) are allowed.  Examples abound in organization charts, family trees, biological taxonomies, and so forth.  Trees are incredibly useful structures because they allow us to represent hierarchies and to compose complex structures out of basic parts and simple relationships.  In the world of programming specifically, this provides a mechanism for extensibility across many domains, and a container to assist us in searching, sorting, compressing, and otherwise processing data in efficient and sophisticated ways.  Trees are also a natural fit for use in the composite design pattern, where a group of objects can be treated (or processed) like a single object.

Just as it’s possible to implement recursive algorithms non-recursively, it’s also possible to create non-hierarchical data structures to store data that would more logically be stored in a tree.  In these cases, your code is typically much more clear when organized in a tree, leading to greater maintainability and refactorability.  Recursion itself, while not exceptionally common in every day programming tasks, plays a more important role in some interesting ways with trees and traversal patterns.  Traversal is the term used to describe how nodes in a tree are visited.  Later I’ll demonstrate some tree traversal techniques that you can use.

Binary vs. Non-Binary Trees

What are binary trees, and why the focus on non-binary trees?  Binary trees are a special kind of tree in which each node can only have two children, often referred to as left and right child nodes.  These special trees are very useful for sophisticated searching, sorting, and other algorithms.  However, trees that allow any number of children seem to abound in more general, every-day programming scenarios, as you’ll see from examples below.  In a follow-up article, I’ll demonstrate how you can create a BinaryTree class that inherits from Tree and applies the necessary restrictions and some useful abstractions.  Below (figure 1) is a binary tree, which I present here to contrast with the non-binary trees that we’ll be covering in detail.


Figure 1. A generic binary tree.

For more information on implementing and using binary trees, see this article which is part of a larger series on data structures.

Examples of Non-Binary Trees

A simple example of a non-binary tree is the file system in any modern operating system.  The diagram below (figure 2) illustrates a small part of the structure of the Windows file system.  Folders can be nested inside other folders as deeply as needed, providing the ability to compose a complex organizational structure for storing files.  The traversal path—from the root to some other node—can be represented by a canonical resource descriptor such as C:\Windows\system32.


Figure 2. The file system is a tree structure of folders containing other folders.

This hierarchical structure can be represented and visualized in different ways, though the underlying relationships are the same.  The screenshot below (figure 3) is of Windows Explorer on my development computer.  The control is the common TreeView, which supplies a way in this case for users to explore and interact with the tree data structure of the file system.


Figure 3. A tree view control in Windows Explorer provides access to a hierarchy of nested folders.

Another example is the organization of controls in a form (for Windows Forms) or in a window (for WPF).  The next diagram (figure 4) depicts the relationship of controls containing child controls, which may in turn contain their own children, and so on.


Figure 4. A user interface is composed of a tree of controls that contain other controls.

This is also manifested through a user interface with the Document Outline window in Visual Studio, which is very useful for selecting deeply nested controls, or container controls that are otherwise difficult to select in the forms designer itself.  This is shown in figure 5, where you can clearly see the different levels of all controls.


Figure 5. The document outline window in Visual Studio for a Windows Forms screen.

Defining the Basic Components

There is a lot of formal terminology in graph theory to contend with, but for our purposes, we only really need to be concerned with a few basic terms.

  • Tree – This refers to a collection of nodes connected by parent-child relationships in a hierarchical structure.
  • Parent – A node one level up from the current node.
  • Child – A node one level down from the current node.
  • Node – One item in a tree with an optional parent and zero or more children.
  • Root Node – A special node with no parent.  A tree can have only one root node.
  • Subtree – A section of a larger tree, consisting of some node (acting as the subtree’s root) plus all of that node’s descendants.

That’s not so bad, is it?  In designing a reusable tree data structure, it’s important to establish a consistent and sensible pattern for semantics.  In fact, coming up with good names for the components of our data structure may be the most difficult part.  With poor identifiers, even simple structures can be confusing to those using it.

Begin with the End in Mind

I began with a quasi-test-driven development approach.  I asked myself how I’d like the end result to look when consuming the new data structure.  I knew from the start that a tree data structure that didn’t use generics wasn’t going to be very useful.  Imagine this code:

Tree ObjectTree;

What is the type of each node in the tree?  If I’m going to use this in different scenarios, the only answer is to make it an object, so that I could store whatever I liked in my tree, but this would require a lot of casting and a lack of type checking at compile time.  So instead, using generics, I could define something like this:

Tree<string> StringTree;

This is much better, and leads to two more steps that we’ll need to take conceptually.  The first one is that I will definitely want a tree of nodes for custom types in my software’s problem domain, perhaps Customer or MobileDevice objects.  Like strings, these objects are (for our purposes here) simply dumb containers of data in the sense that they are unaware of the tree structure in which they reside.  If we take this one level further, and consider custom types that are aware of their place within the tree (and can therefore participate in much richer ways to compose hierarchical algorithms), we’ll need to consider how to make that awareness happen.  I will explain this in more detail later in this article.

public class Tree<T> : TreeNode<T>
{
    public Tree() { }

    public Tree(T RootValue)
    {
        Value = RootValue;
    }
}

This definition of Tree is really a matter of semantics and syntax preference.  I’m creating an alias for the TreeNode type, claiming that a Tree is a node itself—the root node in a Tree, by convention.  I call this a synonym type.  Here’s a very simple example of its use:

Tree<string> zero = new Tree<string>();
zero.Value = "zero";
int d0 = zero.Depth;

TreeNode<string> one = zero.Children.Add("one");
int d1 = one.Depth;

TreeNode<string> twoa = one.Children.Add("two-a");
TreeNode<string> twob = one.Children.Add("two-b");
TreeNode<string> twoc = one.Children.Add("two-c");

string twocstr = twoc.Value;
int d2 = twoc.Depth;

You can tell a few things by looking at this code:

  • The root node is defined as a Tree, but is manipulated like the other TreeNodes because it inherits from TreeNode.
  • Each node’s value is stored in a property called Value, which has a type of T (using generics).
  • A Depth property indicates how deeply in the tree the node is nested (the root has a depth of 0).
  • The Add method in the Children TreeNode collection returns a TreeNode object, making it easier to create a new node and get a handle to it in the same statement.

Connecting Nodes to Their Parents & Children

The intelligence needed to make a Tree work resides in two classes: TreeNode and TreeNodeList.  I could have used a standard collection type for TreeNodeList, but each TreeNode links to both Parent and Children nodes, which are two fundamentally different relationships; and I wanted parent and child nodes to connect and disconnect automatically behind-the-scenes—if you add a child to X, that child’s Parent property should automatically be set to X, and vice versa.  That requires hooking into the Add method in a custom collection class to manage those relationships.  TreeNodeList therefore looks like this:

public class TreeNodeList<T> : List<TreeNode<T>>
{
    public TreeNode<T> Parent;

    public TreeNodeList(TreeNode<T> Parent)
    {
        this.Parent = Parent;
    }

    public new TreeNode<T> Add(TreeNode<T> Node)
    {
        base.Add(Node);
        Node.Parent = Parent;
        return Node;
    }

    public TreeNode<T> Add(T Value)
    {
        return Add(new TreeNode<T>(Value));
    }

    public override string ToString()
    {
        return "Count=" + Count.ToString();
    }
}

The ToString override is very important for making your debugging sessions more manageable.  Without this kind of assistance, especially when troubleshooting recursive, hierarchical algorithms, you may go crazy digging through the debugger’s own tree view control to find what you’re looking for.

The Tree Node

The TreeNode itself gets a little tricky.  If you update the Parent property because a node is going to be moved to another part of the tree, for example, you want to make sure that it gets removed from the Parent’s list of Children, and you also want to add it as a child to the new Parent.  If the Parent was null, or is being set to null, then only one of those operations (remove or add) is necessary.

Here are the primary structural and payload-containing portions of this early version of the TreeNode class:

public class TreeNode<T> : IDisposable
{
    private TreeNode<T> _Parent;
    public TreeNode<T> Parent
    {
        get { return _Parent; }
        set
        {
            if (value == _Parent)
            {
                return;
            }

            if (_Parent != null)
            {
                _Parent.Children.Remove(this);
            }

            if (value != null && !value.Children.Contains(this))
            {
                value.Children.Add(this);
            }

            _Parent = value;
        }
    }

    public TreeNode<T> Root
    {
        get
        {
            //return (Parent == null) ? this : Parent.Root;

            TreeNode<T> node = this;
            while (node.Parent != null)
            {
                node = node.Parent;
            }
            return node;
        }
    }

    private TreeNodeList<T> _Children;
    public TreeNodeList<T> Children
    {
        get { return _Children; }
        private set { _Children = value; }
    }

    private T _Value;
    public T Value
    {
        get { return _Value; }
        set
        {
            _Value = value;

            if (_Value != null && _Value is ITreeNodeAware<T>)
            {
                (_Value as ITreeNodeAware<T>).Node = this;
            }
        }
    }
}

There are two ways we could find the Root node.  The commented line shows a succinct way to walk the tree (recursively) toward successive parents until Parent is null.  The actual implementation shows another way, using a simple while loop.  I prefer this because when debugging, it’s easy to step through a loop, and a little more difficult to jump through the same property on multiple, perhaps many, different instances of TreeNode.  I follow the same pattern for the Depth property (below).

A tree structure isn’t very useful, however, unless it can carry some kind of payload.  You want to build a tree of something, and it’s handy that we can use generics to tell the compiler that we want a Tree of strings (Tree), for example.  That’s what the Value property is for, and why its type is the generic type parameter T.

To instantiate and initialize TreeNodes, you’ll need some constructors.  Here are two of them I defined:

public TreeNode(T Value)
{
    this.Value = Value;
    Parent = null;
    Children = new TreeNodeList<T>(this);
}

public TreeNode(T Value, TreeNode<T> Parent)
{
    this.Value = Value;
    this.Parent = Parent;
    Children = new TreeNodeList<T>(this);
}

The Tree Node Payload’s Awareness of its Place in the Tree

You probably noticed in the last section that the Value object is checked to see if it implements the ITreeNodeAware interface.  This is an optional extensibility mechanism for custom classes that need to be aware of the tree so that payload objects can read or manipulate it in some way.  In developing a data binding framework for Windows Forms that allows you to bind control properties to paths (“PurchaseOrder.Customer.Name”) instead of specific objects (PurchaseOrder.Customer, “Name”), as ASP.NET and WPF data binding works, I needed this ability and came to the conclusion that this would be a useful feature in general.  Later in the article, I will magically transform the TreeNode and TreeNodeList classes in such a way that both this interface and the Value property become unnecessary.

Until then, here’s how the interface looks with an example class that uses it.

public interface ITreeNodeAware<T>
{
    TreeNode<T> Node { get; set; }
}

public class Task : ITreeNodeAware<Task>
{
    public bool Complete = false;

    private TreeNode<Task> _Node;
    public TreeNode<Task> Node
    {
        get { return _Node; }
        set
        {
            _Node = value;

            // do something when the Node changes
            // if non-null, maybe we can do some setup
        }
    }

    // recursive
    public void MarkComplete()
    {
        // mark all children, and their children, etc., complete
        foreach (TreeNode<Task> ChildTreeNode in Node.Children)
        {
            ChildTreeNode.Value.MarkComplete();
        }

        // now that all descendants are complete, mark this task complete
        Complete = true;
    }
}

Using the ITreeNodeAware interface means we have another step to take in our implementation, and it adds some complexity to its use in terms of discoverability and implementation of the interface by consumers of the Tree structure in creating custom payload classes.  By doing this, however, our Task objects will get injected with a Node property value when added to a Tree of Tasks.  So the payload object will point to the node via the Node property, and the node will point to the payload object via its Value property.  This is a lot of logic for such a simple relationship, but as we’ll see later, there is an elegant way around all of this.

Including Structure Helper Members

There are some common measurements of aspects of the tree’s nodes, as well as operations that you will typically want to perform on a tree or subtree, such as initialization, systematic traversal, pruning and grafting, disposal, and determination of depth, some of which I will discuss here.
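As a taste of systematic traversal, here’s a minimal depth-first walk that could be added to TreeNode (this helper is my own sketch, not part of the original class):

// visit this node, then each subtree, in depth-first (top-down) order
public void Traverse(Action<TreeNode<T>> visitor)
{
    visitor(this);

    foreach (TreeNode<T> child in Children)
    {
        child.Traverse(visitor);
    }
}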

Here is a property to determine your current nesting depth, which can be useful while debugging:

public int Depth
{
    get
    {
        //return (Parent == null ? -1 : Parent.Depth) + 1;

        int depth = 0;
        TreeNode<T> node = this;
        while (node.Parent != null)
        {
            node = node.Parent;
            depth++;
        }
        return depth;
    }
}

Because the payload objects (referenced by the Value property) may require disposing, the tree nodes (and therefore the tree as a whole) are IDisposable.  Different trees of objects may need to be disposed in different orders, so I’ve created a TreeTraversalType enum, a DisposeTraversal property of that type to specify the order, and an implementation of the Dispose method that takes it into consideration.

public enum TreeTraversalType
{
    TopDown,
    BottomUp
}

private TreeTraversalType _DisposeTraversal = TreeTraversalType.BottomUp;
public TreeTraversalType DisposeTraversal
{
    get { return _DisposeTraversal; }
    set { _DisposeTraversal = value; }
}

Here is one way to implement IDisposable that includes a property indicating whether a node has been disposed, invokes a Disposing event, and traverses the tree according to the value of DisposeTraversal.

private bool _IsDisposed;
public bool IsDisposed
{
    get { return _IsDisposed; }
}

public void Dispose()
{
    CheckDisposed();
    OnDisposing();

    // in bottom-up traversal, dispose children before this node's payload
    if (DisposeTraversal == TreeTraversalType.BottomUp)
    {
        foreach (TreeNode<T> node in Children)
        {
            node.Dispose();
        }
    }

    // clean up the contained payload object (in the Value property)
    if (Value is IDisposable)
    {
        (Value as IDisposable).Dispose();
    }

    // in top-down traversal, dispose children after this node's payload
    if (DisposeTraversal == TreeTraversalType.TopDown)
    {
        foreach (TreeNode<T> node in Children)
        {
            node.Dispose();
        }
    }

    _IsDisposed = true;
}

public event EventHandler Disposing;

protected void OnDisposing()
{
    if (Disposing != null)
    {
        Disposing(this, EventArgs.Empty);
    }
}

public void CheckDisposed()
{
    if (IsDisposed)
    {
        throw new ObjectDisposedException(GetType().Name);
    }
}

I overrode ToString in the TreeNodeList class above to display the count of children.  I do the same thing for TreeNode, which as I mentioned earlier aids a great deal in debugging.

public override string ToString()
{
    string Description = string.Empty;

    if (Value != null)
    {
        Description = "[" + Value.ToString() + "] ";
    }

    return Description + "Depth=" + Depth.ToString() + ", Children=" + Children.Count.ToString();
}

Notice how the Value property, if it’s set, gets included in the ToString value.  If you’re looking at a TreeNode in the watch window, you’ll appreciate that your Value object can be represented without having to drill into anything.  You can see at a glance what the payload is, how deep it is in the tree, and how many child nodes it has.

The Role of Generics in Representing Payloads

I have already supplied enough logic and sample code for a fully functioning tree data structure for use in the .NET Framework, which incidentally was developed in a Compact Framework project and will therefore work in any .NET environment.  That being said, there are some syntactical inconveniences with the approach described so far.  Consider the Tree<Task> example above in a scenario where we want to access a parent node’s payload from the current node:

TreeNode<Task> CurrentTaskNode = /* get current task node */;
bool IsComplete = CurrentTaskNode.Parent.Value.Complete;

Note that Parent doesn’t return our object of type T, but instead gives us a TreeNode; and our CurrentTaskNode object is also a node object and not a task object.  When we think about trees in theory, especially visually in the form of diagrams, the parent of a node is another node, meaning that parent and child are the same type of thing.  In our simple implementation so far, however, the parent of a task is not another task, but rather a task-tree-node, which is not the same thing.

If we start from a Task object instead of a TreeNode, the syntax is worse:

Task CurrentTask = /* get current task */;
bool IsComplete = CurrentTask.Node.Parent.Value.Complete;

Notice how each time we access Node or Value, we’re weaving in and out between types.  So we must manipulate this data structure and its payload all the while with explicit deference to the type disparity and dualism, and a careful naming of variables is important to avoid confusion between types (CurrentTaskNode vs. CurrentTask).  When I first had this realization, it occurred to me why a Tree data type may have been missing from the original .NET Framework.  I doubt very much that all the brilliant developers working on the base class libraries didn’t think to include such a useful structure, but perhaps the obvious implementations that came to mind seemed confusing and problematic for real-world use.

Fortunately, we now have capabilities in the CLR as of 2.0—and corresponding language features—that enable us to solve this problem elegantly.  I’m referring to generics, and more specifically, to generic constraints.

The simplifying idea is that by creating a payload class that inherits from TreeNode, and applying a simple generic constraint to get around some casting problems that would normally prevent us from compiling, we can make our final syntax look like this:

Task MakeDinner = new Task();

Task PrepareIngredients = MakeDinner.Children.Add(new Task());

Task CookMeal = MakeDinner.Children.Add(new Task());
Task PreheatOven = CookMeal.Children.Add(new Task());
Task BakeAt350 = CookMeal.Children.Add(new Task());

Task Cleanup = MakeDinner.Children.Add(new Task());

bool IsAllDone = BakeAt350.Parent.Parent.Complete;

Notice that in the final statement, we don’t have to navigate from BakeAt350 to some different type object through a Node property, and that we can go directly from Parent to the Complete property.  This is because our Task class is defined like this now:

public class Task : TreeNode<Task>
{
    public bool Complete = false;

    // recursive
    public void MarkComplete()
    {
        // mark all children, and their children, etc., complete
        foreach (Task Child in Children)
        {
            Child.MarkComplete();
        }

        // now that all descendants are complete, mark this task complete
        Complete = true;
    }
}

It’s been compressed!  The Node property is no longer necessary (since the task is-a node, instead of being contained-by a node), and therefore the ITreeNodeAware interface can also be dispensed with.  The MarkComplete method is different, too: we simply write Child.MarkComplete instead of Child.Value.MarkComplete, and we can loop through a collection of Task objects with the Children property directly, instead of going through a set of TreeNode objects in some external Node object.

In our consuming code above, we didn’t even have to mention the Tree or TreeNode type, but if we did declare a variable that way (for the sake of clarity), it would simply be a synonym of Task.  It would look like this:

Tree<Task> MakeDinner = new Tree<Task>();

We could use TreeNode<Task> as well; it makes no difference.  Task = Tree<Task> = TreeNode<Task>.

To make all of this magic happen, we need to define TreeNode with a generics constraint.

public class TreeNode<T> : IDisposable where T : TreeNode<T>

This tells the compiler that T must inherit from TreeNode<T>, and therefore that variables of type T are safe to manipulate as tree nodes.  TreeNodeList requires the same constraint:

public class TreeNodeList<T> : List<TreeNode<T>> where T : TreeNode<T>

This breaks us out of a restrictive pattern where normally a custom collection class is unable to richly manipulate the objects within it (unless the collection is instantiated directly from consumer code, which is not the case in this pattern).  But because the collection class knows that all of its items will derive from TreeNode, it can manipulate them as tree nodes with no problem.

There is a price to pay for this convenience, however, which is that the tree is restricted to types that inherit from TreeNode.  This means that a Tree<string> is no longer possible.  You will have to decide how you will likely use trees in your development to determine whether to design something like this into them.  If you want this cleaner syntax but still need trees of primitive and other non-TreeNode-derived types, you can create a distinct SimpleTree for this purpose.  It may even be possible for Tree to inherit from SimpleTree, hiding the Value property (with a private shadow member, sketched below) and adding the generic constraint.
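Here is a minimal sketch of that private-shadow idea (SimpleTreeNode is hypothetical here, standing in for the unconstrained node type; this is an illustration of the technique, not the final library code):

public class SimpleTreeNode<T>
{
    public T Value { get; set; }
}

public class TreeNode<T> : SimpleTreeNode<T> where T : TreeNode<T>
{
    // a private shadow: code working through TreeNode<T> no longer sees
    // a Value property, because the node itself is now the payload
    private new T Value
    {
        get { return base.Value; }
        set { base.Value = value; }
    }
}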

Cleaning Up

Now that the basic functionality is in place, I decided to split the Tree class into two.  SimpleTree represents a simple tree data structure that can contain as its Value any object; and ComplexTree, which uses the generic constraint described above and supports more complex hierarchical algorithms and tree-aware nodes.  I really like the simplicity of the Tree name, but along with the need to support both simple and complex scenarios, there are two more reasons for this name-change decision.

First, in the System.Windows.Forms namespace, there is already a TreeNode class that corresponds to the TreeView control.  If I had designed that control, I probably would have named it VisualTree and its node VisualTreeNode to distinguish it from a logical node, but as it is, dealing with two different TreeNode classes, even in different namespaces, could be confusing and messy.  Second, the new Task Parallel Library (TPL) contains an implementation of a binary tree called Tree, which is a rather short-sighted name considering that not all useful trees are binary trees, as I’ve demonstrated in this article; BinaryTree would have been a much more appropriate name.  Hopefully by the time TPL is released, this identifier will be updated to reflect that.

Conclusion

Elegant implementations of tree data structures in the .NET Framework, though problematic in the past, are finally possible with the introduction of generics and generic constraints.  With some careful syntax planning for consuming code, as well as experimentation with different language constructs, I hope I have shed some light on how these kinds of problems can be approached and solved.

In future articles, I will be exploring some techniques and specific uses for hierarchical algorithms.  Some of the very problems that appear too difficult to solve without a background in mathematics can be much more easily understood and mastered by using tree data structures and associated visualization techniques.


Posted in Algorithms, Data Structures, Object Oriented Design, Problem Modeling, Software Architecture | 119 Comments »