Critical Development

Enterprise modeling, design, development, languages, and tools.

Archive for the ‘Data Structures’ Category

Better Tool Support for .NET

Posted by Dan Vanderboom on September 7, 2009

Productivity Enhancing Tools

Visual Studio has come a long way since its debut in 2002.  With the imminent release of 2010, we’ll see a desperately-needed overhauling of the archaic COM extensibility mechanisms (to support the Managed Package Framework, as well as MEF, the Managed Extensibility Framework) and a redesign of the user interface in WPF that I’ve been pushing for and predicted as inevitable quite some time ago.

For many alpha geeks, the Visual Studio environment has been extended with excellent third-party, productivity-enhancing tools such as CodeRush and Resharper.  I personally feel that the Visual Studio IDE team has been slacking in this area, providing only very weak support for refactorings, code navigation, and better Intellisense.  While I understand their desire to avoid stepping on partners’ toes, this is one area I think makes sense for them to be deeply invested in.  In fact, I think a new charter for a Developer Productivity Team is warranted (or an expansion of their team if it already exists).

It’s unfortunately a minority of .NET developers who know about and use these third-party tools, and the .NET community as a whole would without a doubt be significantly more productive if these tools were installed in the IDE from day one.  It would also help to overcome resistance from development departments in larger organizations that are wary of third-party plug-ins, due perhaps to the unstable nature of many of them.  Microsoft should consider purchasing one or both of them, or paying a licensing fee to include them in every copy of Visual Studio.  Doing so, in my opinion, would make them heroes in the eyes of the overwhelming majority of .NET developers around the world.

It’s not that I mind paying a few hundred dollars for these tools.  Far from it!  The tools pay for themselves very quickly in time saved.  The point is to make them ubiquitous: to make high-productivity coding a standard of .NET development instead of a nice add-on that is only sometimes accepted.

Consider just from the perspective of watching speakers at conferences coding up samples.  How many of them don’t use such a tool in their demonstration simply because they don’t want to confuse their audience with an unfamiliar development interface?  How many more demonstrations could they be completing in the limited time they have available if they felt more comfortable using these tools in front of the masses?  You know you pay good money to attend these conferences.  Wouldn’t you like to cover significantly more ground while you’re there?  This is only likely to happen when the tool’s delivery vehicle is Visual Studio itself.  Damon Payne makes a similar case for the inclusion of the Managed Extensibility Framework in .NET Framework 4.0: build it into the core and people will accept it.

The Gorillas in the Room

CodeRush and Resharper have both received recent mention in the Hanselminutes podcast (episode 196 with Mark Miller) and in the Deep Fried Bytes podcast (episode 35 with Corey Haines).  If you haven’t heard of CodeRush, I recommend watching these videos on their use.

For secondary information on CodeRush, DXCore, and the principles with which they were designed, I recommend these episodes of DotNetRocks:

I don’t mean to be so biased toward CodeRush, but this is the tool I’m personally familiar with, has a broader range of functionality, and it seems to get the majority of press coverage.  However, those who do talk about Resharper do speak highly of it, so I recommend you check out both of them to see which one works best for you.  But above all: go check them out!

Refactor – Rename

Refactoring code is something we should all be doing constantly to avoid the accumulation of technical debt as software projects and the requirements on which they are based evolve.  There are many refactorings in Visual Studio for C#, and many more in third-party tools for several languages, but I’m going to focus here on what I consider to be the most important refactoring of them all: Rename.

Why is Rename so important?  Because it’s so commonly used, and it has such far-reaching effects.  It is frequently the case that we give poor names to identifiers before we clearly understand their role in the “finished” system, and even more frequent that an item’s role changes as the software evolves.  Failure to rename items to accurately reflect their current purpose is a recipe for code rot and greater code maintenance costs, developer confusion, and therefore buggy logic (with its associated support costs).

When I rename an identifier with a refactoring tool, all of the references to that identifier are also updated.  There might be hundreds of references.  In the days before refactoring tools, one would accomplish this with Find-and-Replace, but this is dangerous.  Even with options like “match case” and “match whole word”, it’s easy to rename the wrong identifiers, rename pieces of string literals, and so on; and if you forget to set these options, it’s worse.  You can go through each change individually, but that can take a very long time with hundreds of potential updates and is a far cry from a truly intelligent update.

Ultimately, the intelligence of the Rename refactoring provides safety and confidence for making far-reaching changes, encouraging more aggressive refactoring practices on a more regular basis.

Abolishing Magic Strings

I am intensely passionate about any tool or coding practice that encourages refactoring and better code hygiene.  One example of such a coding practice is the use of lambda expressions to select identifiers instead of using evil “magical strings”.  From my article on dynamically sorting Linq queries, the use of “magic strings” would force me to write something like this to dynamically sort a Linq query:

Customers = Customers.Order("LastName").Order("FirstName", SortDirection.Descending);

The problem here is that “LastName” and “FirstName” are oblivious to the Rename refactoring.  Using the refactoring tool might give me a false sense of security in thinking that all of my references to those two fields have been renamed, leading me to The Pit of Despair.  Instead, I can define a function and use it like the following:

public static IOrderedEnumerable<T> Order<T>(this IEnumerable<T> Source,
    Expression<Func<T, object>> Selector, SortDirection SortDirection)
{
    return Order(Source, (Selector.Body as MemberExpression).Member.Name, SortDirection);
}

Customers = Customers.Order(c => c.LastName).Order(c => c.FirstName, SortDirection.Descending);

This requires a little understanding of the structure of expressions to implement, but the benefit is huge: I can now use the refactoring tool with much greater confidence that I’m not introducing subtle reference bugs into my code.  For such a simple example, the benefit is dubious, but multiply this by hundreds or thousands of magic string references, and the effort involved in refactoring quickly becomes overwhelming.

Coding in this style is most valuable when it’s a solution-wide convention.  So long as you have code that strays from this design philosophy, you’ll find yourself grumbling and reaching for the inefficient and inelegant Find-and-Replace tool.  The only time it really becomes an issue, then, is when accessing libraries that you have no control over, such as the Linq-to-Entities and the Entity Framework, which makes extensive use of magic strings.  In the case of EF, this is mitigated somewhat by your ability to regenerate the code it uses.  In other libraries, it may be possible to write extension methods like the Order method shown above.

It’s my earnest hope that library and framework authors such as the .NET Framework team will seriously consider alternatives to, and an abolition of, “magic strings” and other coding practices that frustrate otherwise-powerful refactoring tools.

Refactoring Across Languages

A tool is only as valuable as it is practical.  The Rename refactoring is more valuable when coding practices don’t frustrate it, as explained above.  Another barrier to the practical use of this tool is the prevalence of multiple languages within and across projects in a Visual Studio solution.  The definition of a project as a single-language container is dubious when you consider that a C# or VB.NET project may also contain HTML, ASP.NET, XAML, or configuration XML markup.  These are all languages with their own parsers and other language services.

So what happens when identifiers are shared across languages and a Rename refactoring is executed?  It depends on the languages involved, unfortunately.

When refactoring a C# class in Visual Studio, the XAML’s x:Class value is also updated.  What we’re seeing here is cross-language refactoring, but unfortunately it only works in one direction.  There is no refactor command to update the x:Class value from the XAML editor, so manually changing it causes my C# class to become sadly out of sync.  Furthermore, this seems to be XAML specific.  If I refactor the name of an .aspx.cs class, the Inherits attribute of the Page directive in the .aspx file doesn’t update.

How frequent do you think it is that someone would want to change a code-behind file for an ASP.NET page, and yet would not want to change the Inherits attribute?  Probably not very common (okay, probably NEVER).  This is a matter of having sensible defaults.  When you change an identifier name in this way, the development environment does not respond in a sensible way by default, forcing the developer to do extra work and waste time.  This is a failure in UI design for the same reason that Intellisense has been such a resounding success: Intellisense anticipates our needs and works with us; the failure to keep identifiers in sync by default is diametrically opposed to this intelligence.  This represents a fragmented and inconsistent design for an IDE to possess, thus my hope that it will be addressed in the near future.

The problem should be recognized as systemic, however, and addressed in a generalized way.  Making individual improvements in the relationships between pairs of languages has been almost adequate, but I think it would behoove us to take a step back and take a look at the future family of languages supported by the IDE, and the circumstances that will quickly be upon us with Microsoft’s Oslo platform, which enables developers to more easily build tool-supported languages (especially DSLs, Domain Specific Languages). 

Even without Oslo, we have seen a proliferation of languages: IronRuby, IronPython, F#, and the list goes on.  A refactoring tool that is hard-coded for specific languages will be unable to keep pace with the growing family of .NET and markup languages, and certainly unable to deal with the demands of every DSL that emerges in the next few years.  If instead we had a way to identify our code identifiers to the refactoring tool, and indicate how they should be bound to identifiers in other languages in other files, or even other projects or solutions, the tools would be able to make some intelligent decisions without understanding each language ahead of time.  Each language’s language service could supply this information.  For more information on Microsoft Oslo and its relationship to a world of many languages, see my article on Why Oslo Is Important.

Without this cross-language identifier binding feature, we’ll remain in refactoring hell.  I offered a feature suggestion to the Oslo team regarding this multi-master synchronization of a model across languages that was rejected, much to my dismay.  I’m not sure if the Oslo team is the right group to address this, or if it’s more appropriate for the Visual Studio IDE team, so I’m not willing to give up on this yet.

A Default of Refactor-Rename

The next idea I’d like to propose here is that the Rename refactoring is, in fact, a sensible default behavior.  In other words, when I edit an identifier in my code, I more often than not want all of the references to that identifier to change as well.  This is based on my experience in invoking the refactoring explicitly countless times, compared to the relatively few times I want to “break away” that identifier from all the code that references.

Think about it: if you have 150 references to variable Foo, and you change Foo to FooBar, you’re going to have 150 broken references.  Are you going to create a new Foo variable to replace them?  That workflow doesn’t make any sense.  Why not just start editing the identifier and have the references update themselves implicitly?  If you want to be aware of the change, it would be trivial for the IDE to indicate the number of references that were updated behind the scenes.  Then, if for some reason you really did want to break the references, you could explicitly launch a refactoring tool to “break references”, allowing you to edit that identifier definition separately.

The challenge that comes to mind with this default behavior concerns code that spans across solutions that aren’t loaded into the IDE at the same time.  In principle, this could be dealt with by logging the refactoring somewhere accessible to all solutions involved, in a location they can all access and which gets checked into source control.  The next time the other solutions are loaded, the log is loaded and the identifiers are renamed as specified.

Language Property Paths

If you’ve done much development with Silverlight or WPF, you’ve probably run into the PropertyPath class when using data binding or animation.  PropertyPath objects represent a traversal path to a property such as “Company.CompanyName.Text”.  The travesty is that they’re always “magic strings”.

My argument is that the property path is such an important construct that it deserves to be an core part of language syntax instead of just a type in some UI-platform-specific library.  I created a data binding library for Windows Forms for which I created my own property path syntax and type, and there are countless non-UI scenarios in which this construct would also be incredibly useful.

The advantage of having a language like C# understand property path syntax is that you avoid a whole class of problems that developers have used “magic strings” to solve.  The compiler can then make intelligent decisions about the correctness of paths, and errors can be identified very early in the cycle.

Imagine being able to pass property paths to methods or return then from functions as first-class citizens.  Instead of writing this:

Binding NameTextBinding = new Binding("Name") { Source = customer1; }

… we could write something like this, have access to the Rename refactoring, and even get Intellisense support when hitting the dot (.) operator:

Binding NameTextBinding = new Binding(@Customer.Name) { Source = customer1; }

In this code example, I use the fictitious @ operator to inform the compiler that I’m specifying a property path and not trying to reference a static property called Name on the Customer class.

With property paths in the language, we could solve our dynamic Linq sort problem cleanly, without using lambda expressions to hack around the problem:

Customers = Customers.Order(@Customer.LastName).Order(@Customer.FirstName, SortDirection.Descending);

That looks and feels right to me.  How about you?

Summary

There are many factors of developer productivity, and I’ve established refactoring as one of them.  In this article I discussed tooling and coding practices that support or frustrate refactoring.  We took a deep look into the most important refactoring we have at our disposal, Rename, and examined how to get the greatest value out of it in terms of personal habits, as well as long-term tooling vision and language innovation.  I proposed including property paths in language syntax due to its general usefulness and its ability to solve a whole class of problems that have traditionally been solved using problematic “magic strings”.

It gives me hope to see the growing popularity of Fluent Interfaces and the use of lambda expressions to provide coding conventions that can be verified by the compiler, and a growing community of bloggers (such as here and here) writing about the abolition of “magic strings” in their code.  We can only hope that Microsoft program managers, architects, and developers on the Visual Studio and .NET Framework teams are listening.

Posted in Data Binding, Data Structures, Design Patterns, Development Environment, Dynamic Programming, Functional Programming, LINQ, Language Innovation, Oslo, Silverlight, Software Architecture, User Interface Design, Visual Studio, Visual Studio Extensibility, Windows Forms | Leave a Comment »

Why Oslo is Important

Posted by Dan Vanderboom on January 17, 2009

imageContrary to common misunderstanding and speculation, the point of Oslo is not to put programming in the hands of business analysts who want to write their own business rules.  Do I think some of that will happen?  Architects and engineers will try everything they can imagine.  Some of them will succeed in specific niches or scenarios, but it won’t replace application or system design, and it will probably be very limited for the forseeable future.  Oslo is more about dramatically improving the productivity of designers and developers by generalizing common solution patterns and generating more adaptable tools.

PDC Keynote

Much of the confusion around Oslo occurs for two reasons:

  1. Oslo is designed at a higher level of abstraction than most systems today, so its scope is broad and it will have an impact on virtually every product, solution and service across Microsoft.  It’s difficult to get your head around something that big.
  2. Because of its abstract nature, core concepts are defined in terms that are heavily overloaded, like "Model", "Repository", and "Language".  Once you’ve picked up the lingo and can translate Oslo terminology into language you’re already familiar with, both the concept and magnitude of it will become obvious.

Oslo isn’t something completely new; in fact, Oslo borrows from a lot of previous research and even existing model-driven development tools.  Oslo focuses existing technologies and techniques into a coherent and mature vision of development, combining all parts into a more powerful whole, and promises to deliver a supremely adaptable and efficient platform to develop on.

What Is Oslo?

Oslo is a software factory for generating first-class, tool-supported languages out of your declarative specifications.

A factory is a highly organized production facility
that produces members of a product line
using standardized parts, tools and production processes.

-from a review of Software Factories

The product line is analogous to Oslo’s parsers, transform tools, and IDE plugins for new data models and languages (both textual and visual) that you define.  The standardized parts are Oslo’s library components; the tools are the M languages and the Quadrant/Intellipad application; and the processes are shaped by the flow of data through the Oslo tool chain (see the diagram near the end of this article).

With Oslo, you build the custom tools you need to rapidly build or generate software systems.  It’s all about using the right tool for the job, and having a say in how those tools are shaped to obtain the greatest leverage.

As stated at the home page of softwarefactories.com:

We see a capacity crisis looming. The industry continues to hand-stitch applications distributed over multiple platforms housed by multiple businesses located around the planet, automating business processes like health insurance claim processing and international currency arbitrage, using strings, integers and line by line conditional logic. Most developers build every application as though it is the first of its kind anywhere.

In other words, there’s already a huge shortage of experienced, highly-qualified professionals capable of ensuring the success of these increasingly complex systems, and with the need (and complexity) growing exponentially, our current development practices increasingly fall short of the total demand.

Books like Greenfield’s Software Factories have been advocating building at a higher level of abstraction for years, and my initial reaction was to see it as a natural, evolutionary milestone for a highly mature software system.  However, it’s an awful lot of focused development effort to attain such a level of maturity, and not many organizations are able to pull it off given the state of our current development platforms.

It’s therefore fortuitous that Microsoft teams have taken up the challenge of building these abilities into their .NET platform.  After all, that’s where it really belongs: in the framework.

Unexpected Awesomeness

Oslo of course contains a lot of expected awesomeness, but where it will probably have the most impact in terms of developer productivity is with new first-class languages and language tools.  Why?  It first helps to understand the world of data formats and languages.

We’ve had an explosion of data formats–these mini Domain Specific Languages, if you will (especially in the form of complex configuration files).  As systems evolve and scale, and the ways we can configure and compose our application’s behavior continues to grow, at what point do we perceive that configuration graph as the rich language that it becomes?  Or when our user interfaces evolve from Monolithic to Modular to Composite to Granular Composite (or User Composable), at what point does that persistent object graph become our UX DSL (as with XAML in WPF).

Sometimes we set our standards too low, or are slow to raise them when the time has come to do so.  With XML we get extensibility in defining languages and we think, "If we can parse it, then we can build a tool over it."  I don’t know about you, but I’d much rather work with rich client software–some kind of designer–over a textual data format any day.

But you know how things go: some company like Microsoft builds a whole bunch of cool stuff, driven off some XML configuration, or they unleash something like XAML on which WPF, WF, and more are built.  XAML is great for tools to read and write, and although XML and XAML are textual and not binary and therefore human readable in a text editor (the original intention behind that term), it’s simply not as easy to read as C# or VB.NET.  That’s why we aren’t all rushing to program everything in XAML.

Companies like Microsoft, building from the bottom up, release their platforms well in advance of the thick client user experiences that make them enjoyable to use and which encourages mass adoption.  Their models, frameworks, and applications are so large now that they’re released in massively differentiated stages, producing a technology adoption gap.

By giving that language a syntax other than XML, however, we can approach it in the same way we approach our program logic: in the most human readable and aesthetically-pleasant way we can devise, resembling our programming languages of choice.

Sometimes, the density of data and its structure in our model is such that a visual editor fails to represent that model well.  Source code is a case in point.  You could create a visual designer to visualize flow control, branching logic, and even complex expression building (like the iTunes Smart Playlist), but code in text format is more appropriate in this kind of scenario, and ends up being more efficient with the existing tooling available.  Especially with an IDE like Visual Studio, we’re working with human-millenia of effort that have gone into the great code editing tools we use today.  Oslo respects this need for choice by offering support for building both visual and textual DSLs, and recognizes the fluent definition of new formats and languages as the bridge to the next quantum leap in productivity.

If we had an easy way of defining languages in formats that we developers felt comfortable working with–as we’re comfortable with our general purpose languages and their rich tool support–then we’d be much more productive in the transition between a technology first being released and later having rich tool support over it.  WPF has taken quite a while to be adopted as much as it has, partly due to tool availability and maturity.  Before Expression Blend or Cider designers were released and hand-coding XAML was the only way, those who braved the angle brackets struggled with it.  As I play with Silverlight, I realize how much must still be done in XAML, and how we still struggle.  It’s simply not as nice to work with as my C# code.  Not as rich, and not as strongly tool-supported.

That’s one place Oslo provides value.  With the ability to define new textual and visual DSLs, rigorous verification and validation in a rich set of tools, the promise of Intellisense, colorization of keywords, operators, constants, and more, the Oslo architects recognize the ability to enhance our development experience in a language-agnostic way, raising the level of abstraction because, as they say, the way to solve any technical problem is to approach it at one higher level of indirection.  Unfortunately, this makes Oslo so generalized and abstract that it’s difficult to grasp and therefore to appreciate its immensity.  Once you can take a step back and see how it fits in holistically, you’ll see that it has the potential to dramatically transform the landscape of software development.

Currently, it’s a lot of work to implement all the language services in Visual Studio to give them as rich an experience as we’ve come to expect with C#, VB.NET, and others.  This is a serious impediment to doing this kind of work, so solving the problem at the level of Oslo drastically lowers the barrier to entry for implementing tool-supported languages.  The Oslo bits I’ve seen and played with are very early in the lifecycle for this massive scope of technology, but the more I think about its potential, the more impressed I am with the fundamental concept.  As Chris Anderson explained in his PDC session on MGrammar, MGrammar was an implementation detail, but sometime around June 2007, that feature team realized just how much customers wanted direct access to it and decided to release MGrammar to the world.

Modeling & The Repository

That’s all well and good for DSLs and language enthusiasts/geeks, but primarily perhaps, Oslo is about the creation, exploration, relation, and execution of models in an interoperable way.  In other words, all of the models that are currently used to describe a software system, or an entire IT environment, are either not encoded formally enough to verify or execute, or they’re encoded or stored in proprietary ways that don’t allow interoperability with other models.  A diagram in Visio or PowerPoint documenting network topology, for example, knows nothing about the component architecture or deployment model of the software systems installed and running on that network.

When people usually talk about models, they imagine high-level architecture documents, overviews used to visually summarize work that is much more granular in nature.  These models aren’t detailed, and they normally aren’t kept up to date and in sync with the current design as changes are made.  But modeling in Oslo is not an attempt to make these visual models contain all of the necessary detail, or to develop software with visual tools exclusively.  Oslo simply provides the tools, both graphical and textual, to define and relate many models.  It will be up to the development community to decide how all these tools are ultimately used, which parts of our systems will be specified in a mix of general purpose, domain specific, and visual languages.  Ultimately, Oslo will provide the material and glue to fill the gaps between the high and low level specifications, and unite them into a common, connected, and much more useful set of data.

To grasp what Oslo modeling is really all about requires that we expand our definition of "model", to see the models expressed in our configuration and XAML files, in our applications’ database schemas, in our entity classes, and so on.  As software grows in complexity and becomes more composable, we can use various languages to model its behavior, store that in the repository for runtime execution, inspection, or reuse by other systems.

This funny and clever Oslo video (reminiscent of The Hitchhiker’s Guide to the Galaxy) explains modeling in the broader sense alluded to here.

If we had some universal container for the storage of all different kinds of models, and a standardized way of relating entities across models, we’d be able to do things like impact analysis, where we could see the effect on software systems if someone were to alter the network it was running on; or powerful data mining on the IT execution environment of a business.

Many different tools, with different audiences, will be able to connect into this repository to manipulate aspects of the models that they understand and have access to.  This is just the tip of the iceberg.  We already model so much of what we do in the IT and software worlds, and as we begin adopting business process middleware and orchestration software like BizTalk, there’s a huge amount of value in those models converging and connecting.  That’s where the Oslo Repository comes in.

Oslo provides interoperability among models in the same way that SOA provides interoperability among services.  Not unlike the interoperability we have now among many different languages all sharing the same CLR specification.

Bridging data models across repositories or in shared repository is a major step forward.  With Windows Azure and Microsoft’s commitment to their online services platform (and considering the momentum of the SaaS movement with Amazon, Google, and others), shared storage and data sets are the future.  (Check out SQL Data Services if you haven’t already, and watch for some exciting announcements coming later this year!)

The Dichotomy of Data vs. Metadata

Jeff Pinkston from the Oslo team aptly reflects the attitude of the group when he scoffs at the categorical difference between data and metadata.  In terms of storing and querying it, serializing and communicating it, and everything else that matters in enterprise software, data is data and there’s no reason not to treat it the same when it comes to architecting a system.  We have our primary models and our secondary models, our shared models and our protected models, but they’re still just models that shape our software’s behavior, and they share all of the same characteristics when it comes to manipulation and access.  It’s their ultimate effect that differs.

It’s worth noting, I think, the line that’s been drawn between code and data in some programming languages and not in others (C# vs. LISP).  A division has been made for the sake of security rather than necessity.  Machine instruction codes are represented in the same sort of binary data and realized in the same digital circuitry as traditional user data.  It’s tempting to keep things locked down and divided, but as languages evolve to become more late bound and dynamic (and as the tools evolve to make this feasible), there will be more need for the manipulation of expression trees and ASTs.  I strongly suspect the lines will blur until they disappear.

Schema and Object Instance Languages

In order to define models, we need a tool.  In Oslo, this is a textual language called MShema and an editor called Intellipad.  I personally think it’s odd to talk people’s ears off about "model, model, model", and then to use the synonym "schema" to name the language, but all of these names could change before they’re shipped for all we know.

This is a simple example of an MSchema document:

module MyModel
{
    type Person
    {
        LastName : Text;
        FirstName : Text;
    }

    People : Person*;
}

By running this through the "M Compiler", a SQL script is generated that will create the appropriate database objects.  Intellipad is able to verify the correctness of your schema, and what’s really nice is that you don’t even have to specify data types when you start sketching out your model.  Defaults are assumed, and you can get more specific as your model evolves.

MGraph is a language for defining instances of objects, constrained by an MSchema and similar in format.  So MSchema is to MGraph what XSD is to XML.

In this article, Lars Corneliussen explains Microsoft’s vision to make MGraph as common as XML is today.  Take a look at his article to see a side-by-side comparison of the same object represented as XML (POX), JSON, and MGraph, and decide for yourself which you like best (or see below).

MSchema and MGraph are easier and more efficient to read and write than XML.  Their message format resembles typical structured programming languages, and developers are already familiar with these formats.  XML is a fine format for a tool; it’s human readable but not human-friendly.  A C-style language, on the other hand, is much more human-friendly than all of the angle brackets and the redundancy (and verbosity) of tag text.  That narrows down our choice to JSON and MGraph.

In JSON, the property/field/attribute names are delimited by quotation marks, suggesting that the whole structure is a dumb property bag.

{
    "LastName" : "Vanderboom",
    "FirstName" : "Dan"
}

MGraph has a very similar syntax, but its attribute property names are recognized and validated by the parser generated from MSchema, so the quotation marks are unnecessary.  It ends up looking more natural, and a little more concise.

{
    LastName : "Vanderboom",
    FirstName : "Dan"
}

Because MGraph is just a message format, and Microsoft’s service offerings already support multiple message formats (SOAP/POX/JSON/etc.), it wouldn’t disrupt any of their architecture to add an MGraph adapter, and I’ll be shocked if I don’t hear about one in their next release.

Meta-Languages and MGrammar

In the same way that Oslo includes a meta-model because it allows us to define models, it also includes a meta-language because it allows us to define languages (as YACC and ANTLR have done).  However, just as Pinkston doesn’t think data and metadata should be treated different, it makes sense to think of a language that defines languages as just another language.  There is something Zen about that, where the tools somehow seem to bend back upon themselves like one of Escher’s drawings.

DrawingHands

Here is an example language defined by MGrammar in a great article on MSDN called MGrammar in a Nutshell:

module SongSample
{
    language Song
    {
        // Notes
        token Rest = "-";
        token Note = "A".."G";
        token Sharp = "#";
        token Flat = "b";
        token RestOrNote = Rest | Note (Sharp | Flat)?;

        syntax Bar = RestOrNote RestOrNote RestOrNote RestOrNote;
        syntax List(element)
          = e:element => [e]
          | es:List(element) e:element => [valuesof(es), e];

        // One or more bars (recursive technique)
        syntax Bars = bs:List(Bar) => Bars[valuesof(bs)];
        syntax ASong = Music bs:Bars => Song[Bars[valuesof(bs)]];
        syntax Songs = ss:List(ASong) => Songs[valuesof(ss)];

        // Main rule
        syntax Main = Album ss:Songs => Album[ss];

        // Keywords
        syntax Music = "Music";
        syntax Album = "Album";

        // Ignore whitespace
        syntax LF = "\u000A";
        syntax CR = "\u000D";
        syntax Space = "\u0020";

        interleave Whitespace = LF | CR | Space;
    }
}

This is a pretty straight forward way to define a language and generate a parser.  Aside from the obvious keywords to define syntax rules and token patterns (with an alternative and more readable format for regular expressions), the => projection operator allows you to shape the MGraph output according to your needs.

I created two simple languages with MGrammar on the plane trip back to Milwaukee from the PDC in November.  The majority of my time was spent fussing with the editor, Intellipad, and for the last half hour I found it very easy to create a language on the fly, extending and changing it through experimentation quickly and easily.  Projections, which are functional expressions in MGrammar used to shape MGraph output, are the most challenging part.  There are a number of techniques that shape the output graph, so it will be good to see how this is approached in future reference examples.

Surreptitiously announced just before I wrote this, Mike Weinhardt at Microsoft indicated that a gallery of example grammars for MGrammar is being put together, to point to the sample grammars for various languages in addition to grammars that the community develops, and it should be available by the end of this month.  These examples demonstrating how to define languages and write sensible projections, coming from the developers who are putting MGrammar together, will be an invaluable tool for teaching you how to use common patterns (just as 101 LINQ Samples did for LINQ).

As Doug Purdy explained on .NET Rocks: "People who are building a domain specific language, and they don’t want to understand how to build a parser, or they’re not language designers.  Actually, they are language designers.  They design a language, but they actually don’t do the whole thing.  They don’t build a parser.  What they do, they just leverage the XML parser.  And what we’re trying to do is provide a toolset for folks where they don’t have to resort to XML in order to do DSLs."

From the same episode, Don Box said of the DSL session at PDC: "I’ve never seen a session with more geek porn in it."

Don: "It’s like crack for developers.  It’s kind of addictive; it takes over your life."

Doug: "If you want the power of Anders in your hand…"

The Tool Chain

Now that we have a better sense of what’s included in Oslo in terms of languages, editors, and the shared repository, we can look at the relationship among the other pieces, which are manifested in the CTP as a set of command-line tools.  In the future, these will integrate into an IDE, most likely Visual Studio.  (I’d expect Intellipad and Quadrant to merge with Visual Studio, but there’s no guaranty this will happen.)

When you create your model with MSchema, you’ll use m to validate that model and generate a SQL script to create a SQL Server 2008 database schema (yes, it only works right now with SQL Server 2008).  You’ll also use the m command to validate your object graph (written in MGraph) against your schema, and translate that into a set of SQL commands to perform inserts and updates against tables.

With enough models, there’ll be huge value in adding yours to the repository.  If you don’t mind writing MGraph or you generate it automatically with something like an MGraphSerializer class in your code, this may be all you need.

If, on the other hand, you decide you could really benefit by defining your own textual language to use instead of MGraph, you can use MGrammar to define a new language.  This language gets compiled by the mg compiler to create your parser, and the mgx command translates code in your new language into an MGraph, which can then be pulled into your database using m.

This diagram depicts the process:

image

Other than these command-line tools, Quadrant is the highly extensible visual tool for exploring models graphically, and Intellipad is a different face on the same shell for defining DSLs with MGrammar and writing DSL code, as well as writing and verifying MSchema and MGraph code.

We should see fairly soon the convergence of these three languages (MGraph, MSchema, and MGrammar) into a single M language.  This makes sense, since what you want to project in your DSL should be something within your model, verified by your schema.  This may ultimately make these projections much easier to write.

We’ll also see this tool chain absorbed into multiple development environments, eventually with rich binding across multiple representations of our model, although this will take longer in Visual Studio.

Languages and Nested Languages

I looked at some MService examples, and I can understand Damon’s concern that although it’s nice to have "operation" as a keyword in a service-oriented language, with more keywords giving you the ability to specify aspects of each endpoint and the communications patterns required, that enclosing the business logic within that service language is probably not a good idea.  I took this from Dennis van der Stelt’s blog:

service Service
{
  operation PhotoUpload(stream : Stream) : Text
  {
    .PostUriTemplate = "upload";

    index : Text = invoke DateTime.Now.Ticks.ToString();
    filename : Text = "d:\\demo\\photo\\" + index + ".jpg";
    invoke MService.ServiceHelper.StoreInFile(stream, filename);

    return index;
  }
}

Why not?  You’re defining a general purpose language within the curley braces, one capable of defining variables, assigning values, referencing .NET objects, and calling methods.  But why do you want to learn a new language to write services when the language you’re using right now is already supremely capable of that?  Don’t you already know a good syntax for invoking methods (other than "invoke %mehthod%")?  If instead you simply referenced an assembly, type, and method from an MService script, you could externally turn any .NET method with serializable parameters and return value into a service operation by feeding it this kind of file, without having to recompile, and without having to reinvent the wheel.

The possible exception would be if MGrammar adds the ability (as discussed by speakers at the PDC) of supporting multiple layers of enclosing languages within other languages.  In other words, you could use MService to define operations and their attributes using its own syntax, and within the curly braces that follow, use the C# or VB.NET parsers to process the logic with the comprehension of a separate language.  There are some neat possibilities here, but I expect the development community to be conservative and hesitent about mixing layers of semantics, as there is an awful lot of room for confusion and complexity.  It may be better to leave different language blocks in separate files or containers, and to allow them to reference each other as .NET assemblies and XML files reference each other today.

However, I wouldn’t get too hung up on the early versions of these new languages, or any one language specifically.  The useful, sensible ones that take real developer needs into account and provide the most value will be adopted, and many more will quickly fall into disuse.  But the overall pattern will be for the emergence of an amazing amount of leverage in terms of improving human comprehension and taking advantage of our ability to manipulate structured, symbolic object graphs to build and verify software systems.

Resources

After a few months of research and many hours of writing, I don’t feel like I’ve even scratched the surface.  But instead of giving you an absolutely comprehensive picture, I’m going to stop here and continue in future articles.  In the meantime, check out the following resources.

For an overview of the development paradigm, look for information on language-oriented programming, including an article I wrote that alludes to how "we will have to raise the level of abstraction to a point that may be hard for us to imagine with our existing tools and languages" due to the "precipitious growth of software complexity".  The "community of abstractions" is the model in Oslo-speak.

For Microsoft specific content: there were some great sessions at the PDC (watch the recorded videos).  It was covered (with much confusion) on the .NET Rocks! podcast (here and here) as well as on Software Engineering Radio; and there are lots of bloggers talking about their initial experiences with it, such as Shawn Wildermuth, Lars Corneliussen, and of course Chris Sells and Jeff Pinkston.  The most clear and coherent explanation I’ve heard was from an interview with Ron Jacobs and David Chappell (Ron gave the keynote at MSDN Dev Con, hosted the ARCast podcast for years).  MSDN has at least 29 videos on the Oslo Developer Center, where there’s a good amount of information. including a FAQ.  There’s also the online guide for MGrammar, MGrammar in a Nutshell, and the Oslo team blog.

If you’re interested in creating DSLs, make sure to keep a look out for details about the upcoming DSL Developers Conference, which is tentatively planned for April 16-17, immediately following the Lang.NET conference (on general purpose languages) on April 14-16.  I’m hoping to be at both this year.  And in case you haven’t heard, Microsoft is planning another PDC Conference for 2009, the first time ever these conferences have run for two consecutive years!  There will no doubt be much more Oslo news and conference material to cover it at the PDC in November.

Pluralsight, an instructor-led training company, now teaches a two-day "Oslo" Fundamentals course (and Don Box’s blog is hosted there).

The best way to learn about Oslo, however, is to dive in and use it.  That’s what I’m doing with my newest system, which needs to be modeled from scratch.  So if you haven’t done so already, download the Oslo SDK (link updated to January 2009 SDK) and introduce yourself to the future of modeling and development!

[Click here for the next article in this Oslo series, on common misconceptions and fallacies about Oslo.]

Posted in Data Structures, Development Environment, Distributed Architecture, Language Extensions, Language Innovation, Metaprogramming, Oslo, Problem Modeling, SQL Data Services, Service Oriented Architecture, Software Architecture, Visual Studio, Windows Azure | 33 Comments »

A KeyedList Implementation: Syntax Subtleties for an Intuitive API

Posted by Dan Vanderboom on November 28, 2008

[This article is a follow-up on the theme of data structures started in my article on a Non-Binary Tree Data Structure.] 

Clear API Design

When designing object models, there are often times when a Dictionary is the best choice for rapid lookup and access of items.  However, when attempting to make those object models as intuitive and simple as possible, sometimes the fact that a dictionary is being used is an implementation detail and needn’t be exposed externally.

Take this model, for example:

class Database
{
    public IEnumerable<Table> Tables;

    public Database()
    {
        _Tables = new List<Table>();
    }
}

class Table
{
    public string TableName;

    public Table(string TableName)
    {
        this.TableName = TableName;
    }
}

This is nice and simple.  I can loop through the Tables collection like this:

Database db = new Database();

foreach (var table in db.Tables)
{
    Console.WriteLine(table.TableName);
}

So it's too bad that the consumer end of this has to change when we use a dictionary.  By making a change to our Database type...

class Database
{
    public Dictionary<string, Table> Tables;

    public Database()
    {
        Tables = new Dictionary<string, Table>();
    }
}

... we now have to specify the "Values" collection to get an IEnumerable<T> of the correct type.

foreach (var table in db.Tables.Values)
{
    Console.WriteLine(table.TableName);
}

The consumers of our API may not care about iterating through key-value pairs, but now they have to remember to use this Values property or face the wrath of red squiggles and compiler errors when they forget.  Of course, there’s a pattern we could use to hide this dictionary inner goo from the outside.

class Database
{
    public IEnumerable<Table> Tables { get { return _Tables.Values; } }
    private Dictionary<string, Table> _Tables { get; set; }

    public Database()
    {
        _Tables = new Dictionary<string, Table>();
    }
}

Now from the outside, we can foreach over db.Tables, but inside we can use the Dictionary for fast access to elements by key.

The Need for a Dictionary-List Hybrid

This is an either-or approach: that is, it assumes that the API consumer is better off with an IEnumerable collection and won’t have any need for keyed access to data (or even adding data, in this case).  How can we have the best of both words, with the ability to write this kind of code?

Database db = new Database();

foreach (var table in db.Tables)
{
    Console.WriteLine(table.TableName);
}

Table t = db.Tables["Customers"];

This is a hybrid of a Dictionary and a List.  (Don’t confuse this with a HybridDictionary, which is purely a dictionary with runtime-adapting storage strategies.)  It provides an IEnumerable<T> enumerator (instead of IEnumerable<K, T>), as well as an indexer for convenient lookup by key.

There’s another aspect of working with dictionaries that has always bugged me:

db.Tables.Add("Vendors", new Table("Vendors"));

This is repetitive, plus it says the same thing twice.  What if I misspell my key in one of these two places?  What I’d really like is to tell my collection which property of the Table class to use, and have it fill in the key for me.  How can I do that?  Well, I know I can select a property value concisely (in a compiler-checked and refactoring-friendly way) with a lambda expression.  So perhaps I can supply that expression in the collection’s constructor.  I decided to call my new collection KeyedList<K, T>, which inherits from Dictionary so I don’t have to do all the heavy lifting.  Here’s how construction looks:

Tables = new KeyedList<string, Table>(t => t.TableName);

Now I can add Table objects to my collection, and the collection will use my lambda expression to fill in the key for me.

Tables.Add(new Table("Vendors"));

How does this work, exactly?  Here's a first cut at our KeyedList class:

public class KeyedList<K, T> : Dictionary<K, T>, IEnumerable<T>
{
    private Func<T, K> LinkedValue;

    public KeyedList()
    {
        LinkedValue = null;
    }

    public KeyedList(Func<T, K> LinkedValue)
    {
        this.LinkedValue = LinkedValue;
    }

    public void Add(T item)
    {
        if (LinkedValue == null)
            throw new InvalidOperationException("Can't call KeyedList<K, T>.Add(T) " +
                "unless a LinkedValue function has been assigned");

        Add(LinkedValue(item), item);
    }

    public new IEnumerator<T> GetEnumerator()
    {
        foreach (var item in Values)
        {
            yield return item;
        }
        yield break;
    }
}

This is still pretty simple, but I can think of one thing that it’s missing (aside from a more complete IList<T> implementation).  With a collection class like this, with tightly-integrated knowledge about the relationship between the key property of an item and the key in the Dictionary, what happens when we change that key property in the item?  Suddenly it doesn’t match the dictionary key, and we have to remember to update this in an explicit separate step in our code whenever this happens.  It seems that this is a great opportunity to forget something and introduce a bug into our code.  How could our KeyedCollection class track and update this for us?

Unfortunately, there’s no perfect solution.  “Data binding” in .NET is weak in my opinion, and requires implementation of INotifyPropertyChanged in our classes to participate; and when it does so, we only get notification of the property name that changed (supplied as a string), and have no idea what the old value was unless we store that somewhere ourselves.  Automatically injecting all classes with data binding code isn’t practical, of course, even using AOP (since many BCL classes, for example, reside in signed assemblies).  Hopefully a future CLR will be able to perform some tricks, such as intelligently and dynamically modifing those classes, for which other class’s data binding code specify interest, so we can have effortless and universal data binding.

Now back to reality.  I want to mention that although my code typically works just as well in Compact Framework as it does in Full Framework, I’m going off the reservation here.  I’m going to be using expression trees, which are not supported in Compact Framework at all.

Expressions

The Expression class (in System.Linq.Expressions) is really neat.  With it, you can wrap a delegate type to create an expression tree, which you can explore and modify, and at some point even compile into a function which you can invoke.  The best part is that lambda expressions can be assigned to Expression types in the same way that they can be assigned to normal delegates.

Func<int> func = () => 5;
Expression<Func<int>> expr = () => 5;

The first line defines a function that returns an int, and a function is supplied as a lambda that returns the constant 5.  The second line defines an expression tree of a function that returns an int.  This extra level of indirection allows us to take a step back and look at the structure of the function itself in a precompiled state.  The structure is a tree, which can be arbitrarily complex.  You can think of this as a way of modeling the expression in a data structure.  While func can be executed immediately, expr requires that we compile it by calling the Compile method (which generates IL for the method and returns Func<int>).

int FuncResult1 = func.Invoke();
int FuncResult2 = func();

int ExprResult1 = expr.Compile().Invoke();
int ExprResult2 = expr.Compile()();

The first two lines are equivalent, as are the last two.  I just wanted to point out here, with the two ways of calling the functions, how they are in fact the same, even though the last line looks funky.

Synchronizing Item & Dictionary Keys

So why do we need expressions?  Because we need to know the name of the property we’ve supplied in our KeyedList constructor.  You can’t extract that information out of a function (supplied as a lambda expression or otherwise).  But expressions contain all the metadata we need.  Note that for this synchronization to work, it requires that the items in our collection implement INotifyPropertyChanged.

class Table : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    private string _TableName;
    public string TableName
    {
        get { return _TableName; }
        set
        {
            _TableName = value;

            if (PropertyChanged != null)
                PropertyChanged(this, new PropertyChangedEventArgs("TableName"));
        }
    }

    public Table(string TableName)
    {
        this.TableName = TableName;
    }
}

This is tedious work, and though there are some patterns and code snippets I use to ease the burden a little, it’s still a lot of work to go through to implement such a primitive ability as data binding.

In order to get at the expression metadata, we’ll have to update our constructor to ask for an expression:

public KeyedList(Expression<Func<T, K>> LinkedValueExpression)
{
    this.LinkedValueExpression = LinkedValueExpression;
}

We’ll also need to define a field to store this, and a property will help to compile it for us.

private Func<T, K> LinkedValue;

private Expression<Func<T, K>> _LinkedValueExpression;
public Expression<Func<T, K>> LinkedValueExpression
{
    get { return _LinkedValueExpression; }
    set
    {
        _LinkedValueExpression = value;
        LinkedValue = (value == null) ? null : _LinkedValueExpression.Compile();
    }
}

Now that the groundwork has been set, we can hook into the PropertyChanged event if it’s implemented, which we do by shadowing the Add method.

public new void Add(K key, T item)
{
    base.Add(key, item);

    if (typeof(INotifyPropertyChanged).IsAssignableFrom(typeof(T)))
        (this[key] as INotifyPropertyChanged).PropertyChanged += new PropertyChangedEventHandler(KeyList_PropertyChanged);
}

One caveat about this approach: our shadowing method Add will unfortunately not be called if accessed through a variable of the base class.  That is, if you assign a KeyedList object to a Dictionary object, and call Add from that Dictionary variable, the Dictionary.Add method will be called and not KeyedList.Add, so synchronization of keys will not work properly in that case.  It’s extremely rare that you’d do such a thing, but I want to point it out regardless.  As inheritor of a base class, I would prefer the derived class be in fuller control of these behaviors, but we work with what we have.  I’ll actually take advantage of this later on in a helper method.

Finally, the tricky part.  We need to examine our lambda’s expression tree and extract the property or field name from it.  We’ll compare that to the property name reported to us as changed.  The comparison is actually done between two MemberInfo variables, which is possible because reflection ensures that only one MemberInfo object will exist for each member.  The MemberExpression object, which iniherits from Expression, possesses a Member property, and the other we get from typeof(T).GetMember.  Here’s what that looks like:

private void KeyList_PropertyChanged(object sender, PropertyChangedEventArgs e)
{
    var lambda = LinkedValueExpression as LambdaExpression;
    if (lambda == null)
        return;

    var expr = lambda.Body as MemberExpression;
    if (expr == null)
        return;

    MemberInfo[] members = typeof(T).GetMember(e.PropertyName, MemberTypes.Property | MemberTypes.Field, BindingFlags.Instance | BindingFlags.Public);
    if (members.Length == 0)
        throw new ApplicationException("Field or property " + e.PropertyName + " not found in type " + typeof(T).FullName);

    MemberInfo mi = members[0];
    if (mi == expr.Member)
    {
        // we don't know what the old key was, so we have to find the object in the dictionary
        // then remove it and re-add it
        foreach (var kvp in KeyValuePairs)
        {
            if ((typeof(T).IsValueType && kvp.Value.Equals(sender)) || (kvp.Value as object) == (sender as object))
            {
                T item = this[kvp.Key];
                Remove(kvp.Key);
                Add(item);
                return;
            }
        }
    }
}

This code makes an important assumption; namely, that a lambda expression will be used, which will contain a single field or property access.  It does not support composite or calculated keys, such as (t.SchemaName + “.” + t.TableName), though it’s possible.  I’m currently working on a method that recursively explores an Expression tree and checks for member access anywhere in the tree, to support scenarios like this.  For now, and for the purpose of this article, we’ll stick to the simple case of single member access.

I found that having access to the list of KeyValuePairs was actually useful in my code, and to keep the PropertyChanged handler concise, I added a new KeyValuePairs property to expose the base Dictionary’s enumerator, which you can find in the complete listing of the KeyedList class toward the end of this article.  I now have two iterators; and the way I’ve flipped it around, the default iterator of the base class has become a secondary, named iterator of KeyedList.

Here is a test program to demonstrate the functionality and flexibility of the KeyedList class.

class Program
{
    static void Main(string[] args)
    {
        // use a lambda expression to select the member of Table to use as the dictionary key
        var Tables = new KeyedList<string, Table>(t => t.TableName);

        // add a Table object to our KeyList like an old-fashioned Dictionary
        Tables.Add("Customers", new Table("Customers"));

        // add a Table object to our KeyList without explicitly specifying a key
        Tables.Add(new Table("Vendors"));

        // prove that a change in an item's key property updates the corresponding dictionary key
        Table Vendors = Tables["Vendors"];
        Vendors.TableName = "VENDORS";
        Console.WriteLine("Is 'Vendors' a valid dictionary key? " + Tables.ContainsKey("Vendors"));
        Console.WriteLine("Is 'VENDORS' a valid dictionary key? " + Tables.ContainsKey("VENDORS"));

        // prove that the IEnumerable<T> iterator works
        // note that we don't have to loop through Tables.Values
        foreach (var table in Tables)
        {
            Console.WriteLine(table.TableName);
        }

        Console.ReadKey();
    }
}

Complete Source for KeyedList

For your convenience, here is the complete listing of KeyedList.

public class KeyedList<K, T> : Dictionary<K, T>, IEnumerable<T>
{
    private Func<T, K> LinkedValue;

    private Expression<Func<T, K>> _LinkedValueExpression;
    public Expression<Func<T, K>> LinkedValueExpression
    {
        get { return _LinkedValueExpression; }
        set
        {
            _LinkedValueExpression = value;
            LinkedValue = (value == null) ? null : _LinkedValueExpression.Compile();
        }
    }

    public KeyedList()
    {
        LinkedValueExpression = null;
    }

    public KeyedList(Expression<Func<T, K>> LinkedValueExpression)
    {
        this.LinkedValueExpression = LinkedValueExpression;
    }

    public void Add(T item)
    {
        if (LinkedValue == null)
            throw new InvalidOperationException("Can't call KeyedList<K, T>.Add(T) " +
                "unless a LinkedValue function has been assigned");

        Add(LinkedValue(item), item);
    }

    public new void Add(K key, T item)
    {
        base.Add(key, item);

        if (typeof(INotifyPropertyChanged).IsAssignableFrom(typeof(T)))
            (this[key] as INotifyPropertyChanged).PropertyChanged += new PropertyChangedEventHandler(KeyList_PropertyChanged);
    }

    private void KeyList_PropertyChanged(object sender, PropertyChangedEventArgs e)
    {
        var lambda = LinkedValueExpression as LambdaExpression;
        if (lambda == null)
            return;

        var expr = lambda.Body as MemberExpression;
        if (expr == null)
            return;

        MemberInfo[] members = typeof(T).GetMember(e.PropertyName,
            MemberTypes.Property | MemberTypes.Field,
            BindingFlags.Instance | BindingFlags.Public);
        if (members.Length == 0)
            throw new ApplicationException("Field or property " + e.PropertyName + " not found in type " + typeof(T).FullName);

        MemberInfo mi = members[0];
        if (mi == expr.Member)
        {
            // we don't know what the old key was, so we have to find the object in the dictionary
            // then remove it and re-add it
            foreach (var kvp in KeyValuePairs)
            {
                if ((typeof(T).IsValueType && kvp.Value.Equals(sender))
                    || (!typeof(T).IsValueType && (kvp.Value as object) == (sender as object)))
                {
                    T item = this[kvp.Key];
                    Remove(kvp.Key);
                    Add(item);
                    return;
                }
            }
        }
    }

    public new IEnumerator<T> GetEnumerator()
    {
        foreach (var item in Values)
        {
            yield return item;
        }
        yield break;
    }

    public IEnumerable<KeyValuePair<K, T>> KeyValuePairs
    {
        get
        {
            // because GetEnumerator is shadowed (to provide the more intuitive IEnumerable<T>),
            // and foreach'ing over "base" isn't allowed,
            // we use a Dictionary variable pointing to "this"
            // so we can use its IEnumerable<KeyValuePair<K, T>>
            Dictionary<K, T> dict = this;
            foreach (var kvp in dict)
            {
                yield return kvp;
            }
            yield break;
        }
    }
}

Conclusion

As with the non-binary Tree data structure I created in this article, I prefer to work with more intelligent object containers that establish tighter integration between the container itself and the elements they contain.  I believe this reduces mental friction, both for author and consumer of components (or the author and consumer roles when they are the same person), and allows a single generic data structure to be used where custom collection classes are normally defined.  Additionally, by using data structures that expose more flexible surface areas, we can often reap the benefits of having powerful lookup features without locking ourselves out of the simple List facades that serve so well in our APIs.

The Achilles Heel of this solution is a weak and un-guaranteed data binding infrastructure combined with lack of support for Expressions in Compact Framework.  In other words, items have to play along, and it’s not platform universal.

Clearly, more work needs to be done.  Event handlers need to be unhooked when items are removed, optimizations could be done to speed up key synchronization, and more complex expressions could be supported without too much trouble.  Ultimately what I’d like to see is a core collection class with the ability to add and access multiple indexes (such as with a database), instead of presuming that a Dictionary with its solitary key is all we can or should use.  The hashed key of a Dictionary seems like a great adornment to an existing collection class, rather than a hard-coded stand-alone structure.  But I think this is a good start toward addressing some of the fundamental shortcomings of these existing approaches, and hopefully demonstrating the value of more intelligent collection and container classes.

Other Implementations

I was at a loss for a while as to what to call this collection class; I considered DictionaryList (and ListDictionary), as well as HashedList, before arriving at KeyedList.  To my amazement, I found several other implementations of the same kind of data structure with the same name, so it must be a good name.  The implementations here and here are more complete than mine, but neither auto-assign keys with a key-selection function or update dictionary keys using data binding, which ultimately is what I’m emphasizing here.  Hint: it wouldn’t be tough to combine what I have here with either of them.

Posted in Algorithms, Data Binding, Data Structures, Design Patterns, Object Oriented Design, Reflection | 3 Comments »

Technology Research Highlights: C5, YIIHAW, and MemJet

Posted by Dan Vanderboom on October 7, 2008

When I’m not designing or writing code, I’m usually doing some kind of research.  Looking for new gadgets, toys, technologies, groundbreaking research, or following my favorite blogs and podcasts.  Because I haven’t been highly focused on any single project lately, I thought I’d share some of the more interesting things I’ve come across.

Peter Sestoft, C5 Generics Collection Library, and YIIHAW

Peter Sestoft is a brilliant professor at the very old (founded 1479) Copenhagen IT University in Denmark, which actually has a student body consisting of a female majority (59%)!  I discovered Sestoft’s works while doing research for some new, more powerful collection class I was working on (Tree<T> and Set<T>).  He was featured on Channel 9 back in January on a show primarily about the C5 Collection Library.

C5 Generics Collection Library

The C5 Collection Library is an extremely powerful and well-designed library, based on earlier Java and Smalltalk collection library designs, and completely blows away the standard collection classes delivered as part of the .NET Framework.  An earlier version was covered well in a Dr. Dobb’s article, but to summarize: C5 is interface-based, provides convenient but hard-to-implement functionality (such as persistent sorted collections and sublist views), provides useful abstractions such as lists, sets, bags, priority queues, double-ended queues, stacks, dictionaries, and optimizes access of these data structures for asymptotic complexity.

Unless you’re an expert—as with encryption—if you find yourself creating your own complex data structures, chances are you’re going to do it wrong.  So it’s nice to know that the mathematical experts have taken the time to very carefully design and construct a wide assortment of useful data structures for the rest of us to use, complete with unit tests.  And the best part, in the spirit of true science, C5 has a very liberal open source license, so it can be used in open source and commercial software alike.  It’s reportedly being used by the military, video game shops, web server software, and more.

Sestoft and his colleagues have the excellent habit of providing copious documentation: a book’s worth, in fact!  It can be found online here.

I think it’s an excellent candidate for being included in the .NET Framework itself, and I highly recommend that you check it out the next time you need a fancy data structure and are tempted to write it yourself, and to consider for optimizing performance of existing code that relies heavily on data structures (which is almost everywhere).

Getting the fundamentals right is probably the hardest thing to do.  It’s easy (and tempting) to build as tall as we can, but getting a firm foundation in place is much more important.  That’s why the C5 Collections Library is so amazing.  So what are you waiting for?  Go download it!

YIIHAW Is an Intelligent and High-performing Aspect Weaver

What really gets me excited is Sestoft’s YIIHAW, a high-performance static aspect weaver for .NET.

Aspect Oriented Programming (AOP) is a paradigm for addressing cross-cutting concerns in software.  That is, logic that operates orthogonally to business logic, such as logging, security, configuration, and more.  There are a number of open source aspect weavers available today, and they generally fall into two camps: static or compile-time weavers, and dynamic or runtime weavers.

While incredibly useful, several factors have kept AOP from entering the mainstream.  For one, a good set of tools hasn’t yet emerged to enable easy debugging of aspects.  When we step through lines of code in Visual Studio, the debugger has no way of knowing if or when some other injected piece of code should pick up the thread of execution, since weaving isn’t a natively recognized and supported activity within the debugger.

Another hindrance is performance: aspect weavers typically inject logic (called advices) in the form of method calls at the beginning and/or ending of existing methods (the locations are called join points) which are selected with special selector statements (called pointcuts).  The typical trick, as I just alluded to, is to include a new method in the class (called an introduction) and then insert method invocation statements at the beginning and end of the methods to be modified.  This isn’t a big deal in many cases, but in performance-critical sections of calls, the additional method calls and the stack manipulations that accompany them prevent these techniques from being practical in performance-critical scenarios.

Data structures play a pivotal role in high-performance algorithms, of course, and Sestoft’s experience in building the C5 Collections Library provided the perfect context in which to consider how some of the specific features of his data structures might adversely affect performance.  For example: his collections fire events to inform any interested parties that a collection had an item inserted, deleted, or modified in some other way, eliminating the need for defining specialized collections that trigger specific actions outside of that collection (see my article on a Tree<T> structure I created that makes use of this).

If the C5 library could start with the basic, slimmed down, super high-performant data structures, and then “weave in” aspects to include features such as event notifications only when needed, the problem might be solved.  The problem with the typical approach, however, is that insertion of method calls provides as much or more overhead than the simple checks to activate those features to begin with (if not null, then invoke event), so he knew he’d have to go with a different approach.

After evaluating the capabilities and measuring the performance of existing open source aspect weavers, they were all found to be inadequate, and the YIIHAW thesis project was born.  You can find the fascinating thesis (which I just finished reading) here.  In a nutshell, YIIHAW employs an advanced technique called advice inlining, in which CIL instructions are woven directly into their target methods.  There are a surprising number of challenges to this approach, including the need to convert the original methods RET (return) instructions to jumps, remapping jump addresses, updating the method stack size, and much more.  The students who worked on this thesis (Rasmus Johansen and Stephan Spangenberg), under Sestoft’s direction, did a masterful job of overcoming all of these challenges, the result of which is a highly-capable static aspect weaver with no performance penalty!

This is truly amazing, and it enables aspects to be woven into an assembly repeatedly: one for feature A, another for feature B, etc., all with the assurance of type safety and performance equivalent to writing the code by hand with all features included in the first place.

Another positive of the YIIHAW thesis is the light it shines on CIL instructions, their organization and function.  The advice inlining techniques provide a great tutorial for those wanting a deeper understanding of how to manipulate things at the CIL level.  Many who have attempted AOP have either not considered this approach, or else turned away from this path due to the complexity and challenge.  Not so with the YIIHAW team, who faced the problem head on and accomplished something fundamentally great as a result.  The fantastic Mono.Cecil library, despite being poorly documented, is used for analysis and manipulation of CIL code, and several other ideas came to mind as I saw how this tool was put to use.

My only criticism of their application of this technology for the generation of specialized data structures at compile time is that you may find yourself needing different versions of the same data structure in different parts of your application.  In a class that performs analysis of visual data for extracting spatial cues, you may need a Tree with as little overhead as possible (i.e., no event notifications, no serialization capability), and in the user interface, you may need another Tree with those additional characteristics.  By making a single decision at compile/weave time, your options will be limited at runtime.

There’s also the issue of supporting the debugging process, which can’t be dismissed lightly.  But YIIHAW has gone to extreme measures to overcome some of AOP’s largest obstacles; and in addition to interceptions (weaving logic into existing methods), YIIHAW also supports introductions (adding new types and members), and typestructure modifications (changing a class’s base class, or making a class implement an interface).

Overall, YIIHAW’s potential is huge, and I believe that it won’t be long before the tooling catches up to the point where AOP becomes a mainstream approach to complex cross-cutting systems problems, simplifying designs and the engineering process overall.

Memjet Printer

Color printing is about to get a whole lot faster.  At 60 pages per minute (ppm), the Memjet printer, developed by Kia Silverbrook, a prolific inventor with over 1400 patents.  Check out the YouTube video to see it in action.

I’ve seen a lot of speculation that Memjet is a hoax, and a technological leap forward this large certainly warrants skepticism.  However, the technology was written about in a very respectable science magazine, one of my favorites: Scientific American, in June 2007.  It’s also been covered in several articles by PC Magazine, for example in May 2007, which mentions the Memjet appearing at the 2007 Global Ink Jet Printing Conference.  (Talk about specific conferences!)

The print heads are page-width, so they don’t have to move back and forth across the page.  The “MEM” in Memjet stands for microelectromechanical, and refers to the fabrication process that involves CMOS chips on a silicon wafer for precision layout.

With ink spraying at 900 million 1-2 picoliter drops per second, drop size small enough to dry within a second, and an estimated price tag under $200 (compared to the comparable HP Edgeline at $16,000), waiting around for long documents to print will soon be a thing of the past for everyone.  The first devices are expected to appear in 2009.

Posted in Algorithms, Aspect Oriented Programming, Data Structures, Design Patterns, Research | Leave a Comment »

Concurrency & Coordination With Futures in C#

Posted by Dan Vanderboom on July 3, 2008

A future is a proxy or placeholder for a value that may not yet be known, usually because the calculation is time consuming.  It is used as a synchronization construct, and it is an effective way to define dependencies among computations that will execute when all of their factors have been calculated, or in other words, to construct an expression tree with each node potentially computing in parallel.  According to the Wikipedia article on futures and promises, using them can dramatically reduce latency in distributed systems.

Damon pointed out that the Parallel Extensions library contains a Future<T> class, so I started looking around for examples and explanations of how they work, what the syntax is like, and I ran across a frightening example implementing the asynchronous programming model with Future<T>, as well as going in the other direction, wrapping an APM implementation with Future<T>.  Other articles give pretty good explanations but trivial examples.  From what I gathered briefly, the ContinueWith method for specifying the next step of calculation to process doesn’t seem to provide an intuitive way to indicate that several calculations may be depending upon the current one (unless it can be called multiple times?).  Using ContinueWith, you’re always specifying forward the calculation task that depends on the current future object.  It also surprised me a little that Future inherits from Task, because my understanding of a future is that it’s primarily defined as a value-holding object.  But considering that a future really holds an expression that needs to be calculated, making Future a Task doesn’t seem so odd.

So I decided to implement my own Future<T> class before looking at the parallel extensions library too deeply.  I didn’t want to prejudice my solution, because I wanted to make an exercise of it and see what I would naturally come up with.

Though I tried avoiding prejudice, I still wound up characterizing it in my head as an task, and thought that a future would simply be a pair of Action and Reaction methods (both of the Action delegate type).  The Action would execute and could do whatever it liked, including evaluate some expression and store it in a variable.  If the Action completed, the Reaction method (a continuation) would run, and these could be specified using lambdas.  Because I was storing the results in a local variable (result), swallowed up and made accessible with a closure, I didn’t see a need for a Value property in the future and therefore no need to make the type generic.  Ultimately I thought it silly to have a Reaction method, since anything you needed to happen sequentially after a successful Action, you could simply store at the end of the Action method itself.

FutureTask task = new FutureTask(
    () => result = CalculatePi(10),
    () => new FutureTask(
        () => result += "...",
        () => Console.WriteLine(result),
        ex2 => Console.WriteLine("ex2: " + ex2.Message)),
    ex1 => Console.WriteLine("ex1: " + ex1.Message));

The syntax is what I was most concerned with, and as I started playing around with nesting of futures to compose my calculation, I started to feel like I was onto something.  After all, it was almost starting to resemble some of the F# code I’ve been looking at, and I took that style of functional composition to be a good sign.  As you can see from the code above, I also include a constructor parameter of type Action<Exception> for compensation logic to run in the event that the original action fails.  (The result variable is a string, CalculatePi returns a string, and so the concatenation of the ellipsis really does make sense here.)

The problem that started nagging me was the thought that a composite computation of future objects might not be able to be defined all in one statement like this, not building the dependency tree from the bottom up.  You can really only define the most basic factors (the leaf nodes of a dependency tree) at the beginning this way, and then the expressions that depend directly upon those leaf nodes, etc.  What if you have 50 different starting future values, and you can only proceed with the next step in the calculation once 5 of those specific futures have completed evaluation?  How would you express those dependencies with this approach?

That’s when I started to think about futures as top-down hierarchical data container objects, instead of tasks that have pointers to some next task in a sequence.  I created a Future<T> class whose constructor takes an optional name (to aid debugging), a method of type Func<T> (which is a simple expression, supplied as a lambda in my examples), and finally an optional params list of other Future<T> objects on which that future value depends.

The first two futures in the code below start calculating pi (3.1415926535) and omega (which I made up to be a string of 9s).  They have no dependencies, so they can start calculating right away.  The paren future has two dependencies, supplied as two parameters at the end of the argument list: pi and omega.  You can see that the values pi.Value and omega.Value are used in the expression, which will simply surround the concatenated string value with parentheses and spaces.

var pi = new Future<string>("pi", () => CalculatePi(10));
var omega = new Future<string>("omega", () => CalculateOmega());

var paren = new Future<string>("parenthesize", () => Parenthesize(pi.Value + " < " + omega.Value), pi, omega);

var result = new Future<string>("bracket", () => Bracket(paren.Value), paren);

Finally, the result future has a dependency on the paren future.  This surrounds the result of paren.Value with brackets and spaces.  Because the operations here are trivial, I’ve added Thread.Sleep statements to all of these methods to simulate more computationally expensive work.

Dependencies Among Futures

The program starts calculating pi and omega concurrently, and then immediately builds the paren future, which because of its dependencies waits for completion of the pi and omega futures.  But it doesn’t block the thread.  Execution continues immediately to build the result future, and then moves on to the next part of the program.  When each part of the expression, each future, completes, it will set a Complete boolean property to true and invoke a Completed event.  Any attempt to access the Value property of one of these futures will block the thread until it (and all of the futures it depends on) have completed evaluation.

Furthermore, if an exception occurs, all of the futures that depend on it will no longer attempt to evaluate, and the exceptions will be thrown as part of an AggregateException when accessing the Value property.  This AggregateException contains all of the individual exceptions that were thrown as part of evaluating each future expression.  If both pi and omega fail, result should be able to hand me a list of all Exceptions below it in the tree structure that automatically gets formed.

There are two bits of code I added as icing on this cake.  The first is the use of the implicit operator to convert a variable of type Future<T> to type T.  In other words, if you have a Future<string> called result, you can now pass result into methods where a string parameter is expected, etc.  In the code listing at the end of the article, you’ll notice that I reference pi and omega instead of pi.Value and omega.Value (as in the code snippet above).

public static implicit operator T(Future<T> Future)
{
    return Future.Value;
}

The other helpful bit is an override of ToString, which allows you to hover over a future variable in debug mode and see its name (if you named it), whether it’s Complete or Incomplete, and any errors encountered during evaluation.

public override string ToString()
{
    return Name + ", " + (Complete ? "Complete" : "Incomplete") + (Error != null ? "Error=" + Error.Message : string.Empty);
}

Debug Experience of Future

What I’d really like to do is have the ability to construct this composite expression in a hierarchical form in the language, with a functionally composed syntax, replacing any parameter T with a Future<T>, something like this:

var result = new Future<string>("bracket", () => Bracket(
    new Future<string>("parenthesize", () => Parenthesize(
        new Future<string>("pi", () => CalculatePi(10))
        + " < " +
        new Future<string>("omega", () => CalculateOmega())
    ))
));

The Bracket and Parenthesize methods both require a string, but I give them an object that will at some point (“in the future”) evaluate to a string.  Another term used for future is promise, although there is a distinction in some languages that support both, but you can think in terms of giving those methods the promise that they’ll get a string later, at which time they can proceed with their own evaluation.  This effectively creates lazy evaluation, sometimes referred to as normal-order evaluation.

There are a few problems with this code, however.  First of all, though it’s composed functionally from the top down and returns the correct answer, it takes too long to do it: about 8 seconds instead of 4.  That means it’s processing all of the steps sequentially.  This happens because the future objects we’re handing to the Parenthesize and Bracket methods have to be converted from Future<string> to string before they can be evaluated in the expression, and doing that activates the implicit operator, which executes the Value property getter.  This completely destroys the asynchronous behavior we’re going for, by insisting on resolving it immediately with the wait built-into the Value property.  The string concatenation expression evaluates sequentially one piece at a time, and when that’s done, the next level up evaluates, and so on.

The solution is to declare our futures as factors we depend on at each level, which start them executing right away due to C#’s order of evaluation, and declare the operations we want to perform in terms of those predefined futures.  After a few hours of rearranging definitions, declaration order, and experimenting with many other details (including a brief foray into being more indirect with Func<Future<T>>), this is the working code I came up with:

Future<string> FuturePi = null, FutureOmega = null, FutureConcat = null, FutureParen = null;

var result = new Future<string>(
    () => Bracket(FutureParen),
    (FutureParen = new Future<string>(
        () => Parenthesize(FutureConcat),
        (FutureConcat = new Future<String>(
            () => FuturePi + " < " + FutureOmega,
            (FuturePi = new Future<string>(() => CalculatePi(10))),
            (FutureOmega = new Future<string>(() => CalculateOmega()))
        ))
    ))
);

In F# and other more functional languages, I imagine we could use let statements to define and assign these variables as part of the overall expression, instead of having to define the variables in a separate statement as shown here.

The Future<T> class I wrote works fairly well for exploration and study of futures and the possible syntax to define them and access their values, and I’ll share it so that you can experiment with it if you like, but understand that this is (even more so than usual) not production ready code.  I’m making some very naive assumptions, not taking advantage of any task managers or thread pools, there is no intelligent scheduling going on, and I haven’t tested this in any real world applications.  With that disclaimer out of the way, here it is, complete with the consuming test code.

using System;
using System.Collections.Generic;
using System.Threading;
using System.Linq;

namespace FutureExpressionExample
{
    class Program
    {
        static void Main(string[] args)
        {
            DateTime StartTime = DateTime.Now;

            Future<string> FuturePi = null, FutureOmega = null, FutureConcat = null, FutureParen = null;

            var result = new Future<string>("bracket",
                () => Bracket(FutureParen),
                (FutureParen = new Future<string>("parenthesize",
                    () => Parenthesize(FutureConcat),
                    (FutureConcat = new Future<String>("concat",
                        () => FuturePi + " < " + FutureOmega,
                        (FuturePi = new Future<string>("pi", () => CalculatePi(10))),
                        (FutureOmega = new Future<string>("omega", () => CalculateOmega()))
                    ))
                ))
            );

            /* Alternative

            // first group of expressions evaluating in parallel
            var pi = new Future<string>("pi", () => CalculatePi(10));
            var omega = new Future<string>("omega", () => CalculateOmega());

            // a single future expression dependent on all of the futures in the first group
            var paren = new Future<string>("parenthesize", () => Parenthesize(pi + " < " + omega), pi, omega);

            // another single future expression dependent on the paren future
            var result = new Future<string>("bracket", () => Bracket(paren), paren);

            */

            Console.WriteLine("Do other stuff while calculation occurs...");

            try
            {
                Console.WriteLine("\n" + result);
            }
            catch (AggregateException ex)
            {
                Console.WriteLine("\n" + ex.Message);
            }

            TimeSpan ts = DateTime.Now - StartTime;
            Console.WriteLine("\n" + ts.TotalSeconds.ToString() + " seconds");

            Console.ReadKey();
        }

        static string CalculatePi(int NumberDigits)
        {
            //throw new ApplicationException("Failed to calculate Pi");
            Thread.Sleep(3000);
            return "3.1415926535";
        }

        static string CalculateOmega()
        {
            //throw new ApplicationException("Failed to calculate Omega");
            Thread.Sleep(3000);
            return "999999999999999";
        }

        static string Parenthesize(string Text)
        {
            Thread.Sleep(500);
            return "( " + Text + " )";
        }

        static string Bracket(string Text)
        {
            Thread.Sleep(500);
            return "[ " + Text + " ]";
        }
    }

    public class Future<T> : IDisposable
    {
        public string Name { get; set; }
        public bool Complete { get; protected set; }
        public Exception Error { get; protected set; }

        protected Func<T> Expression { get; set; }

        protected List<Future<T>> Factors;
        protected List<Future<T>> FactorsCompleted;
        protected List<Future<T>> FactorsFailed;

        public event Action<Future<T>> Completed;
        protected void OnCompleted()
        {
            Complete = true;

            if (Completed != null)
                Completed(this);
        }

        private T _Value;
        public T Value
        {
            get
            {
                // block until complete
                while (!Complete)
                {
                    Thread.Sleep(1);
                }

                if (Exceptions.Count > 0)
                    throw new AggregateException(Exceptions);

                return _Value;
            }
            private set { _Value = value; }
        }

        public List<Exception> Exceptions
        {
            get
            {
                var list = new List<Exception>();

                foreach (Future<T> Factor in Factors)
                {
                    list.AddRange(Factor.Exceptions);
                }

                if (Error != null)
                    list.Add(Error);

                return list;
            }
        }

        public static implicit operator T(Future<T> Future)
        {
            return Future.Value;
        }

        // naming a Future is optional
        public Future(Func<T> Expression, params Future<T>[] Factors) : this("<not named>", Expression, Factors) { }

        public Future(string Name, Func<T> Expression, params Future<T>[] Factors)
        {
            this.Name = Name;
            this.Expression = Expression;
            this.Factors = new List<Future<T>>(Factors);

            FactorsCompleted = new List<Future<T>>();
            FactorsFailed = new List<Future<T>>();

            foreach (Future<T> Factor in this.Factors)
            {
                if (Factor.Complete)
                    FactorsCompleted.Add(Factor);
                else
                    Factor.Completed += new Action<Future<T>>(Factor_Completed);
            }

            // there may not be any factors, or they may all be complete
            if (FactorsCompleted.Count == this.Factors.Count)
                Expression.BeginInvoke(ReceiveCallback, null);
        }

        private void Factor_Completed(Future<T> Factor)
        {
            if (!FactorsCompleted.Contains(Factor))
                FactorsCompleted.Add(Factor);

            if (Factor.Error != null && !FactorsFailed.Contains(Factor))
                FactorsFailed.Add(Factor);

            Factor.Completed -= new Action<Future<T>>(Factor_Completed);

            if (Exceptions.Count > 0)
            {
                Dispose();
                OnCompleted();
                return;
            }

            if (FactorsCompleted.Count == Factors.Count)
                Expression.BeginInvoke(ReceiveCallback, null);
        }

        private void ReceiveCallback(IAsyncResult AsyncResult)
        {
            try
            {
                Value = Expression.EndInvoke(AsyncResult);
            }
            catch (Exception ex)
            {
                Error = ex;
            }

            Dispose();

            // computation is completed, regardless of whether it succeeded or failed
            OnCompleted();
        }

        public void Dispose()
        {
            foreach (Future<T> Factor in Factors)
            {
                Factor.Completed -= new Action<Future<T>>(Factor_Completed);
            }
        }

        // helpful for debugging
        public override string ToString()
        {
            return Name + ", " + (Complete ? "Complete" : "Incomplete") + (Error != null ? ", Error=" + Error.Message : string.Empty);
        }
    }

    public class AggregateException : Exception
    {
        public List<Exception> Exceptions;

        public AggregateException(IEnumerable<Exception> Exceptions)
        {
            this.Exceptions = new List<Exception>(Exceptions);
        }

        public override string Message
        {
            get
            {
                string message = string.Empty;
                foreach (Exception ex in Exceptions)
                {
                    message += ex.Message + "\n";
                }
                return message;
            }
        }
    }
}

Posted in Algorithms, Concurrency, Data Structures, Design Patterns, Functional Programming | Leave a Comment »

Introduction to Port-Based Asynchronous Messaging

Posted by Dan Vanderboom on April 21, 2008

…using terminology and examples from the Concurrency & Coordination Runtime in C#

 

Port-Based Messaging

Port-based messaging has been in practice for a long time.  In the realm of low-level electronics, components have always been operating in parallel, hardware interface ports are designed around standards, and messages are posted to those ports, queuing up somewhere until the receiver is ready to process them.  This pattern has worked extremely well in multiple domains for a long time, and its key characteristic is the decoupling of data flow from control flow.

In sequential programs, one statement runs after another each time it’s run, and the behavior is predictable and repeatable. Concurrency is difficult because you have to consider all possible interleavings of multiple simultaneous tasks, so the overall behavior is nondeterministic. Depending on the relative timings of concurrent tasks, you could get different results each time if you’re not careful to set the appropriate locks on shared resources.  Port-based messaging architectures isolate islands of state across different execution contexts, and connect them with safe, atomic messages delivered through pre-defined ports.

Basic port-based asynchronous messaging

The posting of a message to a port, as shown in Figure 1, is followed by some handler method that is receiving and processing messages.  What’s not evident in the diagram, however, is that while data flows into the port, that posting is a non-blocking call.  The sender continues on doing other work, taking the time only to queue up a message somewhere.

Queuing is important because, even with large thread pools, we can’t guaranty that a receiver will be listening at the very moment a message arrives.  Letting them queue up means the receiver doesn’t have to block on a thread to wait.  Instead, the data waits and the thread checks messages on the port when it can.

Port showing message queue

What is a Port?

So what exactly is a port?  A port is a communication end point, but not in the sense of “a web service on a physical server”.  Think much more fine grained than that, even more fine-grained than methods.  With sequential programming, we commonly use try-catch blocks and handle both the exceptional and non-exceptional results of operations within a single method.  In port-based programming, you post a message to a port, which results in some handler method running on the receiving end, and different results can be sent back on different callback ports depending on the type of message.  Instead of calling a method that returns back to you when it ends, port-based programming is about always moving forward, and unwinding a call stack has very little meaning here.

Sequence diagram of port-based messaging

We can see in the sequence diagram above (Figure 3) a collection of services that communicate with and depend on each other.  Starting from the top, the left-most service posts a message to port 1, and then goes on to do other work or surrenders its thread back to the dispatcher for other tasks that are waiting to run.  A registered method on port 1 runs, and the logic there needs another service to complete it’s task, so it posts a message on port 2, and also continues processing without waiting.  The path of solid blue arrow lines traces the message path for normal execution.  If anything goes wrong, an exception can be posted to a different callback port, shown with a red outline in the diagram.

This diagram shows one possible composition of services and data flow.  Port sets, which are simply a collection of related ports, are shown as callback receivers in pairs, but they can consist of any number of ports with any mixture of messages types, depending on the needs of the system being coordinated.  In this example, if anything goes wrong in the handler methods at ports 2, 5, or 6, an exception message will be routed to port 6, where another handler method can compensate for or report on the error.  Also note that while during startup this system may have to process data at port 4 before the logic at ports 5, 7, and 8 can run… once it gets going, there could be activity operating at many ports concurrently (not just one port per service).

Arbiters, Dispatchers, and DispatcherQueues

Now it’s time to peel away some of the layers of simplification presented so far.  (It may help to have a beer or glass of wine at this point.)

An arbiter is a rule (or set of rules) about when and how to process messages for a specific port (or set of ports).  (It is helpful to think of arbiter as a data flow or message flow coordinator.)  Should messages be pulled off the queue as soon as they arrive?  Should the software wait until 5 messages have arrived before processing them all as a group?  Should messages be checked according to a timer firing every 20 seconds?  Should logic be run only when two ports have messages waiting (a join)?  What logic should be run when one of these conditions occurs?  Can method handlers on three specific ports run concurrently until a message arrives on a fourth port, whose handler must run exclusively, and when done the other three can run again (interleave)?  These are just a few of the many coordination patterns that can be expressed with different types of arbiters (and hierarchically nested arbiters, which are ingenious).

Arbiters, Dispatchers, and DispatcherQueues

Figure 4 illustrates that an arbiter is associated with a port to monitor and a method to execute under the right conditions.  The logic of the arbiter, depending on its type and configuration, determines whether to handle the message.  It gets its thread from a thread dispatcher, which contains a thread pool.  (Not the same as System.Threading.ThreadPool, though, as there can only be one of those per process.)

The next diagram (figure 5) could represent a join coordination.  An arbiter waits for messages on two ports, and depending on how it’s defined, it may process messages from one port repeatedly, but as soon as it receives a message on the second port (it may be an exception port, for example), the whole arbiter might tear itself down so that no more handling on those port will occur.  As you are probably starting to see, composition and attachment of arbiters are key to controlling specific patterns of coordination in arbitrarily powerful and flexible ways.

Multiple DispatcherQueues and complex Arbiters.

In the Concurrency & Coordination Runtime, we can attach and detach these arbiters during runtime; we don’t have to define them statically at compile time.  There has been some criticism towards this approach because dynamic arbitration rules are much more difficult to verify formally with analysis, and are therefore difficult to optimize compilation and thread management for, but the advantages of this flexibility are enormous and the performance (see this paper by Chrystanthakopoulos and Singh) has been very impressive compared to conventional multithreading approaches.  Ultimately, it’s not about whether we can guaranty 100% that nothing will go wrong using only the mathematical models currently in our repertoire, but whether we can be productive with these techniques to release software that meets acceptable quality standards across a broad range of application domains.

I don’t think we’re going to find a better set of strategies to work with anytime soon, and when we’ve pushed this technology hard enough, the tactics will be fine tuned and we can start absorbing some of these coordination concepts into languages themselves (without sacrificing the dynamism that a library of composable parts provides).  People are going to attempt concurrent programming whether it’s safe or not, and using a library such as the CCR significantly reduces the risk of ad hoc multi-threading code.

Cooperative Multitasking

When mainstream operating systems like Windows took their first steps to support multi-tasking, cooperative versus preemptive multi-tasking was a common topic.  The idea of an operating system depending on applications to surrender control in a well-behaved way was generally and rightfully considered a bad idea.  Any kind of error or instability in software could easily bring down the entire operating system, and enforcing a quickly growing community of software vendors to share nicely wasn’t a realistic option.  Being preemptive meant that the OS could forcefully stop an application from running after giving it a small, measured slice of time, and then switch the thread to a new context where another application could run for another time slice.  Regardless of how poorly applications ran, as benevolent dictator, the OS could ensure a fair scheduling of processor time.

The solution encapsulated in the Concurrency & Coordination Runtime is, on the other hand, a cooperative multi-tasking strategy.  However, because it operates within the local scope of an individual process, and is isolated from other processes in the same OS, its risk of destabilizing the system is nonexistent.  This deep level of cooperation, in fact, is what gives the CCR its great performance.  When used correctly, which George Chrysanthakopoulos (in this video) and his colleagues have brilliantly put within our reach in the CCR library, threads don’t sit around waiting on some resource or for long-running operations to complete; instead, control is freely surrendered back to the thread pool, where it is quickly assigned to a new task.

Finally, by surrendering threads freely instead of holding onto them, a continuous flow of threads through the different activities of the system is maintained, and there is therefore always an abundance of them to handle new messages waiting on ports.  Existing threads are not wasted.  As the Tao Te Ching says:

If you want to accord with the Tao,
just do your job, then let go.

Control & Data Flow: Sequential vs. Concurrent

In sequential programs, stacks are used to unwind method calls and provide return values (return messages), and threads follow the data flow; whereas in port-based programming, threads are managed by one or more thread dispatchers that are capable of maximizing the use of that thread by making it available in a pool and sharing it with with many other (potentially unrelated) tasks.  Data flows orthogonally and according to a different coordination strategy than control flow.  This task-thread agnosticism (the opposite of thread-affinity) is similar to the statelessness of a web server such as IIS; one or more threads from a large pool are injected into the tasks of processing, rendering, and serving up huge numbers of web pages, after which those threads are recycled back into the thread pool for execution of other tasks for a highly concurrent and scalable service platform.

So herein lies the trick: in order to split this coupling between data flow and control flow, a different means is needed to compose the two coordination strategies.  In C# and other popular imperative programming languages, methods implicitly pass thread control along with data arguments (the message), and the use of the stack for method calls asserts constraints on control flow, so making the CCR work involves some interesting patterns.

That’s why port-based programming is hard to get your head around.  It’s such a large shift from common sequential logic and it requires some additional planning (and good visualizations).  It’s obviously important to have a good set of tools for expressing that coordination, a simple set of conceptual primitives that allows us to compose arbitrarily-complex coordination patterns.  These primitives, including Message, Port, PortSet, Dispatcher (thread pool), and others provide the building blocks that we can use to define these relationships.  Once we define what we want to happen with these building blocks, the CCR can make it all happen.

This level of coordination is a level beyond the strategies used by most concurrent applications and frameworks in software today, primarily because there hasn’t been a pressing need for it until recently–processors had been growing phenomenally in speed for many years.  Now, however, we’re told that processor speed has plateaued, that we now have to scale out to scale up, spreading the processing load across multiple cores.  We are very fortunate that the work being done by researchers in fields like robotics can be applied in other service oriented architectures, and is a testament to the widespread use of the .NET Framework and the fantastic efforts of some very bright individuals.

Where to Find Microsoft Robotics Studio

Robotics Studio is a free download and can be found here, and while it is full of other good stuff, it’s worth checking out for the Concurrency and Coordination Runtime alone.

Posted in Compact Framework, Concurrency, Data Structures, Design Patterns, Microsoft Robotics Studio, Object Oriented Design, Service Oriented Architecture, Software Architecture | Leave a Comment »

The Visitor Design Pattern in C# 3.0

Posted by Dan Vanderboom on April 9, 2008

I use many common design patterns on a regular basis–composite, MVC/MVP, adapter, strategy, factory, chain of command, etc.–but I’ve never come across a situation where I felt Visitor in the classic definition (GoF) made sense.  I had read about it, but the necessity of defining the interfaces for not only the Visitor classes (that’s not so bad) but also the elements being visited, makes it seem overly complex and therefore tainted for me.  What if you don’t own the source code to the elements, and don’t want to inherit from existing types (if they’re not sealed) just to implement an IVisitedElement interface?

I wanted a less intrusive way of visiting any set of objects, without making any special demands on or assumptions about their types, and I suspected that new features in C# 3.0 would provide a way to make it elegant and terse.  What’s needed in essence is to visit each object in a collection with a common function or object, and to perform some action, transform the object in some way, and/or calculate some end result (usually an aggregation).  Can we do that without having to implement special interfaces or disrupting the code model in place?

For the sake of completeness and to serve as a baseline for other implementations, I’ll show you what the classic Visitor pattern looks like.

UML for Visitor Design Pattern

Here is the code that corresponds to this diagram.

interface IEmployeeVisitor
{
    void Visit(Employee employee);
    void Visit(Manager manager);
}

interface IVisitorElement
{
    void Accept(IEmployeeVisitor Visitor);
}

class EmployeeCollection : List<Employee>
{
    public void Accept(IEmployeeVisitor Visitor)
    {
        foreach (Employee employee in this)
        {
            employee.Accept(Visitor);
        }
    }
}

class Employee : IVisitorElement
{
    public decimal Income;
    public Employee(decimal income) { Income = income; }

    public virtual void Accept(IEmployeeVisitor Visitor)
    {
        Visitor.Visit(this);
    }
}

class Manager : Employee, IVisitorElement
{
    public decimal Bonus;
    public Manager(decimal income, decimal bonus) : base(income) { Bonus = bonus; }

    public override void Accept(IEmployeeVisitor Visitor)
    {
        Visitor.Visit(this);
    }
}

class SumIncomeVisitor : IEmployeeVisitor
{
    public decimal TotalIncome = 0;
    public void Visit(Employee employee) { TotalIncome += employee.Income; }
    public void Visit(Manager manager) { TotalIncome += manager.Income + manager.Bonus; }
}

class Program
{
    static void Main()
    {
        EmployeeCollection employees = new EmployeeCollection();
        employees.Add(new Employee(100000));
        employees.Add(new Employee(125000));
        employees.Add(new Manager(210000, 35000));

        SumIncomeVisitor visitor = new SumIncomeVisitor();
        employees.Accept(visitor);
        decimal result = visitor.TotalIncome;

        Console.WriteLine(result);
        Console.ReadLine();
    }
}

 
The first major disadvantage is the amount of plumbing that must be in place, and the two-way dependencies created, between visitors and the objects to be visited.  Though specific types aren’t hard-coded, a conceptual two-way dependency implied by the interfaces’ knowledge of each other requires forethought and special accomodations on both sides from the beginning.  Management of dependencies is always important; how well we do so determines how applications become more complex as they grow.  So whenever possible I ensure that dependencies run in one direction.  This creates natural segmentation and layering, and ensures that components can be pulled apart from each other rather than congealing into something like a tangled ball of christmas tree lights.

Instead of passing a collection of Employee objects to some calculating Visitor, we tell the Employee to accept a Visitor object, which then just turns around and calls the Visitor.  That by itself seems rather indirect and convoluted.  Visiting a single element isn’t very exciting.  Nothing very interesting happens until you have a whole bunch of things to work with.  So in order to visit a collection, a custom collection type is defined with an Accept method that in turn calls Accept on each Employee.  This custom collection is yet another type we’re required to write when otherwise a List<Customer> or something similar would suffice.  And what happens when your data structure is something other than a basic list?  What if you have a tree of objects you’d like to visit?  Would you then have to implement a tree data structure that is visitor friendly?  How many aggregation types do you want to reinvent with visitation specifically in mind?

The rest of it isn’t so bad.  The SumIncomeVisitor class contains both the processing logic and state for any calculations needed by that Visitor.  One of these is instantiated (another extra step), passed to the collection’s Accept method, and therefore executed against all employees in the collection.  After all objects are visited, the SumIncomeVisitor object contains the final result.  This all works, but seems pretty klunky.  Perhaps the pattern is more interesting if IVisitorElement classes provide more sophisticated Accept implementations.  I can’t think of any examples off-hand but I’ll be thinking about and looking for these.

The code above is just shy of 80 lines long.  Can we accomplish exactly the same goal with less code, more simply and clearly?

class Employee
{
    public decimal Income;
    public Employee(decimal income) { Income = income; }
}

class Manager : Employee
{
    public decimal Bonus;
    public Manager(decimal income, decimal bonus) : base(income) { Bonus = bonus; }
}

class Program
{
    static void Main()
    {
        List<Employee> employees = new List<Employee>();
        employees.Add(new Employee(100000));
        employees.Add(new Employee(125000));
        employees.Add(new Manager(210000, 35000));

        decimal TotalIncome = 0;
        employees.ForEach(e => SumEmployeeIncome(e, ref TotalIncome));

        Console.WriteLine(TotalIncome);
        Console.ReadLine();
    }

    static void SumEmployeeIncome(Employee employee, ref decimal TotalIncome)
    {
        TotalIncome += employee.Income;

        if (employee is Manager)
            TotalIncome += (employee as Manager).Bonus;
    }
}

 
In this code, you’ll notice a few simplifications:
  1. There are no IVisitorElement or IEmployeeVisitor interfaces.
  2. Employee and Manager types exist without any knowledge of or explicit support for being visited.
  3. No custom collection is required, so a basic List<Employee> is used.

In order to make this work, we need the same basic things that we needed before: visiting/processing logic, and a place to store state for that processing.  In the second approach, the state is stored in the TotalIncome variable within the Main method, where the calculation is being requested, and the processing logic kept in another method of the same class.  I could have declared TotalIncome as a class variable, but I’d really like to restrict any “scratch pad” data used in a calculation to have as restricted a scope as possible.  In the classic Visitor pattern, the data is encapsulated with the processing logic.  By calling a method with a secondary ref parameter, I can declare TotalIncome within the Main method and avoid cluttering the class definition with data that’s only relevant to one method’s logic.  This is a lighter-weight, more in-line approach than defining separate types and having to instantiate a Visitor object (Visitor Object vs. Visitor Method).

The actual mechanism for visiting every object is the ForEach method.  The List<T> class includes a very useful ForEach method that allows you to pass in an Action<T> delegate to execute a method for each element.  ForEach can’t take a method with our second ref parameter; it can only accept an Action<T> delegate.  The lambda expression e => SumEmployeeIncome(e, ref TotalIncome) creates an anonymous method that does in fact match Action<T>.  The parameter e is of type Employee because the employees collection is List<Employee>, which means the Employee type is inferred for Action<Employee>.  The anonymous method represented by the lambda then calls SumEmployeeIncome, passing the Employee e object through as well as the TotalIncome state to be transformed on successive calls for each Employee.

Finally, SumEmployeeIncome acts as the Visitor.  Different logic can be performed for different types where inheritance is involved, as it is with this sample, by testing for types using the is operator.  This is in contrast to the dual Visit methods taking Employee and Manager types respectively.  Actually, the classic Visitor pattern could have used the same approach in this regard.
 
Where more complex state is needed for processing, a new Visitor-state type could be created to support the processing, and by using an object for this purpose, it wouldn’t be necessary to declare or pass the parameter by reference.  Another option would simply be to declare multiple ref parameters.
 

The List.ForEach method is awfully nice, but what if you’re working with another data structure, such as an array, an ArrayList, a LinkedList<T>, or even a Tree<T>?  Defining a simple extension method can provide this tool regardless of what kind of collection you’re working with.

public static void ForEach<T>(this IEnumerable<T> collection, Action<T> action)
{
    foreach (T item in collection)
    {
        if (action != null)
            action(item);
    }
}

That’s better.  Now if only that extension method had been defined in the first place, the specific one in the List<T> class wouldn’t be necessary.

There’s an even more succinct way to accomplish the specific example above, using the Sum extension method on IEnumerable<T>.

TotalIncome = employees.Sum(e => (e is Manager) ? (e as Manager).Bonus + e.Income : e.Income);

I don’t mind writing or reading code like this, and as more functional programming constructs are merged into C#, I think it’s important to flex these mental muscles and become familiar with the syntax, but one might argue that this is a little more cryptic than the preceding example.  If the calculation was any more complicated, it would make sense to use a statement lambda with curly braces instead of the shorter expression lambda shown above.  Here’s one way it could be written as a statement lambda:

TotalIncome = employees.Sum(e =>
    {
        decimal result = e.Income;

        if (e is Manager)
            result += (e as Manager).Bonus;

        return result;
    });

You can see that there is more opportunity here to perform other actions and participate in more complex calculations.  This approach is even lighter-weight than the second approach suggested above using a separate named method and external state passed by reference.  The approach you take should depend on the needs and constraints of the situation.  Lighter-weight approaches are good for ad-hoc processing, whereas the heavier approaches make more sense if the visiting logic needs to be reused in other places.

If we don’t need to share state across visitations of objects in a collection, we could simply use extension methods, which is the simplest option of all.  After all, the original intent of the Visitor pattern was to allow us to add functionality to a class without modifying the original element’s type.  According to dofactory.com:

Represent an operation to be performed on the elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates.

Extension methods “add” functionality to existing classes, or at least create a compelling illusion that they do.  Just reference an assembly and import the right namespace to add the operations.  It is possible to share state, but only by doing something like passing in some shared state object to each call.

If C# had the ability to define a variable that would be static to a defined closure in a method (which it doesn’t), we could use extension methods all the time without any drawbacks.  (I miss the static local variable feature of Turbo Pascal.)

Conclusion

With the use of lambda expressions and extension methods, we’ve been able to cut the amount of code for the Visitor pattern by more than half and found that there was no need to specially prepare the data model classes to support visitation.  While the classic Visitor pattern may have more potential in complex and custom Accept scenarios, in general the need to visit elements of a collection can be better accomplished with the judicious use of available language features in C# than by blindly following classic design patterns without consideration for how relevant they really are.

While I certainly encourage developers to become familiar with common patterns, I also encourage them to think carefully about the code they’re writing, and to ask themselves if it’s as clear and simple as it can be.  As software systems grow in size–sometimes becoming victims of their own success–small inefficiencies and muddied designs can snowball into unmanageability.  Apply a simple solution first, and then add complexity only when necessary.

Doubt and ask why constantly.  Be educated and familiar with the literature, but don’t dogmatically accept everything you read: think for yourself and hone your skills at every opportunity.  While the goals and forces in software tend to remain constant over time, the forms that made sense years ago may become unnecessary with today’s tools.

Posted in Algorithms, Data Structures, Design Patterns, Object Oriented Design, Problem Modeling, Software Architecture | 5 Comments »

Tree Data Structure Source Code Posted

Posted by Dan Vanderboom on April 3, 2008

For those of you who have been waiting for the source code to the Tree<T> article I wrote, I’m happy to announce that it’s finally available.  I had originally intended to supply some non-trivial examples of its use, but with my super busy schedule at work and otherwise, I’ve had to reduce the scope just to get it out there.  One of my examples is large enough to warrant one or more large articles, so I also didn’t want to just toss them in there without sufficient explanation as to how it all works.

While I work on getting those ready, check out Damon Payne’s article on using a non-binary tree for modeling dependencies among concurrently-running tasks with the Task Parallel Library.  This is a great use of the tree data structure.  It would be interesting to see how that would work for tasks with cross-branch dependencies.  Does the tree become a graph?  Would iteration become a garbage-collection-like traversal?  How would it need to respond to insertion of tasks or modification or dependencies during execution?

My non-binary tree article, which has been updated with a short section just before the conclusion, can be found here.  At the very top of the article is a link to the source code in the form of a Visual Studio 2008 solution.  For those of you with VS2005, you should be able to easily extract the code and create a VS2005 project out of it.  I specifically targeted .NET 2.0 instead of 3.5, and originally tested it against Compact Framework.

I’ll also be doing further development of this library, so if you have any cool feature requests, let me know.

Posted in Algorithms, Concurrency, Data Structures, My Software, Object Oriented Design, Problem Modeling, Software Architecture | 6 Comments »

Tree<T>: Implementing a Non-Binary Tree in C#

Posted by Dan Vanderboom on March 15, 2008

Download the Source

[This is the first article in a series of intelligent data structures, which is continued here with KeyedList.]

I’ve always thought it was odd that the .NET Framework never shipped with a Tree or Tree<T> class in its collection namespaces.  Most of the other classic data structures are there: List, Dictionary, Stack, Queue, and so on.  Where then is Tree<T>?  I have no idea, but finding myself in need of one, I decided to build one, and in doing so realized that it was a little trickier than I first imagined it would be.  Certainly it isn’t difficult to get some kind of tree working, even without support for generics, but building one that is truly intuitive to use was a real challenge.  In this article I will share what I came up with, and will illustrate the thought process behind each of the design decisions that I made as a study in object-oriented design and the use of some cool C# language features like generics constraints.

First, what is a tree data structure and what is it used for?  A tree is, in layman’s terms, a collection of nodes (items) that are connected to each other in such a way that no cycles (circular relationships) are allowed.  Examples abound in organization charts, family trees, biological taxonomies, and so forth.  Trees are incredibly useful structures because they allow us to represent hierarchies and to compose complex structures out of basic parts and simple relationships.  In the world of programming specifically, this provides a mechanism for extensibility across many domains, and a container to assist us in searching, sorting, compressing, and otherwise processing data in efficient and sophisticated ways.  Trees are also a natural fit for use in the composite design pattern, where a group of objects can be treated (or processed) like a single object.

Just as it’s possible to implement recursive algorithms non-recursively, it’s also possible to create non-hierarchical data structures to store data that would more logically be stored in a tree.  In these cases, your code is typically much more clear when organized in a tree, leading to greater maintainability and refactorability.  Recursion itself, while not exceptionally common in every day programming tasks, plays a more important role in some interesting ways with trees and traversal patterns.  Traversal is the term used to describe how nodes in a tree are visited.  Later I’ll demonstrate some tree traversal techniques that you can use.

Binary vs. Non-Binary Trees

What are binary trees, and why the focus on non-binary trees?  Binary trees are a special kind of tree in which each node can only have two children, often referred to as left and right child nodes.  These special trees are very useful for sophisticated searching, sorting, and other algorithms.  However, trees that allow any number of children seem to abound in more general, every-day programming scenarios, as you’ll see from examples below.  In a follow-up article, I’ll demonstrate how you can create a BinaryTree<T> class that inherits from Tree<T> and applies the necessary restrictions and some useful abstractions.  Below (figure 1) is a binary tree, which I present here to contrast with the non-binary trees that we’ll be covering in detail.

BinaryTree

Figure 1. A generic binary tree.

For more information on implementing and using binary trees, see this article which is part of a larger series on data structures.

 

Examples of Non-Binary Trees

A simple example of a non-binary tree is the file system in any modern operating system.  The diagram below (figure 2) illustrates a small part of the structure of the Windows file system.  Folders can be nested inside other folders as deeply as needed, providing the ability to compose a complex organizational structure for storing files.  The traversal path—from the root to some other node—can be represented by a canonical resource descriptor such as C:\Windows\system32.

 FileSystemTree

Figure 2. The file system is a tree structure of folders containing other folders.

 

This hierarchical structure can be represented and visualized in different ways, though the underlying relationships are the same.  The screenshot below (figure 3) is of Windows Explorer on my development computer.  The control is the common TreeView, which supplies a way in this case for users to explore and interact with the tree data structure of the file system.

FileSystemTreeView

Figure 3. A tree view control in Windows Explorer provides access to a hierarchy of nested folders.

 

Another example is the organization of controls in a form (for Windows Forms) or in a window (for WPF).  The next diagram (figure 4) depicts the relationship of controls containing child controls, which may in turn contain their own children, and so on.

AppWindowTree2

Figure 4. A user interface is composed of a tree of controls that contain other controls.

 

This is also manifested through a user interface with the Document Outline window in Visual Studio, which is very useful for selecting deeply nested controls, or container controls that are otherwise difficult to select in the forms designer itself.  This is shown in figure 5, where you can clearly see the different levels of all controls.

DocumentOutline

Figure 5. The document outline window in Visual Studio for a Windows Forms screen.

 

Defining the Basic Components

There is a lot of formal terminology in graph theory to contend with, but for our purposes, we only really need to be concerned with a few basic terms.

  • Tree – This refers to a collection of nodes connected by parent-child relationships in a hierarchical structure.
  • Parent – A node one level up from the current node.
  • Child – A node one level down from the current node.
  • Node – One item in a tree with an optional parent and zero or more children.
  • Root Node – A special node with no parent.  A tree can have only one root node.
  • Subtree – A section of a larger tree including the non-root node of a larger tree, plus all of its children.

That’s not so bad, is it?  In designing a reusable tree data structure, it’s important to establish a consistent and sensible pattern for semantics.  In fact, coming up with good names for the components of our data structure may be the most difficult part.  With poor identifiers, even simple structures can be confusing to those using it.

 

Begin with the End in Mind

I began with a quasi-test-driven development approach.  I asked myself how I’d like the end result to look when consuming the new data structure.  I knew from the start that a tree data structure that didn’t use generics wasn’t going to be very useful.  Imagine this code:

Tree

ObjectTree;

What is the type of each node in the tree?  If I’m going to use this in different scenarios, the only answer is to make it an object, so that I could store whatever I liked in my tree, but this would require a lot of casting and a lack of type checking at compile type.  So instead, using generics, I could define something like this:

Tree

<string> StringTree;

This is much better, and leads to two more steps that we’ll need to take conceptually.  The first one is that I will definitely want a tree of nodes for custom types in my software’s problem domain, perhaps Customer or MobileDevice objects.  Like strings, these objects are (for our purposes here) simply dumb containers of data in the sense that they are unaware of the tree structure in which they reside.  If we take this one level further, and consider custom types that are aware of their place within the tree (and can therefore particpate in much richer ways to compose hierarchical algorithms), we’ll need to consider how to make that awareness happen.  I will explain this in more detail later in this article.

public class Tree<T> : TreeNode<T>

{

    public Tree() { }

 

    public Tree(T RootValue)

    {

        Value = RootValue;

    }

}

This definition of Tree is really a matter of semantics and syntax preference.  I’m creating an alias for the TreeNode type, claiming that a Tree is a node itself—the root node in a Tree, by convention.  I call this a synonym typeHere’s a very simple example of its use:

Tree<string> root = new Tree<string>();

root.Value = "zero";

int d0 = zero.Depth;

TreeNode<string> one = zero.Children.Add("one");

int d1 = one.Depth;

TreeNode<string> twoa = one.Children.Add("two-a");

TreeNode<string> twob = one.Children.Add("two-b");

TreeNode<string> twoc = one.Children.Add("two-c");

string twocstr = twoc.Value;

int

d2 = two.Depth;

You can tell a few things by looking at this code:

  • The root node is defined as a Tree, but is manipulated like the other TreeNodes because it inherits from TreeNode.
  • Each node’s value is stored in a property called Value, which has a type of T (using generics).
  • A Depth property indicates how deeply in the tree the node is nested (the root has a depth of 0).
  • The Add method in the Children TreeNode collection returns a TreeNode object, making it easier to create a new node and get a handle to it in the same statement.

 

Connecting Nodes to Their Parents & Children

The intelligence needed to make a Tree work resides in two classes: TreeNode<T> and TreeNodeList<T>.  I could have used a standard collection type for TreeNodeList, but each TreeNode links to both Parent and Children nodes, which are two fundamentally different relationships; and I wanted parent and child nodes to connect and disconnect automatically behind-the-scenes—if you add a child to X, that child’s Parent property should automatically be set to X, and vice versa.  That requires hooking into the Add method in a custom collection class to manage those relationships.  TreeNodeList<T> therefore looks like this:

public class TreeNodeList<T> : List<TreeNode<T>>

{

    public TreeNode<T> Parent;

 

    public TreeNodeList(TreeNode<T> Parent)

    {

        this.Parent = Parent;

    }

 

    public new TreeNode<T> Add(TreeNode<T> Node)

    {

        base.Add(Node);

        Node.Parent = Parent;

        return Node;

    }

 

    public TreeNode<T> Add(T Value)

    {

        return Add(new TreeNode<T>(Value));

    }

 

    public override string ToString()

    {

        return "Count=" + Count.ToString();

    }

}

The ToString override is very important for making your debugging sessions more manageable.  Without this kind of assistance, especially when troubleshooting recursive, hierarchical algorithms, you may go crazy digging through the debugger’s own tree view control to find what you’re looking for.

 

The Tree Node

The TreeNode itself gets a little tricky.  If you update the Parent property because a node is going to be moved to another part of the tree, for example, you want to make sure that it gets removed from the Parent’s list of Children, and you also want to add it as a child to the new Parent.  If the Parent was null, or is being set to null, then only one of those operations (remove or add) is necessary.

Here are the primary structural and payload-containing portions of this early version of the TreeNode class:

public class TreeNode<T> : IDisposable

{

    private TreeNode<T> _Parent;

    public TreeNode<T> Parent

    {

        get { return _Parent; }

        set

        {

            if (value == _Parent)

            {

                return;

            }

 

            if (_Parent != null)

            {

                _Parent.Children.Remove(this);

            }

 

            if (value != null && !value.Children.Contains(this))

            {

                value.Children.Add(this);

            }

 

            _Parent = value;

        }

    }

 

    public TreeNode<T> Root

    {

        get

        {

            //return (Parent == null) ? this : Parent.Root;

 

            TreeNode<T> node = this;

            while (node.Parent != null)

            {

                node = node.Parent;

            }

            return node;

        }

    }

 

    private TreeNodeList<T> _Children;

    public TreeNodeList<T> Children

    {

        get { return _Children; }

        private set { _Children = value; }

    }

 

    private T _Value;

    public T Value

    {

        get { return _Value; }

        set

        {

            _Value = value;

 

            if (_Value != null && _Value is ITreeNodeAware<T>)

            {

                (_Value as ITreeNodeAware<T>).Node = this;

            }

        }

    }

}

There are two ways we could find the Root node.  The commented line shows a succinct way to walk the tree (recursively) toward successive parents until Parent is null.  The actual implementation being used shows another way, using a simple while loop.  I prefer this because in the event of debugging, it’s easy to step through a loop, and a little more difficult to jump through the same property on multiple, perhaps many, different instances of TreeNode.  I follow the same pattern for the Depth property (below).

A tree structure isn’t very useful, however, unless it can carry some kind of payload.  You want to build a tree of something, and it’s handy that we can use generics to tell the compiler that we want a Tree of strings (Tree<string>), for example.  That’s what the Value property is for, and why its type is the generic type parameter T.

To instantiate and initialize TreeNodes, you’ll need some constructors.  Here are two of them I defined:

public TreeNode(T Value)

{

    this.Value = Value;

    Parent = null;

    Children = new TreeNodeList<T>(this);

}

 

public TreeNode(T Value, TreeNode<T> Parent)

{

    this.Value = Value;

    this.Parent = Parent;

    Children = new TreeNodeList<T>(this);

}

 

The Tree Node Payload’s Awareness of its Place in the Tree

You probably noticed in the last section that the Value object is checked to see if it implements the ITreeNodeAware<T> interface.  This is an optional extensibility mechanism for custom classes that need to be aware of the tree so that payload objects can read or manipulate it in some way.  In developing a data binding framework for Windows Forms that allows you to bind control properties to paths (“PurchaseOrder.Customer.Name”) instead of specific objects (PurchaseOrder.Customer, “Name”), as ASP.NET and WPF data binding works, I needed this ability and came to the conclusion that this would be a useful feature in general.  Later in the article, I will magically transform the TreeNode and TreeNodeList classes in such a way that both this interface and the Value property become unnecessary.

Until then, here’s how the interface looks with an example class that uses it.

public interface ITreeNodeAware<T>

{

    TreeNode<T> Node { get; set; }

}

 

public class Task : ITreeNodeAware<Task>

{

    public bool Complete = false;

 

    private TreeNode<Task> _Node;

    public TreeNode<Task> Node

    {

        get { return _Node; }

        set

        {

            _Node = value;

 

            // do something when the Node changes

            // if non-null, maybe we can do some setup

        }

    }

 

    // recursive

    public void MarkComplete()

    {

        // mark all children, and their children, etc., complete

        foreach (TreeNode<Task> ChildTreeNode in Node.Children)

        {

            ChildTreeNode.Value.MarkComplete();

        }

 

        // now that all decendents are complete, mark this task complete

        Complete = true;

    }

}

Using the ITreeNodeAware<T> interface means we have another step to make in our implementation, and adds some complexity to its use in terms of discoverability and implementation of the interface by consumers of the Tree structure in creating custom payload classes.  By doing this, however, our Task objects will get injected with a Node property value when added to a Tree of Tasks.  So the payload object will point to the node via the Node property, and the Node will point to payload object via its Value property.  This is a lot of logic for such a simple relationship, but as we’ll see later, there is an elegant way around all of this.

 

Including Structure Helper Members

There are some common measurements of aspects of the tree’s nodes, as well as operations that you will typically want to perform on a tree or subtree, such as initialization, systematic traversal, pruning and grafting, disposal, and determination of depth, some of which I will discuss here.

Here is a property to determine your current nesting depth, which can be useful while debugging:

public int Depth

{

    get

    {

        //return (Parent == null ? -1 : Parent.Depth) + 1;

 

        int depth = 0;

        TreeNode<T> node = this;

        while (node.Parent != null)

        {

            node = node.Parent;

            depth++;

        }

        return depth;

    }

}

 

Because the payload objects (referenced by the Value property) may require disposing, the tree nodes (and therefore the tree as a whole) is IDisposable.  Different trees of objects may require being disposed in different orders, so I’ve created a TreeTraversalType, a DisposeTraversal property of this type to specify the order, and have implemented the Dispose method that takes this into consideration.

 

public enum TreeTraversalType

{

    TopDown,

    BottomUp

}

 

private TreeTraversalType _DisposeTraversal = TreeTraversalType.BottomUp;

public TreeTraversalType DisposeTraversal

{

    get { return _DisposeTraversal; }

    set { _DisposeTraversal = value; }

}

Here is one way to implement IDisposable that includes a property indicating whether a node has been disposed, invokes a Disposing event, and traverses the tree according to the value of DisposeTraversal.

private bool _IsDisposed;

public bool IsDisposed

{

    get { return _IsDisposed; }

}

 

public void Dispose()

{

    CheckDisposed();

    OnDisposing();

 

    // clean up contained objects (in Value property)

    if (Value is IDisposable)

    {

        if (DisposeTraversal == TreeTraversalType.BottomUp)

        {

            foreach (TreeNode<T> node in Children)

            {

                node.Dispose();

            }

        }

 

        (Value as IDisposable).Dispose();

 

        if (DisposeTraversal == TreeTraversalType.TopDown)

        {

            foreach (TreeNode<T> node in Children)

            {

                node.Dispose();

            }

        }

    }

  

    _IsDisposed = true;

}

 

public event EventHandler Disposing;

 

protected void OnDisposing()

{

    if (Disposing != null)

    {

        Disposing(this, EventArgs.Empty);

    }

}

 

public void CheckDisposed()

{

    if (IsDisposed)

    {

        throw new ObjectDisposedException(GetType().Name);

    }

}

 

I overrode ToString in the TreeNodeList class above to display the count of children.  I do the same thing for TreeNode, which as I mentioned earlier aids a great deal in debugging.

public override string ToString()

{

    string Description = string.Empty;

    if (Value != null)

    {

        Description = "[" + Value.ToString() + "] ";

    }

 

    return Description + "Depth=" + Depth.ToString() + ", Children="

      + Children.Count.ToString();

}

Notice how the Value property, if it’s set, gets included in the ToString value.  If you’re looking at a TreeNode<Thing> in the watch window, you’ll appreciate that your Value object can be represented without having to drill into anything.  You can see at a glance what the payload is, how deep it is in the tree, and how many child nodes it has.

 

The Role of Generics in Representing Payloads

I have already supplied enough logic and sample code for a fully functioning tree data structure for use in the .NET Framework, which incidentally was developed in a Compact Framework project and will therefore work in any .NET environment.  That being said, there are some syntactical inconveniences with the approach mentioned so far.  Consider the Tree<Task> example above in a scenario where we want to access a parent node’s payload from a current node:

TreeNode<Task> CurrentTaskNode = /* get current task node */;

bool IsComplete = CurrentTaskNode.Parent.Value.Complete;

 

Note that Parent doesn’t return our object of type T, but instead gives us a TreeNode<T>; and our CurrentTaskNode object is also a node object and not a task object.  When we think about trees in theory, especially visually in the form of diagrams, the parent of a node is another node, meaning that parent and child are the same type of thing.  In our simple implementation so far, however, the parent of a task is not another task, but rather a task-tree-node, which is not the same thing.

If we start from a Task object instead of a TreeNode<Task>, the syntax is worse:

Task CurrentTask = /* get current task */;

bool IsComplete = CurrentTask.Node.Parent.Value.Complete;

 

Notice how each time we access Node or Value, we’re weaving in and out between types.  So we must manipulate this data structure and its payload all the while with explicit deference to the type disparity and dualism, and a careful naming of variables is important to avoid confusion between types (CurrentTaskNode vs. CurrentTask).  When I first had this realization, it occurred to me why a Tree data type may have been missing from the original .NET Framework.  I doubt very much that all the brilliant developers working on the base class libraries didn’t think to include such a useful structure, but perhaps the obvious implementations that came to mind seemed confusing and problematic for real-world use.

Fortunately, we now have capabilities in the CLR as of 2.0—and corresponding language features—that enable us to solve this problem elegantly.  I’m referring to generics, and more specifically, to generic constraints.

The simplifying idea is that by creating a payload class that inherits from TreeNode<T>, and applying a simple generic constraint to get around some casting problems that would normally prevent us from compiling, we can make our final syntax look like this:

Task MakeDinner = new Task();

Task PrepareIngredients = MakeDinner.Children.Add(new Task());

Task CookMeal = MakeDinner.Children.Add(new Task());

Task PreheatOven = CookMeal.Children.Add(new Task());

Task BakeAt350 = CookMeal.Children.Add(new Task());

Task Cleanup = MakeDinner.Children.Add(new Task());

bool IsAllDone = BakeAt350.Parent.Parent.Complete;

 

Notice that in the final statement, we don’t have to navigate from BakeAt350 to some different type object through a Node property, and that we can go directly from Parent to the Complete property.  This is because our Task class is defined like this now:

public class Task : TreeNode<Task>

{

    public bool Complete = false;

 

    // recursive

    public void MarkComplete()

    {

        // mark all children, and their children, etc., complete

        foreach (Task Child in Children)

        {

            Child.MarkComplete();

        }

 

        // now that all decendents are complete, mark this task complete

        Complete = true;

    }

}

 

It’s been compressed!  The Node property is no longer necessary (since the task is-a node, instead of being contained-by a node), and therefore the ITreeNodeAware<T> interface can also be dispensed with.  The MarkComplete method is different, too: we simply write Child.MarkComplete instead of Child.Value.MarkComplete, and we can loop through a collection of Task objects with the Children property directly, instead of going through a set of TreeNode<Task> objects in some external Node object.

In our consuming code above, we didn’t even have to mention the Tree or TreeNode type, but if we did declare a variable that way (for the sake of clarity), it would simply be a synonym of Task.  It would look like this:

Tree<Task> MakeDinner = new Tree<Task>();

We could use TreeNode<Task> as well; it makes no difference.  Task = Tree<Task> = TreeNode<Task>.

 

To make all of this magic happen, we need to define TreeNode with a generics constraint.

 

public class TreeNode<T> : IDisposable where T : TreeNode<T>

 

This tells the compiler that T must inherit from TreeNode<T>, and therefore variables of type T are safe to manipulate as tree nodes.  TreeNodeList<T> requires the same constraint:

public class TreeNodeList<T> : List<TreeNode<T>> where T : TreeNode<T>

This breaks us out of a restrictive pattern where normally a custom collection class is unable to richly manipulate the objects within it (unless the collection is instantiated directly from consumer code, which is not the case in this pattern).  But because the collection class knows that all of its items will derive from TreeNode<T>, it can manipulate them as tree nodes with no problem.

There is a price to pay for this convenience, however, which is that the tree is restricted to types that inherit from TreeNode.  This means that a Tree<string> is no longer possible.  You will have to decide how you will likely use trees in your development to determine whether to design something like this into it.  If you want this cleaner syntax but still need trees of primitive and other non-TreeNode-derived types, you can create a distinct SimpleTree<T> for this purpose.  It may even be possible for Tree<T> to inherit from SimpleTree<T>, hiding the Value property (with a private shadow method shown below) and adding the generic constraint.

 

Cleaning Up

Now that the basic functionality is in place, I decided to split the Tree<T> class into two.  SimpleTree<T> represents a simple tree data structure that can contain as its Value any object; and ComplexTree<T>, which uses the generic constraint described above and supports more complex hierarchical algorithms and tree-aware nodes.  I really like the simplicity of the Tree<T> name, but along with the need to support both simple and complex scenarios, there are two more reasons for this name-change decision.

First, in the System.Windows.Forms namespace, there is already a TreeNode class that corresponds to the TreeView control.  If I had designed that control, I probably would have named it VisualTree and its node VisualTreeNode to distinguish it from a logical node, but as it is, dealing with two different TreeNode classes, even in different namespaces, could be confusing and messy.  Second, the new Task Parallel Library (TPL) contains an implementation of a binary tree called Tree<T>, which is a rather short-sighted name considering that all useful trees are not binary trees, as I’ve demonstrated in this article; BinaryTree<T> would have been a much more appropriate name.  Hopefully by the time TPL is released, this identifier will be updated to reflect this.

 

Conclusion

Elegant implementations of tree data structures in .NET Framework, though problematic in the past, are finally possible with the introduction of generics and generic constraints.  With some careful syntax planning for consuming code, as well as experimentation with different language constructs, I hope I have shed some light on how these kinds of problems can be approached and solved.

In future articles, I will be exploring some techniques and specific uses for hierarchical algorithms.  Some of the very problems that appear too difficult to solve without a background in mathmatics can be much more easily understood and mastered by using tree data structures and associated visualization techniques.

[This is the first article in a series of intelligent data structures, which is continued here with KeyedList.] 

Posted in Algorithms, Data Structures, Object Oriented Design, Problem Modeling, Software Architecture | 61 Comments »