Critical Development

Language design, framework development, UI design, robotics and more.

Archive for the ‘Design Patterns’ Category

The Archetype Language (Part 9)

Posted by Dan Vanderboom on October 3, 2010

Overview

This is part of a continuing series of articles about a new .NET language under development called Archetype.  Archetype is a C-style (curly brace) functional, object-oriented (class-based), metaprogramming-capable language with features and syntax borrowed from many languages, as well as some new constructs.  A major design goal is to succinctly and elegantly implement common patterns that normally require a lot of boilerplate code which can be difficult, error-prone, or just plain onerous to write.

You can follow the news and progress on the Archetype compiler on twitter @archetypelang.

Links to the individual articles:

Part 1 – Properties and fields, function syntax, the me keyword

Part 2 – Start function, named and anonymous delegates, delegate duck typing, bindable properties, composite bindings, binding expressions, namespace imports, string concatenation

Part 3 – Exception handling, local variable definition, namespace imports, aliases, iteration (loop, fork-join, while, unless), calling functions and delegates asynchronously, messages

Part 4 – Conditional selection (if), pattern matching, regular expression literals, agents, classes and traits

Part 5 – Type extensions, custom control structures

Part 6 – If expressions, enumerations, nullable types, tuples, streams, list comprehensions, subrange types, type constraint expressions

Part 7 Semantic density, operator overloading, custom operators

Part 8 – Constructors, declarative Archetype: the initializer body

Part 9 – Params & fluent syntax, safe navigation operator, null coalescing operators

Conceptual articles about language design and development tools:

Language Design: Complexity, Extensibility, and Intention

Reimagining the IDE

Better Tool Support for .NET

Params & Fluent Syntax

C# has a parameter modifier called params that allows you to supply additional function arguments to populate a single array parameter.

void Display(params string[] Names)
{
   
// …
}

Without the params modifier, we’d have to call it like this:

Display(new string[] { "Dan", "Josa", "Sarah" });

Because params is declared, we can do this instead:

Display("Dan", "Josa", "Sarah");

If there’s one thing you can take away from Archetype’s design, it’s that syntactic sugar is everything.  After examining my own procedural animation library (Animate.NET) to see how it could be used best in Archetype, I came to the conclusion that these params parameters can be substantial.  When they are, they create syntactic unpleasantries, especially when nested structures are involved.

Consider the following C# example.

var anim =

Animate.Wait(0.2.seconds(),

RedChip.MoveBy(0.4.seconds(), -40, 0),

RedChip.FadeIn(0.2.seconds()),

 

BlackChip.MoveBy(0.4.seconds(), 0, 40),

BlackChip.FadeOut(0.4.seconds())

)

.WhenComplete(a =>

{

MainStage.Children.Remove(RedChip);

MainStage.Children.Remove(BlackChip);

})

.Begin();


First, a quick explanation of the code.  Animate is a static class, and the Wait function returns an object called GroupAnimation that inherits from Animation.  After a 0.2 second wait, the following params list of Animation objects will execute.  RedChip and BlackChip are FrameworkElements (Silverlight/WPF objects), and animation commands such as MoveBy and FadeOut are extension methods on FrameworkElement.  Each of these animation commands returns an Animation-derived object.  The seconds() extension method on int and float types convert to TimeSpan objects.


The ultimate goal of this first Wait section of code is to define a set of animations—nested sets are possible, which form a tree of animations.  These trees can get more complicated than this, but we’ll keep the example simple for now.


Now for the criticism.  Look at the matching parentheses of the Wait function.  The normal TimeSpan parameter is listed as an equal along with the Animation parameter list, and what is being used as a complex, nested structure is holding up the closing parenthesis and dragging it down to the end of the entire list.  If only there were a cleaner way of treating this nested structure like constructor initializers (see Part 8).  These correspond, in terms of visual layout, to the attributes and the child elements of an XML node.


What else is wrong with this picture?  The .WhenComplete and .Begin functions are being invoked on the result of the previous expression.  It’s characteristic of fluent-style APIs to define functions (or extension methods) that operate on the result of the previous operation so they can be strung together into sentence-like patterns.  The dot before both WhenComplete and Begin look odd when appearing on lines by themselves, and the lambda expression would be better promoted to a proper code block.


Finally, it’s unfortunate that in declaring a new local variable, we have to indent the whole animation block this way. 
Here’s what the same code looks like in Archetype:


Animate
.Wait (0.2 seconds) -> anim

{

RedChip.MoveBy(0.4 seconds, -40, 0),

RedChip.FadeIn(0.2 seconds),

 

BlackChip.MoveBy(0.4 seconds, 0, 40),

BlackChip.FadeOut(0.4 seconds)

}

WhenComplete (a)

{

MainStage.Children.Remove(RedChip),

MainStage.Children.Remove(BlackChip)

}

Begin();


This is more like it.  Notice the declarative assignment (declaration + assignment) with –> anim on the first line, and the way the parentheses can be closed after the TimeSpan object (see Part 7 on custom operators for an explanation of the syntax “0.2 seconds”).  There’s no more need to indent the whole structure to make it line up nicely in an assignment.  The following initializer code block (in curly braces) supplies Animation object values to the params parameter in the Wait function, and the WhenComplete and Begin functions don’t require a leading dot to operate on the previous expression (Intellisense would reflect these options).

The Archetype code is much cleaner.  It’s easier to see where groups of constructs begin and end, enabling fluent-style APIs with arbitrarily-complicated nested structures to be easily constructed.  Let’s take a look at one more example with a more deeply nested structure:

 

Animate.Group –> anim

{

RedChip.MoveBy(0.4 seconds, -40, 0),

RedChip.FadeIn(0.2 seconds),

 

BlackChip.MoveBy(0.4 seconds, 0, 40),

BlackChip.FadeOut(0.4 seconds),

 

Animate.Wait (0.4 seconds)

{

Animate.CrossFade(1.5 seconds, RedChip, BlackChip),

BlackChip.MoveTo(0.2 seconds, 20, 150)

}

}


Here, a GroupAnimation is defined that contains, as one of its child Animations, another GroupAnimation (created with the Wait function).  The animation isn’t started in this case, so anim.Begin() can be called later, or anim could be composed into a larger animation somewhere.  A peek at the function headers for Group and Wait functions should make the ease and power of this design clear.

static Animate object

{

// a stream is the only parameter

Group GroupAnimation (Animations Animation* params)

{

}

 

// a stream is the last parameter, so [ list ] syntax can still be used

Wait GroupAnimation (WaitTime TimeSpan, Animations Animation* params)

{

}

}

Because the class is static, individual members are assumed to be static as well.

 

The easiest way to support this would be to allow this initializer block to be used with a params parameter that’s declared last.

         

Null Coalescing Operators

The null coalescing operator in C# allows you to compare a value to null, and to supply a default value to use in its place.  This is handy in scenarios like this:

var location = (cust.Address.City ?? "Unknown") + ", " + (cust.Address.State ?? "Unknown");

 

Chris Eargle makes a good point in his article suggesting a “null coalescing assignment operator” when making assignments such as:

cust.Address.City = cust.Address.City ?? "Unknown";

 

There should be a way to eliminate this redundancy.  By combining null coalescing with assignment, we can do this:

cust.Address.City ??= "Unknown";

Groovy’s Elvis Operator serves a similar role, but operates on a value of false in addition to null.

Safe Navigation Operator

There are many situations where we find ourselves needing to check the value of a deeply nested member, but if we access it directly without first checking whether each part of the path is null, we get a NullReferenceException.

var city = cust.Address.City;

 

If either cust or Address are null, an exception will be thrown.  To get around this problem, we have to do something like this in C#:

string city = null;

 

if (cust != null && cust.Address != null)

city = cust.Address.City;

 

The && operator is short-circuiting, which means that if the first boolean expression evaluates to false, the rest of the expression—which would produce a NullReferenceException—never gets executed.  As tedious as this is, without short circuiting operators, our error-prevention code would be even longer.

Jeff Handley wrote a clever safe navigation operator of sorts for C#, using an extension method called _ that takes a delegate (supplied as a lambda).  You can find that code hereIn his code, he does return a null value when the path short circuits.  As you can see, however, the limitations of C# cause this simple example to get confusing quickly, which you can see if we make City a non-primitive object as well:

var city = cust._(c => c.Address._(a => a.City.Name));

Groovy implements a Safe Navigation Operator in the language itself, which is cleaner:
 

var city = cust?.Address.City;

 

This is equivalent to the more verbose code above.  Archetype takes a similar approach:

var city = cust..Address.City;

Because of the .. operator in this member access expression, the type of the city variable is Option<string> (more on Option types).  If the path leading up to City is invalid (because Address is null), the value of city will be None.  This works the same as Nullable<T>, except that None means “doesn’t have a value; not even null”.

I like to think of None as the “mu constant”.  What is mu?  It’s the Japanese word that variously means “not”, “doesn’t exist”, etc., and is illustrated by the well-known Zen Buddhist koan:

A monk asked Zhaozhou Congshen, a Chinese Zen master (known as Jōshū in Japanese), "Has a dog Buddha-nature or not?" Zhaozhou answered, "Wú" (in Japanese, Mu)

The Gateless Gate, koan 1, translation by Robert Aitken

Yasutani Haku’un of the Sanbo Kyodan maintained that "the koan is not about whether a dog does or does not have a Buddha-nature because everything is Buddha-nature, and either a positive or negative answer is absurd because there is no particular thing called Buddha-nature.

In other words, Mu has often been used to mean “I disagree with the presuppositions of the question.”

There are a few basic patterns around options, nullable objects, and safe navigation that occur frequently, so I’ll outline them here with examples:

// if Address is null, this evaluates to false

if (cust..Address.City == "Milwaukee")

WorkHarder();

 

// if City is None because Address is null, set to "Address Missing"; otherwise, get the city text

var city = cust..Address.City ?! "Address Missing";

 

// if City is Some<string> and City == null, set to empty string

var city = cust..Address.City ?? string.Empty;

 

// if Address is null (City is None), set to "Address Not Found";

// but if City == null, set to empty string

var city = cust..Address.City ?! "Address Not Found" ?? string.Empty;

 

// if Address points to an object, leave it alone; otherwise, create a new object

cust.Address ??= new Address(City="Milwaukee");

 

// an assertion

cust..Address ?! new Exception("Address missing");

 

// set the city if possible, throw a specific exception if not

var city = cust..Address.City ?! new Exception("Address missing");

 

Summary

By now it should be obvious that Archetype aims to liberate the developer from the constraints and inefficiencies of ordinary programming languages.  It is designed with modern practices in mind such as fluent-style development and declarative object graph construction.

This article wraps up the material started in Part 8 on declarative programming in Archetype.  In addition, I introduced the safe navigation and null coelescing operators.  These are simple but powerful language elements for cleanly and succinctly specifying common idioms that come up in daily coding.

Posted in Animation, Archetype Language, Composability, Design Patterns, Fluent API, Language Innovation | 4 Comments »

The Archetype Language (Part 8)

Posted by Dan Vanderboom on October 1, 2010

Overview

This is part of a continuing series of articles about a new .NET language under development called Archetype.  Archetype is a C-style (curly brace) functional, object-oriented (class-based), metaprogramming-capable language with features and syntax borrowed from many languages, as well as some new constructs.  A major design goal is to succinctly and elegantly implement common patterns that normally require a lot of boilerplate code which can be difficult, error-prone, or just plain onerous to write.

You can follow the news and progress on the Archetype compiler on twitter @archetypelang.

Links to the individual articles:

Part 1 – Properties and fields, function syntax, the me keyword

Part 2 – Start function, named and anonymous delegates, delegate duck typing, bindable properties, composite bindings, binding expressions, namespace imports, string concatenation

Part 3 – Exception handling, local variable definition, namespace imports, aliases, iteration (loop, fork-join, while, unless), calling functions and delegates asynchronously, messages

Part 4 – Conditional selection (if), pattern matching, regular expression literals, agents, classes and traits

Part 5 – Type extensions, custom control structures

Part 6 – If expressions, enumerations, nullable types, tuples, streams, list comprehensions, subrange types, type constraint expressions

Part 7 Semantic density, operator overloading, custom operators

Part 8 – Constructors, declarative Archetype: the initializer body

Part 9 – Params & fluent syntax, safe navigation operator, null coalescing operators

Conceptual articles about language design and development tools:

Language Design: Complexity, Extensibility, and Intention

Reimagining the IDE

Better Tool Support for .NET

Constructors

A constructor in Archetype is a recommended, predefined prototype for instantiating an object correctly.

 

The default parameterless constructor is defined implicitly (it’s defined even if it isn’t written), even if other constructors are defined explicitly. This last part is unlike other languages that hide the parameterless constructor when others are defined.  This will make classes with these default constructors common in Archetype, to more easily support behaviors like serialization and dynamic construction.  When it needs to be hidden, it can be defined with reduced visibility, such as private.

 

A constructor is defined with the name new, consistent with how it’s invoked.

 

Let’s start with a very basic class, and build up to more complicated examples.

 

Customer object

{

FirstName string;

LastName string;

}

 

Despite the lack of an explicit constructor, it’s important for Archetype to define constructs that are useful in their default configurations.  You couldn’t get more basic that the Customer class above.  If we want to define the constructor explicitly, we can do so.

 

Customer object

{

FirstName string;

LastName string;

 

new ()

{

// do nothing

}

}

 

Instantiating a Customer object is easy. With the parameterless constructor, parentheses are optional.

 

var dilbert = new Customer;

 

Archetype, like C#, supports constructor initializers:

 

var dilbert = new Customer

{

FirstName = "Dilbert",

LastName = "Smith"

};

 

When you have few parameters and want to compress this call to a single line, the curly braces end up feeling a little too much (too formal?).

 

var dilbert = new Customer { FirstName = "Dilbert", LastName = "Smith" };

 

Archetype supports passing these assignment statements as final arguments of the constructor parameter list, like this:

 

var dilbert = new Customer(FirstName = "Dilbert", LastName = "Smith");

 

As a result, there isn’t much need to define constructors that only set fields and properties to the value of constructor parameters.  Because Archetype has this mechanism for fluidly initializing objects at construction, the only time constructors really need to be defined is when construction of the object is complicated or unintuitive, in which case a supplied construction pattern is a sure way to make sure it’s done correctly.  Our Customer example doesn’t meet those criteria, but if it did, this is one way we could write it:

 

Customer object

{

FirstName string;

LastName string;

 

new (FirstName string, LastName string)

{

this.FirstName = FirstName;

this.LastName = LastName;

}

}

 

To avoid having to qualify FirstName with the this keyword, many people prefer naming their parameters with the first character lower-cased.  That’s an unfortunate compromise.  When viewing at least the public members of a type, in a sense you’re creating an outward-facing API, and I think Pascal casing more naturally respects English grammar, not downplaying the signficance of the most-important first word in an identifier by lower-casing it to get around some unfortunate syntax limitation.

 

But instead of taking sides in a naming convention war, we can solve the problem in the language and remove the need to make any compromise.

 

new (FirstName string, LastName string)

{

set FirstName, LastName;

}

 

This lets us set individual properties named the same as constructor parameters.  It’s flexible enough to set some and consume other parameters differently, but when you want to set all parameters with matching member names, you can use the shortcut set all.  If that’s all the constructor needs to do, we can do away with the curly braces:

 

new (FirstName string, LastName string) set all;

 

If our Customer class contained a BirthDate property, we could use this constructor and pass in an initializer statement as a final parameter.

 

var dilbert = new Customer("Dilbert", "Smith", BirthDate = DateTime.Parse("7/4/1970");

 

This works with multiple initializers.  Alternatively, we could use an initializer body after the parameter list:

 

var dilbert = new Customer("Dilbert", "Smith")

{

BirthDate = DateTime.Parse("7/4/1970")

};

 

Note how we have two places to supply data to a new object, if needed: the parameter list for simple, short values, and the initializer body for much larger assignments.


Another common construction pattern is for one or more constructors to call another constructor with a default set of properties.  Typically the constructor with the full list of parameters performs the actual work, while the shorter constructors call into the main one, passing in some default values and passing the others through.

 

new (EvaluateFunc sFunc<T>) new(null, null, EvaluateFunc);

 

new (BaseObject object, EvaluateFuncsFunc<T>) new(null, BaseObject, EvaluateFunc);

 

new (Name string, BaseObject object, EvaluateFunc Func<T>)

{

set all;

 

// do all the real work…

// …

}

 

Declarative Archetype: The Initializer Body

 

The initializer body mentioned above has a special structure in Archetype.  Member assignment statements can appear side-by-side with value expressions that are processed by a special function called value.  This can be used, among other things, to add items to a collection.  It’s best to see in an example:

 

var dilbert = new Customer

{

FirstName = "Dilbert",

LastName = "Smith",

BirthDate = DateTime.Parse("7/4/1970"),

 

new SalesOrder(OrderCode = "ORD012940"),

 

new SalesOrder

{

OrderCode = "ORD012941",

 

new SalesOrderLine(ItemCode = "S0139", Quantity = 3),

new SalesOrderLine(ItemCode = "S0142", Quantity = 1)

}

};

 

The first three lines of the initializer set members with assignment statements.  The next expression (new SalesOrder …) in the list creates an object, but there’s no assignment.  It returns a value, but where does it go?  Take a look at the value functions below for the answer:

 

Customer object

{

FirstName string;

LastName string;

Orders SalesOrder* = new;

Invoices Invoice* = new;

 

// formatted inline

value (Order SalesOrder) Orders += Order;

 

// formatted with full code block

value (Invoice Invoice)

{

Invoices += Invoice;

}

}

 

A Customer has several collections of things–Orders and Invoices here–and because there are two value functions in the class, any expressions of type SalesOrder or Invoice will be evaluated and their values passed to the appropriate value function. Expressions of other types will trigger a compile-time error.

 

The += and -= operators haven’t been shown before.  Their use is a very natural fit for stream and list types.  The += operator appends an object to a stream, and -= removes the first occurrence of that object.

 

This simple addition of a value function in types (classes and structs) gives Archetype the ability to represent hierarchical structures in a clean, declarative way.  Sure it’s always been possible to format expressions similarly, but the syntactic trappings of imperative languages have made this difficult and unattractive at best, and in most real-world cases impractical.

 

When I experimented in creating a Future class, I came up with a pattern in C# to nest structures in a tree for large future expressions, but the need to match parentheses gets in the way and consumes too much attention that’s better focused on the logic itself:

 

Future<string> FuturePi = null, FutureOmega = null, FutureConcat = null, FutureParen = null;

 

var result = new Future<string>("bracket",
    () => Bracket(FutureParen),
    (FutureParen =
new Future<string>("parenthesize",
        () => Parenthesize(FutureConcat),
        (FutureConcat =
new Future<String>("concat",
            () => FuturePi +
" < " + FutureOmega,
                (FuturePi =
new Future<string>("pi", () => CalculatePi(10))),
                (FutureOmega =
new Future<string>("omega", () => CalculateOmega()))
            ))
        ))
    );

 

The difference finally occurred to me between the need to set few simple members and the definition of larger, more structured content–including nested structures–that begged for a way to supply them without carrying the end parenthesis down multiple lines or letting them build up into parentheses knots that must be carefully counted.  One gets to fidgeting with where to put them, and sometimes there’s no good answer to that.


Another feature we need to make this declarative notation ability robust is inline variable declaration and assignment.  Notice in the last example how several intermediary structures have variable names defined for them ahead of time, outside the expression. Writing that Future code, I felt it was unfortunate these variables couldn’t be defined inline as part of the expression.  Doing so would allow us to define any kind of structure we might see in XML or JSON, such as this XAML UI code.

 

new Canvas -> LayoutRoot

{

Height = Auto,

Width = Auto,

 

new StackPanel -> sp

{

Orientation = Vertical,

Height = 150,

Width = Auto,

 

Canvas.Top = 10,

Canvas.Left = 20,

 

with Canvas

{

Top = 10,

Left = 20,

},

 

with Canvas { Top = 10, Left = 20 },

 

Loaded += (sender, e)

{

Debug.WriteLine("StackPanel sp.Loaded running");

sp.ResizeTo(0.5 seconds, Auto, 200).Begin();

},

 

LayoutUpdated += HandleLayoutUpdated,

 

new TextBlock

{

FontSize = 18,

Text = "Title"

},

new TextBlock { Text = "Paragraph 1" },

new TextBlock { Text = "Paragraph 2" },

new TextBlock(Text = "Paragraph 3")

}

};

 

A few notes are needed here:

 

·   Wow, this looks a lot like XAML, but much friendlier to developers who have to actually read and edit it!  Yes, good observation.

·  Unlike XAML, every identifier here works with the all-important Rename refactoring, go to definition, find all references to, etc. This is great for reducing the amount of work to find relationships among things and manually update related files.

·  Also unlike XAML, code for event handlers can be defined here. I’m not saying you should cram all of your event handler logic here, but it could come in quite handy at times and I can’t see any reason to disable it. 

·  The with token is a custom operator (see Part 7) that provides access to attached properties through an initializer body. Custom extensions allow you to access these properties with a natural member-access style.

·  It hasn’t been possible to use generic classes in XAML. Specifying UI in Archetype, this would be trivial, and I suspect they could be used to good effect in many ways. Of course, in doing this you’d lose support for the designers in VS and Blend, which would be awfui.

·  Auto is simply an alias for double.NaN.

·  The -> custom operator in these expressions defines a variable and sets it to the value of the new object. The order of execution is:

1. Evaluate constructor parameters, if any are supplied.

2. Assign the object to the variable defined with ->, if supplied.

3. Set any fields or properties with assignment statements.

4. Evaluate value expressions, if supplied, and call the class’s value function with each one, if a value function has been defined.

5. Invoke any matching value function defined in class extensions.

 

By following this design, the example above can be translated into this C# code by the Archetype compiler:

 

var LayoutRoot = new Canvas()

{

Height = double.NaN,

Width = double.NaN

};

 

var sp = new StackPanel()

{

Orientation = Orientation.Vertical,

Height = 150.0,

Width = double.NaN

};

 

LayoutRoot.Children.Add(sp);

 

sp.SetValue(Canvas.TopProperty, 10.0);

sp.SetValue(Canvas.LeftProperty, 20.0);

 

sp.Loaded += (sender, e)

{

Debug.WriteLine("StackPanel sp.Loaded running");

sp.ResizeTo(0.5.seconds(), double.NaN, 200.0).Begin();

},

 

sp.LayoutUpdated += HandleLayoutUpdated;

 

sp.Children.Add(new TextBlock() { FontSize = 18, Text = "Title"});

sp.Children.Add(new TextBlock() { Text = "Paragraph 1"});

sp.Children.Add(new TextBlock() { Text = "Paragraph 2"});

sp.Children.Add(new TextBlock() { Text = "Paragraph 3"});

 

var VisualTree = LayoutRoot;

 

Compare the two approaches. The C# code is a typical example of imperative structure building, while the Archetype code is arguably as declarative as XAML, and with many advantages over XAML for developers.


Going back to the Future example, we could rewrite this in Archetype a few different ways.  I’ll present two.  In the first one, value functions are used to receive the future’s evaluation function as well as any Future objects the expression depends on.


new
Future<string>("bracket") -> result

{

() => Bracket(FutureParen),
new Future<string>("parenthesize") -> FutureParen

{

() => Parenthesize(FutureConcat),
new Future<string>("concat") -> FutureConcat

{

() => FuturePi + " < " + FutureOmega,
new Future<string>("pi") -> FuturePi

{

() => CalculatePi(10)

},

new Future<string>("omega") -> FutureOmega

{

() => CalculateOmega()

}

}

}

}


The shorter approach passes an evaluation delegate in as a parameter.


new
Future<string>(() => Bracket(FutureParen)) -> result

{

new Future<string>(() => Parenthesize(FutureConcat)) -> FutureParen

{

new Future<string>(() => FuturePi + " < " + FutureOmega) -> FutureConcat

{

new Future<string>(() => CalculatePi(10)) -> FuturePi,

new Future<string>(() => CalculateOmega()) -> FutureOmega

}

}

}

 

The name string parameter is missing from the last example.  This was only for use during debugging.  Now what we have is a very direct description of futures that are dependent on other futures in a dependency graph.

Summary

Object construction is a crucial part of an object-oriented language, and Archetype is advanced with its options for constructing arbitrary object graphs and initializing even complicated state in a single expression.  These fluent declarative syntax features are ideal for representing structures such as XAML UI, state machines, dependency graphs, and much more.

XAML is a language.  The question this work has me asking is: do we really need a separate language if our general purpose language supports highly declarative syntax? It’s a provocative question without an easy answer, but it seems clear that many DSLs could emerge within a language that so richly supports composition.

With the ability to define arbitrarily complex structures in code—from declarative object graphs to rich functional expressions—it’s hard to think of a situation that would be too difficult to model and build an API or application around.

Posted in Archetype Language, Data Structures, Design Patterns, Language Innovation, Silverlight, User Interface Design, WPF | 2 Comments »

The Archetype Language (Part 7)

Posted by Dan Vanderboom on September 27, 2010

Overview

This is part of a continuing series of articles about a new .NET language under development called Archetype.  Archetype is a C-style (curly brace) functional, object-oriented (class-based), metaprogramming-capable language with features and syntax borrowed from many languages, as well as some new constructs.  A major design goal is to succinctly and elegantly implement common patterns that normally require a lot of boilerplate code which can be difficult, error-prone, or just plain onerous to write.

You can follow the news and progress on the Archetype compiler on twitter @archetypelang.

Links to the individual articles:

Part 1 – Properties and fields, function syntax, the me keyword

Part 2 – Start function, named and anonymous delegates, delegate duck typing, bindable properties, composite bindings, binding expressions, namespace imports, string concatenation

Part 3 – Exception handling, local variable definition, namespace imports, aliases, iteration (loop, fork-join, while, unless), calling functions and delegates asynchronously, messages

Part 4 – Conditional selection (if), pattern matching, regular expression literals, agents, classes and traits

Part 5 – Type extensions, custom control structures

Part 6 – If expressions, enumerations, nullable types, tuples, streams, list comprehensions, subrange types, type constraint expressions

Part 7 Semantic density, operator overloading, custom operators

Part 8 – Constructors, declarative Archetype: the initializer body

Part 9 – Params & fluent syntax, safe navigation operator, null coalescing operators

Conceptual articles about language design and development tools:

Language Design: Complexity, Extensibility, and Intention

Reimagining the IDE

Better Tool Support for .NET

Semantic Density

As an avid reader growing up, I noticed that my knowledge and understanding of a topic grew more easily the faster I read.  Instead of going through a chapter every day or two, which puts weeks or months between the front and back covers, I devoured 200-300 pages in a night, getting through the largest books in a couple days.  And in reading multiple books on a subject back-to-back, it was easier to find relationships and tie together concepts for things that were still so fresh in my memory.

In my study of linguistics, I learned that legends like Noam Chomsky could learn hundreds of langauges; the previous librarian at the Vatican could read 97.  Bodmer’s excellent book The Loom of Language attempts to teach 10 languages at once, and it seems that the more languages you learn, the easier it is to pick up others.

What these examples have in common is semantic density.  It might seem from what I’ve said that this would be like drinking from a firehose which only the most gifted could endure, but I would argue that intensely-focused learning puts our minds in a highly alert and receptive condition.  In such a state, being able to draw more connections between statements and ideas, we are better able to comprehend the whole in a holistic, intuitive way.

Code Example

Semantic density is important in code, too.  With a pattern like INotifyPropertyChanged, formatted as I have it below, it’s 12 lines of code, 13 if you separate your fields and properties with a blank line, but in the ballpark of a dozen lines of code.  (This is additional explanation for a feature described in part 2 of this series.)

1:   string _Display;
2:   public string Display
3:   {
4:       get { return _Display; }
5:      
set
6:      
{
7:           _Display = value;
8: 
9:           if (PropertyChanged != null)
10:              PropertyChanged(this, new PropertyChangedEventArgs("Display"));
11:      }
12:  }

Does the ability to inform external code of changes seem like it should take a dozen lines to get there?  This can be somewhat compressed by defining a SetProperty method:

1:   string _Display;
2:   public string Display
3:   {
4:       get { return _Display; }
5:       set { SetProperty("Display", ref _Display, value); }
6:   }


This chops the line count in half, bringing it down to six lines–seven if you include a space above or below to separate it from other members.  On my monitor, that means I can see about eight property definitions at a time.  Now that’s usually enough, but I’ve written a few custom controls  that have upwards of 30 properties.  For a new pair of eyes, getting the gist of that class is going to involve a lot of scrolling, never seeing more than a small slice at a time of a much large picture.  The frame of time I mentioned earlier in regard to studying a subject is analogous here to the frame of space.  Seeing 20% of a class at any time lends itself to faster grokking than seeing a 2% sliver at a time.  Our minds, marvelous as they are, do have limits.  Lowering semantic density, such as by spreading meaning over large distances or time spans, makes us work harder to accomplish the same task, trying to put all the pieces together, and the differences are often dramatic.

Just as nouns can be modified by adjectives in natural languages, types in Archetype support user-defined type modifiers.  By defining a new type modifier called bindable to encapsulate the INotifyPropertyChanged pattern, we can collapse the above example into a single line:

1:   Age bindable int;

 

I have no problem stacking these properties right on top of one another.  Although expressed in a highly dense form, it’s actually easier to understand at a glance in these three simple tokens than in the half-dozen lines above, which beg for interpretation to assemble their meaning.  Even if we have 30 of these, they’d all fit on one screen, and the purpose of the class as a whole is quickly gathered.

 

One thing I noticed in developing the animation library Animate.NET is how much code I saved: not having to worry about the details of storyboard creation, key frames, and so on.  It allows you to get right to the point of stating your intention.  Often a library like this is enough, but once in a while language extensibility is a much better approach; and when it is, not having the option can be painful and time consuming.

 

Custom Operators & Operator Overloading

As in most languages, Archetype supports two forms of syntax for operations: functions and operators.  Functions are invoked by including a pair of parentheses after their name that contain any arguments to pass in, whereas operators appear adjacent to or between sub-expressions. 

In C#, some operators are available for overloading.  Archetype supports these operator overloads by using the same names for them.  This allows Archetype to use operators defined in C# and to expose supported operators to C# consumers.

However, Archetype goes one step further and allows you to define custom operators.  There are three basic kinds of custom operator:

  1. unary prefix
  2. unary suffix
  3. binary

If we wanted an easy way to duplicate strings in C#, we might define an extension method called Dup, but in Archetype we also have this option:

// "ABC" dup 3 == "ABCABCABC"

binary dup string (left string, right int)

return string.Repeat(left, right)


The expression parser sees “ABC”, identifies it as a string value, and then looks at the next token.  If the dot operator were found (.), it would look for a member of string or an extension member on string, but because the next token isn’t the dot operator, it looks up the token in the operator table.  An operator called dup is defined with a left string argument and a right int argument, matching the expression.  If the operator were more complicated, it would have a curly-brace code block, but because it’s a single return statement, that’s optional.

Archetype operators aren’t limited to letters, though.  We can also use symbols (but not numbers) in our operator names.  Here’s a “long forward arrow” (compiled with name DashDashGreaterThan) that allows us to write a single function parameter before the function name itself:

// "Hey" –> Console.WriteLine;

binary<T> –> void (left T, right Action<T>)

right(left);


Note that the generic type parameter is attached to the binary keyword.  I arrived at this placement through much experimentation.  Names like –><T> are difficult to read and can be trickier to parse.

There is a special binary operator called adjacent which you can think of as an “invisible operator” capable of inserting an operation between two sub-expressions.  In the following example, two adjacent strings are interpreted as a concatenation of the two.

// "123" "45" == "12345"

binary adjacent string (left string, right string)

return left + right;

With custom operators, what was originally part of the language can now be defined in a library instead.  This greatly simplifies the language.  Just as methods can be shadowed to override them, so too will some ability be needed in Archetype to block or override operators that would otherwise be imported along with a namespace.

The next operator we’ll look at is the unary suffix.  The example consists of units of time: minutes and seconds.

// 12 minutes == TimeSpan.FromMinutes(12)

unary suffix minutes TimeSpan (short, int)

return TimeSpan.FromMinutes((int)value);

 

// 3 seconds == TimeSpan.FromSeconds(3)

unary suffix seconds TimeSpan (short, int)

return TimeSpan.FromSeconds((int)value);


With support for extension properties, we could have also said 12.minutes or 3.seconds, which is already better than C#’s 12.minutes() and 3.seconds(), but by defining these tokens as unary suffix operators, we can eliminate even the dot operator and make it that much more fluent and natural to type (without losing any syntactic precision).  Notice how a list of types is provided instead of a parameter list.  Unary operators by definition have only a single argument, but we often want them to operate on several different types.

Here’s a floating point operator for seconds.

// 2.5 seconds == (2 seconds).Add(0.5 * 1000 milliseconds)

// WholeNumber and Fraction are extension properties on float, double, and decimal

unary suffix seconds TimeSpan (float, double, decimal)

return value.WholeNumber seconds + value.Fraction * 1000 milliseconds;

We can use the adjacent operator on TimeSpans the same that we did for strings above.

// 4 minutes 10 seconds == (4 minutes).Add(10 seconds)

binary adjacent TimeSpan (left TimeSpan, right TimeSpan)

return left.Add(right);

Now let’s combine the use of a few of these operators into a single example.

alias min = minutes;

alias s = seconds;

var later = DateTime.Now + 2 min 15 s;

// Schedule is an extension method on DateTime

later.Schedule

{

// schedule this to run later

}

We can also define our DateTime without assigning its value to a variable.

(DateTime.Now + 10 seconds).Schedule

{

}

 

(DateTime.Now + 10 seconds).Schedule (Repeat=10 seconds)

{

}

Repeat is an optional parameter of Schedule, defaulting to TimeSpan.Zero, meaning “don’t repeat”.

Additional Notes

When more than one operator is valid in a given position, the most specific operator (in terms of its parameter types) is used.  If there’s any ambiguity or overlap remaining, a compiler error is issued.

Unary operators will take precedence over binary operators, but it hasn’t been determined yet what precedence either one will actually have in relation to all of the other operators, or whether this will be specified in the operator definition.

Because of this design for custom operators, I’ve been able to remove things from the language itself and include them as operator definitions in a library.

Summary

This article provided some deeper explanation into previously covered material and introduced the syntax for Archetype’s very powerful custom operator declaration syntax.  The next article will cover some special operators built into Archetype, and property path syntax which is something I came up with a while back to safely reference identifiers that would be impervious to both refactoring and obfuscation.

I’m curious to read your feedback on custom operators in particular, so keep the great comments coming!

Posted in Archetype Language, Composability, Data Structures, Design Patterns, Language Innovation | 1 Comment »

Language Design: Complexity, Extensibility, and Intention

Posted by Dan Vanderboom on June 14, 2010

Introduction

The object-oriented approach to software is great, and that greatness draws from the power of extensibility.  That we can create our own types, our own abstractions, has opened up worlds of possibilities.  System design is largely focused on this element of development: observing and repeating object-oriented patterns, analyzing their qualities, and adding to our mental toolbox the ones that serve us best.  We also focus on collecting libraries and controls because they encapsulate the patterns we need.

This article explores computer languages as a human-machine interface, the purpose and efficacy of languages, complexity of syntactic structure, and the connection between human and computer languages.  The Archetype project is an on-going effort to incorporate these ideas into language design.  In the same way that some furniture is designed ergonomically, Archetype is an attempt to design a powerful programming language with an ergonomic focus; in other words, with the human element always in mind.

Programming Language as Human-Machine Interface

A programming language is the interface between the human mind and executable code.  The point isn’t to turn human programmers into pure mathematical or machine thinkers, but to leverage the talent that people are born with to manipulate abstract symbols in language.  There is an elite class of computer language experts who have trained themselves to think in terms of purely functional approaches, low-level assembly instructions, or regular, monotonous expression structures—and this is necessary for researchers pushing themselves to understand ever more—but for the every day developer, a more practical approach is required.

Archetype is a series of experiments to build the perfect bridge between the human mind and synthetic computation.  As such, it is based as much as possible on a small core of extensible syntax and maintains a uniformity of expression within each facet of syntax that the human mind can easily keep separate.  At the same time, it honors syntactic variety and is being designed to shift us closer to a balance where all of the elements, blocks, clauses and operation types in a language can be extended or modified equally.  These represent the two most important design tenets of Archetype: the intuitive, natural connection to the human mind, and the maximization of its expressive power.

These forces often seem at odds with each other—at first glance seemingly impossible to resolve—and yet experience has shown that the languages we use are limited in ways we’re often surprised by, indicating that processes such as analogical extension are at work in our minds but not fully leveraged by those languages.

Syntactic Complexity & Extensibility

Most of a programming language’s syntax is highly static, and just a few areas (such as types, members, and sometimes operators) can be extended.  Lisp is the most famous example of a highly extensible language with support for macros which allow the developer to manipulate code as if it were data, and to extend the language to encode data in the form of state machines.  The highly regular, parenthesized syntax is very simple to parse and therefore to extend… so long as you don’t deviate from the parenthesized form.  Therefore Lisp gets away with powerful extensibility at the cost of artificially limiting its structural syntax.

In Lisp we write (+ 4 5) to add two numbers, or (foo 1 2) to call a function with two parameters.  Very uniform.  In C we write 4 + 5 because the infix operator is what we grew up seeing in school, and we vary the syntax for calling the function foo(1, 2) to provide visual cues to the viewer’s brain that the function is qualitatively something different from a basic math operation, and that its name is somehow different from its parameters.

Think about syntax features as visual manifestations of the abstract logical concepts that provide the foundation for all algorithmic expression.  A rich set of fundamental operations can be obscured by a monotony of syntax or confused by a poorly chosen syntactic style.  Archetype involves a lot of research in finding the best features across many existing languages, and exploring the limits, benefits, problems, and other details of each feature and syntactic representation of it.

Syntactic complexity provides greater flexibility, and wider channels with which to convey intent.  This is why people color code file folders and add graphic icons to public signage.  More cues enable faster recognition.  It’s possible to push complexity too far, of course, but we often underestimate what our minds are capable of when augmented by a system of external cues which is carefully designed and supported by good tools.

Imagine if your natural spoken language followed such simple and regular rules as Lisp: although everyone would learn to read and write easily, conversation would be monotonous.  Extend this to semantics, for example with a constructed spoken language like Lojban which is logically pure and provably unambiguous, and it becomes obvious that our human minds aren’t well suited to communicating this way.

Now consider a language like C with its 15 levels of operator precedence which were designed to match programmers’ expectations (although the authors admitted to getting some of this “wrong”, which further proves the point).  This language has given rise to very popular derivatives (C++, C#, Java) and are all easily learned, despite their syntactic complexity.

Natural languages and old world cities have grown with civilization organically, creating winding roads and wonderful linguistic variation.  These complicated structures have been etched into our collective unconscious, stirring within us and giving rise to awareness, thought, and creativity.  Although computers are excellent at processing regular, predictable patterns, it’s the complex interplay of external forces and inner voices that we’re most comfortable with.

Risk, Challenge & Opportunity

There are always trade-offs.  By focusing almost all extensibility in one or two small parts of a language, semantic analysis and code improvement optimizations are easier to develop and faster to execute.  Making other syntactical constructs extensible, if one isn’t careful, can create complexity that quickly spirals out of control, resulting in unverifiable, unpredictable and unsafe logic.

The way this is being managed in Archetype so far isn’t to allow any piece of the syntax tree to be modified, but rather to design regions of syntax with extensibility points built-in.  Outputting C# code as an intermediary (for now) lays a lot of burden on the C# compiler to ensure safety.  It’s also possible to mitigate more computationally expensive semantic analysis and code generation by taking advantage of both multicore and cloud-based processing.  What helps keep things in check is that potential extensibility points are being considered in the context of specific code scenarios and desired outcomes, based on over 25 years of real-world experience, not a disconnected sense of language purity or design ideals.

Creating a language that caters to the irregular texture of thought, while supporting a system of extensions that are both useful and safe, is not a trivial undertaking, but at the same time holds the greatest potential.  The more that computers can accommodate people instead of forcing people to make the effort to cater to machines, the better.  At least to the extent that it enables us to specify our designs unambiguously, which is somewhat unnatural for the human mind and will always require some training.

Summary

So much of the code we write is driven by a set of rituals that, while they achieve their purpose, often beg to be abstracted further away.  Even when good object models exist, they often require intricate or tedious participation to apply (see INotifyPropertyChanged).  Having the ability to incorporate the most common and solid of those patterns into language syntax (or extensions which appear to modify the language) is the ultimate mechanism for abstraction, and goes furthest in minimizing development effort.  By obviating the need to write convoluted yet routine boilerplate code, Archetype aims to filter out the noise and bring one’s intent more clearly into focus.

Posted in Archetype Language, Composability, Design Patterns, Language Extensions, Language Innovation, Linguistics, Metaprogramming, Object Oriented Design, Software Architecture | 2 Comments »

Animate.NET: Fluent Animation Library for Silverlight & WPF

Posted by Dan Vanderboom on December 31, 2009

Overview

The basic idea—in Silverlight and WPF—that an animation is just a change in some DependencyProperty over time is simple and powerful.  However, at that level of detail, the API for defining and managing complex animations involves writing a ton of code.  There are code-less animations, of course, such as those created in the Visual State Manager, but when you want to perform really dynamic animations, state-based animations can become impractical or outright impossible.

In response to this, I’ve published my fluent-style code-based animation library for Silverlight and WPF on CodePlex at http://animatedotnet.codeplex.com.  This is an API for making code-based animations intuitive and simple without having to write dozens or even hundreds of lines of code to create and configure storyboards, keyframes, perform repetitive math to calculate alignment, rotation, and other low-level details that distract one from the original purpose of the animation.  In one example, I counted over 120 lines of standard storyboard code, and with the abstractions and fluent API I’ve come up with, reduced that down to half a dozen lines of beautiful, pure intent.  As a result, it’s much more readable and faster to write.

I was initially inspired by Nigel Sampson from his blog article on building a Silverlight animation framework.  The code on his site was a good first step in creating higher-level abstractions, going above DoubleAnimation to define PositionAnimation and RotationAnimation, and I decided to build on top of that, adding other abstractions as well as a fluent-style API in the form of extension methods that hide even those classes.

Concepts

All Animate.NET animations derive from the Animation class which tracks the UI element being modified, the duration of animation, whether it has completed, and fires an event when the animation completes.  It manages building and executing the Storyboard object so you don’t have to.

Subclasses of Animation currently include OpacityAnimation, PositionAnimation, RotationAnimation, SizeAnimation, TransformAnimation, and GroupAnimation.  TransformAnimation is the parent class of RotateAnimation, and in the future ScaleAnimation and TranslateAnimation may also be included.

GroupAnimation is special because it allows you to combine multiple animations.  These groups can be nested and each group can include a wait time before starting (to stagger animations).

The Animate static class includes all of the extension methods that make up the fluent API, and the intention is for this to be the master class for building complex group animations.  Most of these methods come in pairs: you can RotateTo a specific angle or RotateBy relative to your current angle; MoveTo a specific location or MoveBy relative to your current position, etc.

Here’s the list so far:

  • Group and Wait
  • Fade, FadeIn, FadeOut, and CrossFade
  • RotateTo and RotateBy
  • ResizeTo and ResizeBy
  • MoveTo and MoveBy

Examples

Animate.NET can best be understood and appreciated with examples.

Basic Animations

Let’s say you want to resize an element to a new size.  Normally you’d need a storyboard and two DoubleAnimations: one for x and another for y, and for each you’d need to set several properties.  With Animate.NET, you can define and execute your animation beginning with a reference to the element you want to animate:

var rect = new Rectangle()
{
    Height = 250,
    Width = 350,
    Fill = new SolidColorBrush(Colors.Blue)
};
MainStage.Children.Add(rect);
rect.SetPosition(50, 50);

rect.ResizeTo(150, 150, 1.5.seconds()).Begin();


Only a single line of code, the last one, is needed to resize the rect element.

Note the call to Begin.  Without this, the ResizeTo (and all other fluent API calls) will return an object that derives from Animation but will not run.  We can, if needed, obtain a reference to the animation and begin the animation separately, like this:

var anim = rect.ResizeTo(150, 150, 1.5.seconds());
anim.Begin();


This allows us to compose animations into groups and manipulate animations after they’ve started, and is very similar to how LINQ queries are composed and later executed.

You’ll also notice the use of several other extension methods:

  • SetPosition – sets Left and Top currently.  In future versions, you’ll be able to define a registration point for positioning that may be located elsewhere, such as the center of the element.
  • seconds() – along with milliseconds, minutes, etc., allows you to specify a TimeSpan object more fluently.  I saw this in some Ruby code and loved it.  If only the C# team would implement extension properties, it would look even cleaner (eliminate the need for parentheses).
  • Center() and GetCenter() – centers an element immediately, and gets a Point object representing the center of the object respectively.  Not used in these examples, but worth mentioning.

Group Animations

Next I’ll show an example of a group animation using the Animate class’s Group method:

Animate.Group(
    rect.RotateBy(rect.GetCenter(), -90, 1.seconds()),
    rect.FadeOut(1.seconds())
    ).Begin();


This group animation contains two child animations: one to rotate the rectangle 90 degrees counterclockwise, and the other to fade the rectangle out (make it completely transparent).  The method takes a params array, so you can include as many animations as you like.

Because the animations listed are peers in the group, they begin running at the same time.  Often you will want to stagger animations, however.  You can accomplish this with the Wait method, which is the Group method in disguise (it simply includes an additional TimeSpan parameter).

Animate.Group(
    rect.RotateBy(rect.GetCenter(), -90, 1.5.seconds()),
    rect.FadeIn(0.5.seconds()),
    Animate.Wait(1.seconds(),
        rect.FadeOut(0.5.seconds())
        )
    ).Begin();


This animation rotates the rectangle for 1.5 seconds.  During the first 0.5 seconds, it fades in; during the last 0.5 seconds, it fades out.  Only one element, rect, was used in this example, but any number of UI elements can participate.

Animations can be nested and staggered to arbitrary complexity.  Because all animations derive from the Animation class, you can write properties or methods to encapsulate group animations, and assemble them programmatically before executing them.  Because all the ceremony of storyboards and keyframes is abstracted away, it’s very easy to see what’s happening in this code in terms of the end result.

Method Chaining

One of the benefits of a fluent API is the ability to chain together methods that modify a primary object.  For example, the Animation class defines a WhenComplete method that can be used to respond to the completion of an animation.  In the samples project on CodePlex, I create new UI objects at the beginning of each animation, and remove them afterward:

rect.ResizeTo(150, 150, 1.5.seconds())
    .WhenComplete(a =>
    {
        Thread.Sleep(2000);
        MainStage.Children.Remove(rect);
    })
    .Begin();


I pause for a couple seconds after displaying the final result before removing that object from its container.

Extension methods will be used more in the future for this library.  Uses will include modifying the animation to apply easing functions, responding to collision detection (by stopping or reversing), and so on.  This might end up looking something like:

rect.ResizeTo(150, 150, 1.5.seconds())
    .Ease(EasingFunction.Cubic(0.5))
    .StopIf(a => Animate.Collision(GetCollisionObjects()))
    .Begin();
 

Feedback and Future Direction

I’m releasing this as a very early experiment, and I’m interested in your feedback on the library and its API.

What kind of functionality would you like to see added?  Do the method names and syntax feel right?  What major, common animation scenarios have I omitted?  What other kinds of samples would you like to see?

Download the library and samples and give it a try!

Posted in Algorithms, Animation, Composability, Data Structures, Design Patterns, Dynamic Programming, Fluent API, Silverlight, WPF | 5 Comments »

AssignAsync Extension Method for ADO.NET Data Services

Posted by Dan Vanderboom on November 10, 2009

ADO.NET Data Services is a rapidly evolving set of tools that provides data access to remote clients through a set of REST-based services.  The Data Services Client Library for .NET performs the magic of translating your Linq queries to URLs and passing them to the data service back-end, as well as retrieving results and hydrating objects in the client to represent them.

After running into a number of problems with the current CTP of RIA Services (see my article), I decided to fall back on Data Services to provide data access in my newest project.  Data Services has the advantage of allowing you to write fairly normal Linq queries against Entity Framework entity sets, and entity data models can reside in a dedicated data model assembly (instead of requiring them to be part of the web project).

One of the differences that remain when using Data Services in Silverlight—as opposed to accessing an Entity Framework ObjectContext directly—is that Silverlight doesn’t allow asynchronous calls.  So code like this, which would force a synchronous call (with FirstOrDefault), will fail in Silverlight:

var result = (from p in context.Properties
              where p.Required
              select p).FirstOrDefault();

This forces us to adopt some new patterns for data access.  This isn’t a bad thing, however.  And it’s an inevitable transition we’re making to asynchronous, concurrent program logic.

Here’s a typical example of querying data with Data Services in Silverlight:

var RequiredProperties = from p in context.Properties
                         where p.Required
                         select p;

var dsq = RequiredProperties as DataServiceQuery<Node>;
dsq.BeginExecute(ar =>
    {
        var result = dsq.EndExecute(ar);
        // do something with the the result
    }, null);

When using a lambda statement for brevity, the syntax isn’t too bad, but the pattern gets a little more involved when you include error handling logic.  If EndExecute fails, you’ll need the ability to perform some compensating action.

So what I’ve done to keep my client code simple is to define an extension method called AssignAsync that encapsulates this whole pattern.

public static class DataServicesExtensions
{
    public static void AssignAsync<T>(this IEnumerable<T> expression, 
        Action<IEnumerable<T>> Assignment, 
        Action<Exception> Fail)
    {
        var dsq = expression as DataServiceQuery<T>;
        dsq.BeginExecute(ar =>
            {
                IEnumerable<T> result = null;
                try
                {
                    result = dsq.EndExecute(ar) as IEnumerable<T>;
                }
                catch (Exception ex)
                {
                    Fail(ex);
                    return;
                }
                Assignment(result);
            }, null);
    }
}

This enables me to write the following code:

var RequiredProperties = from p in context.Properties
                         where p.Required
                         select p;
RequiredProperties.AssignAsync(result => properties = result, 
    ex => Debug.WriteLine(ex));

In other words: if the query succeeds, assign the result to the properties collection; if it fails, send the exception object to Debug output.  Either action can be used to send signals to other parts of your application that will respond appropriately.  Instead of Debug.WriteLine, you might add the exception object to some collection that triggers an error dialog to appear and your logging framework to record the event.  Instead of assigning the result to a simple collection, you could convert it to an ObservableCollection and assign it to an ItemsControl in WPF or Silverlight.  Anything is possible.

As I explore Data Services further, I will be looking for ways to share query and other model-centric logic between Silverlight and non-Silverlight clients.  I suspect that the same asynchronous patterns can be used in non-Silverlight projects as well, and that those projects will benefit from this query style.

Posted in ADO.NET Data Services, Design Patterns | Leave a Comment »

Problems with RIA Services (Feedback for July 2009 CTP)

Posted by Dan Vanderboom on November 9, 2009

RIA Services (new home page) is a collection of tools and libraries for making Rich Internet Applications, especially line of business applications, easier to develop.  Brad Abrams did a great presentation of RIA Services at MIX 2009 that touches on querying, validation, authentication, and how to share logic between the server and client sides.  Brad also has a huge series of articles (26 as I write this) on using Silverlight and RIA Services to build a realistic application.

I love the concept of RIA Services.  Brad and his team have done a fantastic job of identifying the critical issues for LOB systems and have the right idea to simplify those common data access tasks through the whole pipeline from database to UI controls, using libraries, Visual Studio tooling, or whatever it takes to get the job done.

So before I lay down some heavy criticism of RIA Services, take into consideration that it’s still a CTP and that my scenario pushes the boundaries of what was likely conceived of for this product, at least for such an early stage.

Shared Data Model with WPF & Silverlight Clients

The cause of so much of my grief with RIA Services has been my need to share a data model, and access to a shared database, across WPF as well as Silverlight client applications.  Within the constraints of this situation, I keep running into problem after problem while trying to use RIA Services productively.

The intuitive thing to do is: define a single data model project that compiles to a single assembly, and then reference that in my Silverlight and non-Silverlight projects.  This would be a 100% full-fidelity shared data model.  As long as the code I wrote was a subset of both Silverlight and normal .NET Frameworks (an intersection), we could share identical types and write complex validation and model manipulation logic, all without having to constrain ourselves to work within the limitations of a convoluted code generation scheme.  Back when I wrote Compact Framework applications, I did this with great success despite the platform gap, and I didn’t have anything like RIA Services to help.

Incompatible Assemblies

Part of the problem arises because Silverlight assemblies are incompatible with non-Silverlight assemblies.  A lot of what RIA Services is doing is trying to find a way around this limitation: picking up attributes and code files from one project and inserting that code into the Silverlight project with a build action.  This Visual Studio “magic” has been criticized for its weakness in dealing with multiple-solution systems where Visual Studio can’t update the client because it’s not loaded, and I’ve heard there’s work being done to address this, but for my current needs, this magic aspect of it isn’t a problem.  The specifics of how it works, however, are.

Different Data Access APIs

Accessing entities requires a different API in Silverlight via RiaContextBase versus ObjectContext elsewhere.  Complex logic in the model (for validation and other actions against the model) requires access to other entities and therefore access to the current object context, but the context APIs for Silverlight and WPF are very different.  Part of this has to do with Silverlight’s inability to make synchronous calls to the server.

In significantly large systems that I build, I use validation logic such as “this entity is valid if it’s pointing to an entity of a different type that contains a PropertyX value of Y”.  One of my tables stores a tree of data, so I have methods for loading entire subtrees and ensuring that no circular references exist.  For these kinds of tasks, I need access to the data context in basic validation methods.  When I delete nodes from a tree, I need to delete child nodes, so update logic is part of the model that needs to be the same in every client.  I don’t want to define that multiple times for multiple clients.  I like to program very DRY.  In other words, I find myself in need of a shared model.

RIA Services doesn’t provide anything like type equivalence for a shared model, however.  Data model classes in Silverlight inherit from Entity, but EntityObject in WPF.  In the RIA Services domain context, we RaiseDataMemberChanged, but in a normal EF object context, we need to ReportPropertyChanged.  In WPF, I can call MyEntity.Load(MergeOption.PreserveChanges), but in Silverlight there’s no Load method on the entity and no MergeOption enum.  In WPF I can query against context.SomeEntitySet, but in Silverlight you would query against context.GetSomeEntitySetQuery() and then execute the query with another method call.

This chasm of disparity makes all but the simplest shared model logic impractical and frustrating.  The code generation technique, though good in principle, keeps getting in the way.  For example, I have both parameterless and parameterized constructors in my entity classes.  This works great in my WPF client, but when this code is synchronized to my Silverlight client, I get an error because the Silverlight-side entity class is generated in two parts: in the hidden partial class, a parameterless constructor is generated which calls partial method OnCreated; and in the visible partial class, the constructor method I defined on the server is dumped into another file, so I have duplicate constructors.  If I remove the parameterless constructor from the server side, I get an error because my entity class requires a parameterless constructor (and defining a non-default constructor effectively removes the default one from the resulting type unless it’s explicitly defined).  I thought I could define the partial method OnCreated and put my construction logic in there, but the partial method is only defined on the client side.  That means sharing construction logic consists of copying and pasting the OnCreated method across the various clients—far from an ideal solution.

Entity Data Model Required to be in Web Project

Another strategy I attempted was to define the .edmx file and my partial class extensions in a class library, and then reference that from the web project.  I could define the LinqToEntitiesDomainService<MyDataContext>, but sharing entity class code (by generating code in the Silverlight project) isn’t possible unless the .edmx file and partial class extensions are defined in the web project itself.  This would mean that my WPF client would have to reference a web project for data access, which by itself seems wrong.  (Or making a copy of the data model, which is worse.)  It would be better for the WPF client to talk to the same domain service as the Silverlight client, but RIA Services doesn’t give you an option to link that web project to a non-Silverlight project, so again I ran into a brick wall.

So Don’t Do That

The kind of advice I’m getting for this is, “so don’t do that”.  In other words, don’t write complex validation logic in the model or otherwise try to access the data context; don’t write parameterized constructors; don’t aim for 100% type fidelity across all endpoints of a system; don’t try to share data models with Silverlight and non-Silverlight projects, etc.  But I see the potential for RIA Services, so I have to push for these things unless I hear really convincing arguments against them (or compelling alternatives).

Conclusion

The fact that there are different data contexts and data item definitions within those contexts imposes a burden on the developer to use different techniques for each environment, and creates challenges for centralizing data model logic and reusing equivalent logic across different kinds of clients.  My gut feeling is that RIA Services in its current form has some fundamental design flaws that will need to be addressed, taking into consideration systems with a mix of Silverlight, WPF, and other clients, before it becomes a truly robust data access platform.

Posted in Data Structures, Design Patterns, Distributed Architecture, LINQ, RIA Services, Software Architecture | 3 Comments »

Filtering with Metadata in the Managed Extensibility Framework

Posted by Dan Vanderboom on September 19, 2009

The Managed Extensibility Framework (MEF) is the new extensibility framework from Microsoft.  Pioneered by Glenn Block in the patterns & practices group, and leveraged by the behemoth Visual Studio 2010, it has a striking resemblance to my own Inversion of Control (IoC) and Dependency Injection (DI) framework—which led to me to have a couple great conversations about IoC with Glenn at Tech Ed 2008 and then again at PDC 2008.

But MEF isn’t really written to be your IoC.  Instead, the IoC engine and DI aspects are implementation details, allowing you to do really no more than “MEF things together”.  The core concept of MEF is to provide very simple and powerful application composability.  Not in the user interface composition sense—for that, see Prism for WPF and Silverlight (explained in MSDN Magazine, September 2008)—but for virtually all other dynamic component assembly needs, MEF is your best friend.

The two things I like most about MEF is its simplicity as its lack of presumption on how it will be used.  Compose collections of strings, single method delegates, or implementations of complex services.  All you’re doing is importing and exporting things, with little code required to wire things up.

MEF is currently in its seventh preview release, so expect beta-like quality.  My own experience with it has been very positive, but there are a number of shortcomings in the API.  This article is about a few of them and what can be done to add some much-needed functionality.

System.AddIn vs. MEF

There’s been some confusion with Microsoft coopetition among products with similar aims, and extensibility and composition are no exception.  The AddIn API (team blog) serves a similar purpose as MEF.  (See this two-part MSDN article on System.AddIn: first and second.)  The primary differentiator, from my understanding, is that the AddIn API is a bit more robust and a lot more complicated, and supports such things as isolating extensions in separate AppDomains.

With Visual Studio siding with MEF, it’s personally hard for me to imagine using the AddIn API.  If MEF is flexible and robust enough for Visual Studio, is it really likely to fall short for my own much smaller software systems?  Krzysztof Cwalina suggests they are complementary approaches, but I find that hard to swallow.  Why would I want to use two different extensibility frameworks instead of one coherent API?  If anything, I imagine that the lessons learned from the AddIn API will eventually migrate to MEF.

Daniel Moth notes that with the AddIn API, “there are many design decisions to make and quite a few subtleties in implementing those decisions in particular when it comes to discovering addins, version resiliency, isolation from the host etc.”  A customer of mine using the AddIn API was using a Visual Studio plug-in to manage pipelines, and things were a real mess.  There were a bunch of assemblies, a lot of generated code, and not much clarity or confidence that it was all really necessary.

MEF: Import & ImportMany

In MEF, the Import attribute allows you to inject a value that is exported somewhere else using the Export attribute—typically from another assembly.  There is also an ImportMany attribute which is useful when you expect several exports that use the same contract.  By defining an IEnumerable<T> field or property and decorating it with the ImportMany attribute, all matching exports will be added to an enumerable type.

[ImportMany]
public IEnumerable<IVehicle> Vehicles;

What if you want to filter the exported vehicle types by some kind of metadata, though?  Let’s take a look at the IVehicle contract and some concrete classes that implement the contract.

public interface IVehicle { }

[Export(typeof(IVehicle))]
[ExportMetadata("Speed", "Slow")]
public class ToyotaPrius : IVehicle
{
    public ToyotaPrius() { }
}

[Export(typeof(IVehicle))]
[ExportMetadata("Speed", "Fast")]
public class LamborghiniDiablo : IVehicle
{
    public LamborghiniDiablo() { }
}

The object model isn’t very interesting, but that’s not the point.  What is interesting is that MEF allows us to supply metadata corresponding to our exports.  In this case, my contrived example has defined a metadata variable of “Speed”, with two possible values: “Fast” and “Slow”.  The variable name must be a string, but its value can be any value; that is, any value that’s supported from within an attribute, which means string literals and constants, type objects, and the like.

Filtering Imports on Metadata

What if you want to ImportMany for all exports that have a particular metadata value?  Unfortunately, there are no such options in the ImportMany attribute class.

In my scenario, I’ve defined a static factory class called VehicleFactory, which at some imaginary point in the future will be responsible for building a city full of trafic.

public static class TrafficFactory
{
    // type initialization fails without a static constructor
    static TrafficFactory() { }

    public static IEnumerable<IVehicle> SlowVehicles =
        App.Container.GetExportedValues<IVehicle>(metadata => metadata.ContainsKeyWithValue("Speed", "Slow"));

    public static IEnumerable<IVehicle> FastVehicles =
        App.Container.GetExportedValues<IVehicle>(metadata => metadata.ContainsKeyWithValue("Speed", "Fast"));

    public static IDictionary<object, IVehicle> AllVehicles =
        App.Container.GetKeyedExportedValues<IVehicle>("Speed");
}

This is what I want to do, but there is no overload of GetExportedValues that supplies a metadata-dependent predicate function.  Adding one is easy, though.  While we’re at it, we’ll also add the ContainsKeyWithValue which I borrow from The Code Junky article also on MEF container filtering.

public static class IDictionaryExtensions
{
    public static bool ContainsKeyWithValue<KeyType, KeyValue>(
        this IDictionary<KeyType, ValueType> Dictionary,
        KeyType Key, ValueType Value)
    {
        return (Dictionary.ContainsKey(Key) && Dictionary[Key].Equals(Value));
    }
}

public static class MEFExtensions
{
    public static IEnumerable<T> GetExportedValues<T>(this CompositionContainer Container,
        Func<IDictionary<string, object>, bool> Predicate)
    {
        var result = new List<T>();

        foreach (var PartDef in Container.Catalog.Parts)
        {
            foreach (var ExportDef in PartDef.ExportDefinitions)
            {
                if (ExportDef.ContractName == typeof(T).FullName)
                {
                    if (Predicate(ExportDef.Metadata))
                        result.Add((T)PartDef.CreatePart().GetExportedValue(ExportDef));
                }
            }
        }

        return result;
    }
}

Now we can test this logic by wiring up MEF and then accessing the two filtered collections of cars, which will each contain a single IVehicle instance.

class App
{
    [Export]
    public CompositionContainer Container;

    static void Main(string[] args)
    {
        AssemblyCatalog catalog = new AssemblyCatalog(Assembly.GetExecutingAssembly());
        Container = new CompositionContainer(catalog);
        Container.ComposeParts();

        var FastCars = TrafficFactory.FastVehicles;
        var SlowCars = TrafficFactory.SlowVehicles;
    }
}

Viola!  We have metadata-based filtering.

You’ll also noticed that I added an Export attribute to the Container itself.  By doing this, you can Import the container into any module that gets dynamically loaded.  It’s not used in this article, but getting to the container from a module is otherwise impossible without some kind of work-around.  (Thanks for pointing out the problem, Damon.)

Using Metadata to Assign Dictionary Keys

Let’s take this one step further.  Let’s say you want to import many instances of MEF exported values into a Dictionary, using one of the metadata properties as the key.  This is how I’d like it to work:

public static IDictionary<object, IVehicle> AllVehicles =
    App.Container.GetKeyedExportedValues<IVehicle>("Speed");

Again, the current MEF Preview doesn’t support this, but another extension method is all we need.  We’ll add two, so that one version gives us all exported values and the other allows us to filter that selection based on other metadata.

public static IDictionary<object, T> GetKeyedExportedValues<T>(this CompositionContainer Container,
    string MetadataKey, Func<IDictionary<string, object>, bool> Predicate)
{
    var result = new Dictionary<object, T>();

    foreach (var PartDef in Container.Catalog.Parts)
    {
        foreach (var ExportDef in PartDef.ExportDefinitions)
        {
            if (ExportDef.ContractName == typeof(T).FullName)
            {
                if (Predicate(ExportDef.Metadata))
                    result.Add(ExportDef.Metadata[MetadataKey], 
                        (T)PartDef.CreatePart().GetExportedValue(ExportDef));
            }
        }
    }

    return result;
}

public static IDictionary<object, T> GetKeyedExportedValues<T>(this CompositionContainer Container,
    string MetadataKey)
{
    return GetKeyedExportedValues<T>(Container, MetadataKey, metadata => true);
}

Add an assignment to TrafficFactory.AllVehicles in the App.Main method and see for yourself that it works.

If you’re using metadata values as Dictionary keys, it’s probably important for you not to mess them up.  I recommend using enum values for both metadata property names as well as valid values if it’s possible to enumerate them, and string const values otherwise.

Now go forth and start using MEF!

Posted in Algorithms, Component Based Engineering, Composability, Design Patterns, Visual Studio Extensibility | Tagged: , , , , | 4 Comments »

Better Tool Support for .NET

Posted by Dan Vanderboom on September 7, 2009

Productivity Enhancing Tools

Visual Studio has come a long way since its debut in 2002.  With the imminent release of 2010, we’ll see a desperately-needed overhauling of the archaic COM extensibility mechanisms (to support the Managed Package Framework, as well as MEF, the Managed Extensibility Framework) and a redesign of the user interface in WPF that I’ve been pushing for and predicted as inevitable quite some time ago.

For many alpha geeks, the Visual Studio environment has been extended with excellent third-party, productivity-enhancing tools such as CodeRush and Resharper.  I personally feel that the Visual Studio IDE team has been slacking in this area, providing only very weak support for refactorings, code navigation, and better Intellisense.  While I understand their desire to avoid stepping on partners’ toes, this is one area I think makes sense for them to be deeply invested in.  In fact, I think a new charter for a Developer Productivity Team is warranted (or an expansion of their team if it already exists).

It’s unfortunately a minority of .NET developers who know about and use these third-party tools, and the .NET community as a whole would without a doubt be significantly more productive if these tools were installed in the IDE from day one.  It would also help to overcome resistance from development departments in larger organizations that are wary of third-party plug-ins, due perhaps to the unstable nature of many of them.  Microsoft should consider purchasing one or both of them, or paying a licensing fee to include them in every copy of Visual Studio.  Doing so, in my opinion, would make them heroes in the eyes of the overwhelming majority of .NET developers around the world.

It’s not that I mind paying a few hundred dollars for these tools.  Far from it!  The tools pay for themselves very quickly in time saved.  The point is to make them ubiquitous: to make high-productivity coding a standard of .NET development instead of a nice add-on that is only sometimes accepted.

Consider just from the perspective of watching speakers at conferences coding up samples.  How many of them don’t use such a tool in their demonstration simply because they don’t want to confuse their audience with an unfamiliar development interface?  How many more demonstrations could they be completing in the limited time they have available if they felt more comfortable using these tools in front of the masses?  You know you pay good money to attend these conferences.  Wouldn’t you like to cover significantly more ground while you’re there?  This is only likely to happen when the tool’s delivery vehicle is Visual Studio itself.  Damon Payne makes a similar case for the inclusion of the Managed Extensibility Framework in .NET Framework 4.0: build it into the core and people will accept it.

The Gorillas in the Room

CodeRush and Resharper have both received recent mention in the Hanselminutes podcast (episode 196 with Mark Miller) and in the Deep Fried Bytes podcast (episode 35 with Corey Haines).  If you haven’t heard of CodeRush, I recommend watching these videos on their use.

For secondary information on CodeRush, DXCore, and the principles with which they were designed, I recommend these episodes of DotNetRocks:

I don’t mean to be so biased toward CodeRush, but this is the tool I’m personally familiar with, has a broader range of functionality, and it seems to get the majority of press coverage.  However, those who do talk about Resharper do speak highly of it, so I recommend you check out both of them to see which one works best for you.  But above all: go check them out!

Refactor – Rename

Refactoring code is something we should all be doing constantly to avoid the accumulation of technical debt as software projects and the requirements on which they are based evolve.  There are many refactorings in Visual Studio for C#, and many more in third-party tools for several languages, but I’m going to focus here on what I consider to be the most important refactoring of them all: Rename.

Why is Rename so important?  Because it’s so commonly used, and it has such far-reaching effects.  It is frequently the case that we give poor names to identifiers before we clearly understand their role in the “finished” system, and even more frequent that an item’s role changes as the software evolves.  Failure to rename items to accurately reflect their current purpose is a recipe for code rot and greater code maintenance costs, developer confusion, and therefore buggy logic (with its associated support costs).

When I rename an identifier with a refactoring tool, all of the references to that identifier are also updated.  There might be hundreds of references.  In the days before refactoring tools, one would accomplish this with Find-and-Replace, but this is dangerous.  Even with options like “match case” and “match whole word”, it’s easy to rename the wrong identifiers, rename pieces of string literals, and so on; and if you forget to set these options, it’s worse.  You can go through each change individually, but that can take a very long time with hundreds of potential updates and is a far cry from a truly intelligent update.

Ultimately, the intelligence of the Rename refactoring provides safety and confidence for making far-reaching changes, encouraging more aggressive refactoring practices on a more regular basis.

Abolishing Magic Strings

I am intensely passionate about any tool or coding practice that encourages refactoring and better code hygiene.  One example of such a coding practice is the use of lambda expressions to select identifiers instead of using evil “magical strings”.  From my article on dynamically sorting Linq queries, the use of “magic strings” would force me to write something like this to dynamically sort a Linq query:

Customers = Customers.Order("LastName").Order("FirstName", SortDirection.Descending);

The problem here is that “LastName” and “FirstName” are oblivious to the Rename refactoring.  Using the refactoring tool might give me a false sense of security in thinking that all of my references to those two fields have been renamed, leading me to The Pit of Despair.  Instead, I can define a function and use it like the following:

public static IOrderedEnumerable<T> Order<T>(this IEnumerable<T> Source, 
    Expression<Func<T, object>> Selector, SortDirection SortDirection)
{
    return Order(Source, (Selector.Body as MemberExpression).Member.Name, SortDirection);
}

Customers = Customers.Order(c => c.LastName).Order(c => c.FirstName, SortDirection.Descending);

This requires a little understanding of the structure of expressions to implement, but the benefit is huge: I can now use the refactoring tool with much greater confidence that I’m not introducing subtle reference bugs into my code.  For such a simple example, the benefit is dubious, but multiply this by hundreds or thousands of magic string references, and the effort involved in refactoring quickly becomes overwhelming.

Coding in this style is most valuable when it’s a solution-wide convention.  So long as you have code that strays from this design philosophy, you’ll find yourself grumbling and reaching for the inefficient and inelegant Find-and-Replace tool.  The only time it really becomes an issue, then, is when accessing libraries that you have no control over, such as the Linq-to-Entities and the Entity Framework, which makes extensive use of magic strings.  In the case of EF, this is mitigated somewhat by your ability to regenerate the code it uses.  In other libraries, it may be possible to write extension methods like the Order method shown above.

It’s my earnest hope that library and framework authors such as the .NET Framework team will seriously consider alternatives to, and an abolition of, “magic strings” and other coding practices that frustrate otherwise-powerful refactoring tools.

Refactoring Across Languages

A tool is only as valuable as it is practical.  The Rename refactoring is more valuable when coding practices don’t frustrate it, as explained above.  Another barrier to the practical use of this tool is the prevalence of multiple languages within and across projects in a Visual Studio solution.  The definition of a project as a single-language container is dubious when you consider that a C# or VB.NET project may also contain HTML, ASP.NET, XAML, or configuration XML markup.  These are all languages with their own parsers and other language services.

So what happens when identifiers are shared across languages and a Rename refactoring is executed?  It depends on the languages involved, unfortunately.

When refactoring a C# class in Visual Studio, the XAML’s x:Class value is also updated.  What we’re seeing here is cross-language refactoring, but unfortunately it only works in one direction.  There is no refactor command to update the x:Class value from the XAML editor, so manually changing it causes my C# class to become sadly out of sync.  Furthermore, this seems to be XAML specific.  If I refactor the name of an .aspx.cs class, the Inherits attribute of the Page directive in the .aspx file doesn’t update.

How frequent do you think it is that someone would want to change a code-behind file for an ASP.NET page, and yet would not want to change the Inherits attribute?  Probably not very common (okay, probably NEVER).  This is a matter of having sensible defaults.  When you change an identifier name in this way, the development environment does not respond in a sensible way by default, forcing the developer to do extra work and waste time.  This is a failure in UI design for the same reason that Intellisense has been such a resounding success: Intellisense anticipates our needs and works with us; the failure to keep identifiers in sync by default is diametrically opposed to this intelligence.  This represents a fragmented and inconsistent design for an IDE to possess, thus my hope that it will be addressed in the near future.

The problem should be recognized as systemic, however, and addressed in a generalized way.  Making individual improvements in the relationships between pairs of languages has been almost adequate, but I think it would behoove us to take a step back and take a look at the future family of languages supported by the IDE, and the circumstances that will quickly be upon us with Microsoft’s Oslo platform, which enables developers to more easily build tool-supported languages (especially DSLs, Domain Specific Languages). 

Even without Oslo, we have seen a proliferation of languages: IronRuby, IronPython, F#, and the list goes on.  A refactoring tool that is hard-coded for specific languages will be unable to keep pace with the growing family of .NET and markup languages, and certainly unable to deal with the demands of every DSL that emerges in the next few years.  If instead we had a way to identify our code identifiers to the refactoring tool, and indicate how they should be bound to identifiers in other languages in other files, or even other projects or solutions, the tools would be able to make some intelligent decisions without understanding each language ahead of time.  Each language’s language service could supply this information.  For more information on Microsoft Oslo and its relationship to a world of many languages, see my article on Why Oslo Is Important.

Without this cross-language identifier binding feature, we’ll remain in refactoring hell.  I offered a feature suggestion to the Oslo team regarding this multi-master synchronization of a model across languages that was rejected, much to my dismay.  I’m not sure if the Oslo team is the right group to address this, or if it’s more appropriate for the Visual Studio IDE team, so I’m not willing to give up on this yet.

A Default of Refactor-Rename

The next idea I’d like to propose here is that the Rename refactoring is, in fact, a sensible default behavior.  In other words, when I edit an identifier in my code, I more often than not want all of the references to that identifier to change as well.  This is based on my experience in invoking the refactoring explicitly countless times, compared to the relatively few times I want to “break away” that identifier from all the code that references.

Think about it: if you have 150 references to variable Foo, and you change Foo to FooBar, you’re going to have 150 broken references.  Are you going to create a new Foo variable to replace them?  That workflow doesn’t make any sense.  Why not just start editing the identifier and have the references update themselves implicitly?  If you want to be aware of the change, it would be trivial for the IDE to indicate the number of references that were updated behind the scenes.  Then, if for some reason you really did want to break the references, you could explicitly launch a refactoring tool to “break references”, allowing you to edit that identifier definition separately.

The challenge that comes to mind with this default behavior concerns code that spans across solutions that aren’t loaded into the IDE at the same time.  In principle, this could be dealt with by logging the refactoring somewhere accessible to all solutions involved, in a location they can all access and which gets checked into source control.  The next time the other solutions are loaded, the log is loaded and the identifiers are renamed as specified.

Language Property Paths

If you’ve done much development with Silverlight or WPF, you’ve probably run into the PropertyPath class when using data binding or animation.  PropertyPath objects represent a traversal path to a property such as “Company.CompanyName.Text”.  The travesty is that they’re always “magic strings”.

My argument is that the property path is such an important construct that it deserves to be an core part of language syntax instead of just a type in some UI-platform-specific library.  I created a data binding library for Windows Forms for which I created my own property path syntax and type, and there are countless non-UI scenarios in which this construct would also be incredibly useful.

The advantage of having a language like C# understand property path syntax is that you avoid a whole class of problems that developers have used “magic strings” to solve.  The compiler can then make intelligent decisions about the correctness of paths, and errors can be identified very early in the cycle.

Imagine being able to pass property paths to methods or return then from functions as first-class citizens.  Instead of writing this:

Binding NameTextBinding = new Binding("Name") { Source = customer1; }

… we could write something like this, have access to the Rename refactoring, and even get Intellisense support when hitting the dot (.) operator:

Binding NameTextBinding = new Binding(@Customer.Name) { Source = customer1; }

In this code example, I use the fictitious @ operator to inform the compiler that I’m specifying a property path and not trying to reference a static property called Name on the Customer class.

With property paths in the language, we could solve our dynamic Linq sort problem cleanly, without using lambda expressions to hack around the problem:

Customers = Customers.Order(@Customer.LastName).Order(@Customer.FirstName, SortDirection.Descending);

That looks and feels right to me.  How about you?

Summary

There are many factors of developer productivity, and I’ve established refactoring as one of them.  In this article I discussed tooling and coding practices that support or frustrate refactoring.  We took a deep look into the most important refactoring we have at our disposal, Rename, and examined how to get the greatest value out of it in terms of personal habits, as well as long-term tooling vision and language innovation.  I proposed including property paths in language syntax due to its general usefulness and its ability to solve a whole class of problems that have traditionally been solved using problematic “magic strings”.

It gives me hope to see the growing popularity of Fluent Interfaces and the use of lambda expressions to provide coding conventions that can be verified by the compiler, and a growing community of bloggers (such as here and here) writing about the abolition of “magic strings” in their code.  We can only hope that Microsoft program managers, architects, and developers on the Visual Studio and .NET Framework teams are listening.

Posted in Data Binding, Data Structures, Design Patterns, Development Environment, Dynamic Programming, Functional Programming, Language Innovation, LINQ, Oslo, Silverlight, Software Architecture, User Interface Design, Visual Studio, Visual Studio Extensibility, Windows Forms | Leave a Comment »

Strongly-Typed, Dynamic Linq Order Operator

Posted by Dan Vanderboom on August 20, 2009

A Community Solution

I love social technologies like Stack Overflow, where people can collaborate loosely to share knowledge and help get things done.  Stack Overflow does on a large scale what developer blogs like mine have been doing on a smaller scale: creating a community around the sharing of ideas and methods.

Every once in a while, I get some great feedback that includes a fix for one of my bugs, a performance tweak I didn’t realize was possible, or an extension to some library I left unfinished.

This morning I ran into a question about my dynamic Linq sort, solved and answered by “Ch00k”, allowing one to get compile-time checking of identifier names.  Well done!

(It’s too bad Stack Overflow doesn’t promote the use of real names for professional developers to better market themselves with skill and reputation.)

My original article (with source code) is here.  All I added to the library was this:

public static IOrderedEnumerable<T> Order<T>(this IEnumerable<T> Source, 
    Expression<Func<T, object>> Selector, SortDirection SortDirection)
{
    return Order(Source, (Selector.Body as MemberExpression).Member.Name, SortDirection);
}

To test it, I used this code:

IEnumerable<Customer> Customers = new Customer[] { new Customer("Dan", "Vanderboom"), new Customer("Steve", "Vanderboom"), 
    new Customer("Tracey", "Vanderboom"), new Customer("Sarah", "Barkelew") };

Customers = Customers.Order(c => c.LastName, SortDirection.Ascending);
Customers = Customers.Order(c => c.FirstName, SortDirection.Descending);

foreach (var cust in Customers)
{
    Console.WriteLine(cust.FirstName + " " + cust.LastName);
}

Now I can refactor these data model classes with a tool and all my dynamic sorting queries will stay in sync!

Posted in Collaboration, Design Patterns, Dynamic Programming, Language Extensions, LINQ, Object Oriented Design, Open Source, Social Networking | Tagged: , , , | 3 Comments »