Archive for October, 2010

The Archetype Language (Part 9)

Posted by Dan Vanderboom on October 3, 2010

Overview

This is part of a continuing series of articles about a new .NET language under development called Archetype. Archetype is a C-style (curly brace) functional, object-oriented (class-based), metaprogramming-capable language with features and syntax borrowed from many languages, as well as some new constructs. A major design goal is to succinctly and elegantly implement common patterns that normally require a lot of boilerplate code which can be difficult, error-prone, or just plain onerous to write.

You can follow the news and progress on the Archetype compiler on twitter @archetypelang.

Links to the individual articles:

Part 1 – Properties and fields, function syntax, the me keyword

Part 2 – Start function, named and anonymous delegates, delegate duck typing, bindable properties, composite bindings, binding expressions, namespace imports, string concatenation

Part 3 – Exception handling, local variable definition, namespace imports, aliases, iteration (loop, fork-join, while, unless), calling functions and delegates asynchronously, messages

Part 4 – Conditional selection (if), pattern matching, regular expression literals, agents, classes and traits

Part 5 – Type extensions, custom control structures

Part 6 – If expressions, enumerations, nullable types, tuples, streams, list comprehensions, subrange types, type constraint expressions

Part 7 – Semantic density, operator overloading, custom operators

Part 8 – Constructors, declarative Archetype: the initializer body

Part 9 – Params & fluent syntax, safe navigation operator, null coalescing operators

Conceptual articles about language design and development tools:

Language Design: Complexity, Extensibility, and Intention

Reimagining the IDE

Better Tool Support for .NET

Params & Fluent Syntax

C# has a parameter modifier called params that allows you to supply additional function arguments to populate a single array parameter.

void Display(params string[] Names)
{
// …
}

Without the params modifier, we’d have to call it like this:

Display(new string[] { "Dan", "Josa", "Sarah" });

Because params is declared, we can do this instead:

Display("Dan", "Josa", "Sarah");

If there’s one thing you can take away from Archetype’s design, it’s that syntactic sugar is everything. After examining my own procedural animation library (Animate.NET) to see how it could be used best in Archetype, I came to the conclusion that these params parameters can be substantial. When they are, they create syntactic unpleasantries, especially when nested structures are involved.

Consider the following C# example.

var anim =
    Animate.Wait(0.2.seconds(),
        RedChip.MoveBy(0.4.seconds(), -40, 0),
        RedChip.FadeIn(0.2.seconds()),
        BlackChip.MoveBy(0.4.seconds(), 0, 40),
        BlackChip.FadeOut(0.4.seconds())
    )
    .WhenComplete(a =>
    {
        MainStage.Children.Remove(RedChip);
        MainStage.Children.Remove(BlackChip);
    })
    .Begin();

First, a quick explanation of the code. Animate is a static class, and the Wait function returns an object called GroupAnimation that inherits from Animation. After a 0.2 second wait, the following params list of Animation objects will execute. RedChip and BlackChip are FrameworkElements (Silverlight/WPF objects), and animation commands such as MoveBy and FadeOut are extension methods on FrameworkElement. Each of these animation commands returns an Animation-derived object. The seconds() extension method on int and floating-point types converts a number to a TimeSpan object.
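The seconds() helper described above can be approximated in a few lines of C#. This is a minimal sketch, not the Animate.NET source: the class name TimeSpanExtensions is an assumption, and only the int and double overloads are shown.

```csharp
using System;

// Hypothetical stand-ins for the seconds() extension methods described above.
public static class TimeSpanExtensions
{
    // Lets an int literal read as a duration: 2.seconds()
    public static TimeSpan seconds(this int value)
    {
        return TimeSpan.FromSeconds(value);
    }

    // Lets a floating-point literal read as a duration: 0.5.seconds()
    public static TimeSpan seconds(this double value)
    {
        return TimeSpan.FromSeconds(value);
    }
}

public static class SecondsDemo
{
    public static void Main()
    {
        Console.WriteLine(2.seconds().TotalSeconds);
        Console.WriteLine(0.5.seconds().TotalMilliseconds);
    }
}
```

The lower-case method name breaks the usual .NET naming convention, but it mirrors the calls in the example and reads more like a unit suffix.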

The ultimate goal of this first Wait section of code is to define a set of animations—nested sets are possible, which form a tree of animations. These trees can get more complicated than this, but we’ll keep the example simple for now.

Now for the criticism. Look at the matching parentheses of the Wait function. The normal TimeSpan parameter is listed as an equal along with the Animation parameter list, and what is being used as a complex, nested structure is holding up the closing parenthesis and dragging it down to the end of the entire list. If only there were a cleaner way of treating this nested structure like constructor initializers (see Part 8). These correspond, in terms of visual layout, to the attributes and the child elements of an XML node.

What else is wrong with this picture? The .WhenComplete and .Begin functions are being invoked on the result of the previous expression. It’s characteristic of fluent-style APIs to define functions (or extension methods) that operate on the result of the previous operation so they can be strung together into sentence-like patterns. The dot before both WhenComplete and Begin looks odd when appearing on a line by itself, and the lambda expression would be better promoted to a proper code block.

Finally, it’s unfortunate that in declaring a new local variable, we have to indent the whole animation block this way. Here’s what the same code looks like in Archetype:

Animate.Wait (0.2 seconds) -> anim
{
    RedChip.MoveBy(0.4 seconds, -40, 0),
    RedChip.FadeIn(0.2 seconds),
    BlackChip.MoveBy(0.4 seconds, 0, 40),
    BlackChip.FadeOut(0.4 seconds)
}
WhenComplete (a)
{
    MainStage.Children.Remove(RedChip),
    MainStage.Children.Remove(BlackChip)
}
Begin();

This is more like it. Notice the declarative assignment (declaration + assignment) with -> anim on the first line, and the way the parentheses can be closed after the TimeSpan object (see Part 7 on custom operators for an explanation of the syntax “0.2 seconds”). There’s no more need to indent the whole structure to make it line up nicely in an assignment. The following initializer code block (in curly braces) supplies Animation object values to the params parameter in the Wait function, and the WhenComplete and Begin functions don’t require a leading dot to operate on the previous expression (Intellisense would reflect these options).

The Archetype code is much cleaner. It’s easier to see where groups of constructs begin and end, enabling fluent-style APIs with arbitrarily-complicated nested structures to be easily constructed. Let’s take a look at one more example with a more deeply nested structure:

Animate.Group -> anim
{
    RedChip.MoveBy(0.4 seconds, -40, 0),
    RedChip.FadeIn(0.2 seconds),
    BlackChip.MoveBy(0.4 seconds, 0, 40),
    BlackChip.FadeOut(0.4 seconds),
    Animate.Wait (0.4 seconds)
    {
        Animate.CrossFade(1.5 seconds, RedChip, BlackChip),
        BlackChip.MoveTo(0.2 seconds, 20, 150)
    }
};

Here, a GroupAnimation is defined that contains, as one of its child Animations, another GroupAnimation (created with the Wait function). The animation isn’t started in this case, so anim.Begin() can be called later, or anim could be composed into a larger animation somewhere. A peek at the function headers for Group and Wait functions should make the ease and power of this design clear.

static Animate object
{
    // a stream is the only parameter
    Group GroupAnimation (Animations Animation* params)
    {
    }

    // a stream is the last parameter, so [ list ] syntax can still be used
    Wait GroupAnimation (WaitTime TimeSpan, Animations Animation* params)
    {
    }
}

Because the class is static, individual members are assumed to be static as well.

The easiest way to support this would be to allow this initializer block to be used with a params parameter that’s declared last.
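For comparison, here is roughly what the same two function headers look like in C#. This is a sketch under stated assumptions: Animation and GroupAnimation are simplified placeholder classes (the real Animate.NET types are not shown in this article), and the Name, WaitTime and Children members are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified types standing in for the Animate.NET classes.
public class Animation
{
    public string Name { get; set; }
}

public class GroupAnimation : Animation
{
    public TimeSpan WaitTime { get; set; }
    public List<Animation> Children { get; } = new List<Animation>();
}

public static class Animate
{
    // params is the only parameter
    public static GroupAnimation Group(params Animation[] animations)
    {
        return Wait(TimeSpan.Zero, animations);
    }

    // C# requires the params parameter to come last
    public static GroupAnimation Wait(TimeSpan waitTime, params Animation[] animations)
    {
        var group = new GroupAnimation { WaitTime = waitTime };
        group.Children.AddRange(animations);
        return group;
    }
}

public static class AnimateDemo
{
    public static void Main()
    {
        // The nested group keeps the outer parenthesis open until the very end.
        var anim = Animate.Group(
            new Animation { Name = "MoveBy" },
            Animate.Wait(TimeSpan.FromSeconds(0.4),
                new Animation { Name = "CrossFade" },
                new Animation { Name = "MoveTo" }));

        Console.WriteLine(anim.Children.Count);
    }
}
```

Because a GroupAnimation is itself an Animation, groups nest to any depth, which is exactly where the trailing parentheses start to pile up in C#.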

Null Coalescing Operators

The null coalescing operator in C# allows you to compare a value to null, and to supply a default value to use in its place. This is handy in scenarios like this:

var location = (cust.Address.City ?? "Unknown") + ", " + (cust.Address.State ?? "Unknown");

Chris Eargle makes a good point in his article suggesting a “null coalescing assignment operator” when making assignments such as:

cust.Address.City = cust.Address.City ?? "Unknown";

There should be a way to eliminate this redundancy. By combining null coalescing with assignment, we can do this:

cust.Address.City ??= "Unknown";

Groovy’s Elvis Operator serves a similar role, but operates on a value of false in addition to null.
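As a point of reference, C# itself gained a null-coalescing assignment operator with this same ??= spelling in version 8.0. The sketch below shows it on properties; the Address class and the Describe helper are assumptions made up for the example.

```csharp
using System;

// Hypothetical data class for the example.
public class Address
{
    public string City { get; set; }
    public string State { get; set; }
}

public static class CoalesceDemo
{
    // Fills in missing parts of the address, then formats it.
    public static string Describe(Address address)
    {
        // Assigns only when the left-hand side is currently null (C# 8.0 and later).
        address.City ??= "Unknown";
        address.State ??= "Unknown";
        return address.City + ", " + address.State;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new Address { City = "Milwaukee" }));
    }
}
```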

Safe Navigation Operator

There are many situations where we find ourselves needing to check the value of a deeply nested member, but if we access it directly without first checking whether each part of the path is null, we get a NullReferenceException.

var city = cust.Address.City;

If either cust or Address is null, an exception will be thrown. To get around this problem, we have to do something like this in C#:

string city = null;

if (cust != null && cust.Address != null)

city = cust.Address.City;

The && operator is short-circuiting, which means that if the first boolean expression evaluates to false, the rest of the expression—which would produce a NullReferenceException—never gets executed. As tedious as this is, without short circuiting operators, our error-prevention code would be even longer.

Jeff Handley wrote a clever safe navigation operator of sorts for C#, using an extension method called _ that takes a delegate (supplied as a lambda). You can find that code here. In his code, he returns a null value when the path short-circuits. The limitations of C# make even this simple example confusing quickly, however, as becomes apparent if we make City a non-primitive object as well:

var city = cust._(c => c.Address._(a => a.City.Name));
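An extension method in this spirit can be sketched as follows. This is not Jeff Handley's actual code, only a guess at its general shape: the generic constraints, the CityName helper and the three small data classes are assumptions.

```csharp
using System;

// Hypothetical data classes for the example.
public class City { public string Name { get; set; } }
public class Address { public City City { get; set; } }
public class Customer { public Address Address { get; set; } }

public static class SafeNavigationExtensions
{
    // Returns null instead of throwing when the receiver is null.
    public static TResult _<T, TResult>(this T target, Func<T, TResult> selector)
        where T : class
        where TResult : class
    {
        return target == null ? null : selector(target);
    }
}

public static class SafeNavigationDemo
{
    // Every hop is guarded here, including City, so no link in the chain can throw.
    public static string CityName(Customer cust)
    {
        return cust._(c => c.Address._(a => a.City._(ci => ci.Name)));
    }

    public static void Main()
    {
        Console.WriteLine(CityName(new Customer()) == null);
    }
}
```

Extension methods can be called on a null receiver, which is what makes the trick work, but each additional hop costs another lambda and another level of nesting.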

Groovy implements a Safe Navigation Operator in the language itself, which is cleaner:

def city = cust?.Address?.City;

This is equivalent to the more verbose code above. Archetype takes a similar approach:

var city = cust..Address.City;

Because of the .. operator in this member access expression, the type of the city variable is Option<string> (more on Option types). If the path leading up to City is invalid (because Address is null), the value of city will be None. This works the same as Nullable<T>, except that None means “doesn’t have a value; not even null”.

I like to think of None as the “mu constant”. What is mu? It’s the Japanese word that variously means “not”, “doesn’t exist”, etc., and is illustrated by the well-known Zen Buddhist koan:

A monk asked Zhaozhou Congshen, a Chinese Zen master (known as Jōshū in Japanese), "Has a dog Buddha-nature or not?" Zhaozhou answered, "Wú" (in Japanese, Mu)

—The Gateless Gate, koan 1, translation by Robert Aitken

Yasutani Haku’un of the Sanbo Kyodan maintained that "the koan is not about whether a dog does or does not have a Buddha-nature because everything is Buddha-nature, and either a positive or negative answer is absurd because there is no particular thing called Buddha-nature."

In other words, Mu has often been used to mean “I disagree with the presuppositions of the question.”

There are a few basic patterns around options, nullable objects, and safe navigation that occur frequently, so I’ll outline them here with examples:

// if Address is null, this evaluates to false

if (cust..Address.City == "Milwaukee")

WorkHarder();

// if City is None because Address is null, set to "Address Missing"; otherwise, get the city text

var city = cust..Address.City ?! "Address Missing";

// if City is Some<string> and City == null, set to empty string

var city = cust..Address.City ?? string.Empty;

// if Address is null (City is None), set to "Address Not Found";

// but if City == null, set to empty string

var city = cust..Address.City ?! "Address Not Found" ?? string.Empty;

// if Address points to an object, leave it alone; otherwise, create a new object

cust.Address ??= new Address(City="Milwaukee");

// an assertion

cust..Address ?! new Exception("Address missing");

// set the city if possible, throw a specific exception if not

var city = cust..Address.City ?! new Exception("Address missing");
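The distinction these patterns rely on, None versus a value that happens to be null, can be modeled in C# with a small option type. The following is a sketch only: Option<T>, OrIfNone and SafeCity are names invented here, and it assumes that a null cust is treated the same way as a null Address.

```csharp
using System;

// Minimal option type: None means "no value was reached", which is distinct from Some(null).
public struct Option<T>
{
    private readonly T value;
    public bool HasValue { get; }

    private Option(T value)
    {
        this.value = value;
        HasValue = true;
    }

    public static Option<T> None { get { return default(Option<T>); } }
    public static Option<T> Some(T value) { return new Option<T>(value); }

    // Rough analog of the ?! operator: substitute a value only when this is None.
    public T OrIfNone(T fallback) { return HasValue ? value : fallback; }
}

// Hypothetical data classes for the example.
public class Address { public string City { get; set; } }
public class Customer { public Address Address { get; set; } }

public static class OptionDemo
{
    // Hand-written equivalent of cust..Address.City
    public static Option<string> SafeCity(Customer cust)
    {
        if (cust == null || cust.Address == null)
            return Option<string>.None;
        return Option<string>.Some(cust.Address.City);
    }

    // Equivalent of: cust..Address.City ?! "Address Not Found" ?? string.Empty
    public static string Describe(Customer cust)
    {
        return SafeCity(cust).OrIfNone("Address Not Found") ?? string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new Customer()));
    }
}
```

The two fallbacks handle different failures: OrIfNone fires when the path could not be followed at all, while ?? fires when the path was followed and the City it reached is null.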

Summary

By now it should be obvious that Archetype aims to liberate the developer from the constraints and inefficiencies of ordinary programming languages. It is designed with modern practices in mind such as fluent-style development and declarative object graph construction.

This article wraps up the material started in Part 8 on declarative programming in Archetype. In addition, I introduced the safe navigation and null coalescing operators. These are simple but powerful language elements for cleanly and succinctly specifying common idioms that come up in daily coding.


The Archetype Language (Part 8)

Posted by Dan Vanderboom on October 1, 2010

Constructors

A constructor in Archetype is a recommended, predefined prototype for instantiating an object correctly.

The default parameterless constructor is defined implicitly (it’s defined even if it isn’t written), even if other constructors are defined explicitly. This last part is unlike other languages that hide the parameterless constructor when others are defined. This will make classes with these default constructors common in Archetype, to more easily support behaviors like serialization and dynamic construction. When it needs to be hidden, it can be defined with reduced visibility, such as private.

A constructor is defined with the name new, consistent with how it’s invoked.

Let’s start with a very basic class, and build up to more complicated examples.

Customer object

{

FirstName string;

LastName string;

}

Despite the lack of an explicit constructor, it’s important for Archetype to define constructs that are useful in their default configurations. You couldn’t get more basic than the Customer class above. If we want to define the constructor explicitly, we can do so.

Customer object
{
    FirstName string;
    LastName string;

    new ()
    {
        // do nothing
    }
}

Instantiating a Customer object is easy. With the parameterless constructor, parentheses are optional.

var dilbert = new Customer;

Archetype, like C#, supports constructor initializers:

var dilbert = new Customer

{

FirstName = "Dilbert",

LastName = "Smith"

};

When you have few parameters and want to compress this call to a single line, the curly braces end up feeling a little too much (too formal?).

var dilbert = new Customer { FirstName = "Dilbert", LastName = "Smith" };

Archetype supports passing these assignment statements as final arguments of the constructor parameter list, like this:

var dilbert = new Customer(FirstName = "Dilbert", LastName = "Smith");

As a result, there isn’t much need to define constructors that only set fields and properties to the value of constructor parameters. Because Archetype has this mechanism for fluidly initializing objects at construction, the only time constructors really need to be defined is when construction of the object is complicated or unintuitive, in which case a supplied construction pattern is a sure way to make sure it’s done correctly. Our Customer example doesn’t meet those criteria, but if it did, this is one way we could write it:

Customer object
{
    FirstName string;
    LastName string;

    new (FirstName string, LastName string)
    {
        this.FirstName = FirstName;
        this.LastName = LastName;
    }
}

To avoid having to qualify FirstName with the this keyword, many people prefer naming their parameters with the first character lower-cased. That’s an unfortunate compromise. When viewing at least the public members of a type, in a sense you’re creating an outward-facing API, and I think Pascal casing more naturally respects English grammar, rather than downplaying the significance of the most important first word in an identifier by lower-casing it to get around some unfortunate syntax limitation.

But instead of taking sides in a naming convention war, we can solve the problem in the language and remove the need to make any compromise.

new (FirstName string, LastName string)

{

set FirstName, LastName;

}

This lets us set individual properties named the same as constructor parameters. It’s flexible enough to set some and consume other parameters differently, but when you want to set all parameters with matching member names, you can use the shortcut set all. If that’s all the constructor needs to do, we can do away with the curly braces:

new (FirstName string, LastName string) set all;

If our Customer class contained a BirthDate property, we could use this constructor and pass in an initializer statement as a final parameter.

var dilbert = new Customer("Dilbert", "Smith", BirthDate = DateTime.Parse("7/4/1970"));

This works with multiple initializers. Alternatively, we could use an initializer body after the parameter list:

var dilbert = new Customer("Dilbert", "Smith")

{

BirthDate = DateTime.Parse("7/4/1970")

};

Note how we have two places to supply data to a new object, if needed: the parameter list for simple, short values, and the initializer body for much larger assignments.

Another common construction pattern is for one or more constructors to call another constructor with a default set of properties. Typically the constructor with the full list of parameters performs the actual work, while the shorter constructors call into the main one, passing in some default values and passing the others through.

new (EvaluateFunc Func<T>) new(null, null, EvaluateFunc);

new (BaseObject object, EvaluateFunc Func<T>) new(null, BaseObject, EvaluateFunc);

new (Name string, BaseObject object, EvaluateFunc Func<T>)

{

set all;

// do all the real work…

// …

}
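For readers more familiar with C#, the same delegation pattern is written there with a this(...) constructor initializer. In this sketch the class name Computation<T> is hypothetical (the article does not name the class these constructors belong to), and the properties simply mirror the three parameters.

```csharp
using System;

// Hypothetical class; only the three constructor parameters come from the article.
public class Computation<T>
{
    public string Name { get; }
    public object BaseObject { get; }
    public Func<T> EvaluateFunc { get; }

    // The shorter constructors forward to the full one, supplying defaults.
    public Computation(Func<T> evaluateFunc)
        : this(null, null, evaluateFunc) { }

    public Computation(object baseObject, Func<T> evaluateFunc)
        : this(null, baseObject, evaluateFunc) { }

    // The full constructor does the real work.
    public Computation(string name, object baseObject, Func<T> evaluateFunc)
    {
        Name = name;
        BaseObject = baseObject;
        EvaluateFunc = evaluateFunc;
    }
}

public static class ChainingDemo
{
    public static void Main()
    {
        var c = new Computation<int>(() => 42);
        Console.WriteLine(c.EvaluateFunc());
    }
}
```

The three assignments in the full constructor are the boilerplate that set all is meant to remove.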

Declarative Archetype: The Initializer Body

The initializer body mentioned above has a special structure in Archetype. Member assignment statements can appear side-by-side with value expressions that are processed by a special function called value. This can be used, among other things, to add items to a collection. It’s best to see in an example:

var dilbert = new Customer
{
    FirstName = "Dilbert",
    LastName = "Smith",
    BirthDate = DateTime.Parse("7/4/1970"),
    new SalesOrder(OrderCode = "ORD012940"),
    new SalesOrder
    {
        OrderCode = "ORD012941",
        new SalesOrderLine(ItemCode = "S0139", Quantity = 3),
        new SalesOrderLine(ItemCode = "S0142", Quantity = 1)
    }
};

The first three lines of the initializer set members with assignment statements. The next expression (new SalesOrder …) in the list creates an object, but there’s no assignment. It returns a value, but where does it go? Take a look at the value functions below for the answer:

Customer object
{
    FirstName string;
    LastName string;

    Orders SalesOrder* = new;
    Invoices Invoice* = new;

    // formatted inline
    value (Order SalesOrder) Orders += Order;

    // formatted with full code block
    value (Invoice Invoice)
    {
        Invoices += Invoice;
    }
}

A Customer has several collections of things–Orders and Invoices here–and because there are two value functions in the class, any expressions of type SalesOrder or Invoice will be evaluated and their values passed to the appropriate value function. Expressions of other types will trigger a compile-time error.

The += and -= operators haven’t been shown before. Their use is a very natural fit for stream and list types. The += operator appends an object to a stream, and -= removes the first occurrence of that object.
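The closest existing C# mechanism is probably the collection initializer, which routes each element expression to a matching Add overload by type. The sketch below uses that feature to imitate the two value functions; the class shapes and the invoice code "INV0001" are made up for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical data classes for the example.
public class SalesOrder { public string OrderCode { get; set; } }
public class Invoice { public string InvoiceCode { get; set; } }

// Implementing IEnumerable is what lets C# accept a collection initializer;
// each element expression is passed to the best-matching Add overload.
public class Customer : IEnumerable
{
    public string FirstName { get; set; }
    public List<SalesOrder> Orders { get; } = new List<SalesOrder>();
    public List<Invoice> Invoices { get; } = new List<Invoice>();

    public void Add(SalesOrder order) { Orders.Add(order); }
    public void Add(Invoice invoice) { Invoices.Add(invoice); }

    public IEnumerator GetEnumerator()
    {
        foreach (var order in Orders) yield return order;
        foreach (var invoice in Invoices) yield return invoice;
    }
}

public static class ValueFunctionDemo
{
    public static Customer Build()
    {
        // C# does not allow member assignments and elements in the same
        // initializer, so FirstName has to be set separately.
        var customer = new Customer
        {
            new SalesOrder { OrderCode = "ORD012940" },
            new SalesOrder { OrderCode = "ORD012941" },
            new Invoice { InvoiceCode = "INV0001" }
        };
        customer.FirstName = "Dilbert";
        return customer;
    }

    public static void Main()
    {
        var customer = Build();
        Console.WriteLine(customer.Orders.Count + " orders, " + customer.Invoices.Count + " invoices");
    }
}
```

The restriction noted in the comment is the gap the Archetype initializer body closes: assignments and value expressions can sit side by side in one block.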

This simple addition of a value function in types (classes and structs) gives Archetype the ability to represent hierarchical structures in a clean, declarative way. Sure it’s always been possible to format expressions similarly, but the syntactic trappings of imperative languages have made this difficult and unattractive at best, and in most real-world cases impractical.

When I experimented in creating a Future class, I came up with a pattern in C# to nest structures in a tree for large future expressions, but the need to match parentheses gets in the way and consumes too much attention that’s better focused on the logic itself:

Future<string> FuturePi = null, FutureOmega = null, FutureConcat = null, FutureParen = null;

var result = new Future<string>("bracket",
    () => Bracket(FutureParen),
    (FutureParen = new Future<string>("parenthesize",
        () => Parenthesize(FutureConcat),
        (FutureConcat = new Future<String>("concat",
            () => FuturePi + " < " + FutureOmega,
                (FuturePi = new Future<string>("pi", () => CalculatePi(10))),
                (FutureOmega = new Future<string>("omega", () => CalculateOmega()))
            ))
        ))
    );

The difference finally occurred to me between the need to set few simple members and the definition of larger, more structured content–including nested structures–that begged for a way to supply them without carrying the end parenthesis down multiple lines or letting them build up into parentheses knots that must be carefully counted. One gets to fidgeting with where to put them, and sometimes there’s no good answer to that.

Another feature we need to make this declarative notation ability robust is inline variable declaration and assignment. Notice in the last example how several intermediary structures have variable names defined for them ahead of time, outside the expression. Writing that Future code, I felt it was unfortunate these variables couldn’t be defined inline as part of the expression. Doing so would allow us to define any kind of structure we might see in XML or JSON, such as this XAML UI code.

new Canvas -> LayoutRoot
{
    Height = Auto,
    Width = Auto,

    new StackPanel -> sp
    {
        Orientation = Vertical,
        Height = 150,
        Width = Auto,

        // attached properties, member-access style
        Canvas.Top = 10,
        Canvas.Left = 20,

        // the same attached properties, using with and an initializer body
        with Canvas
        {
            Top = 10,
            Left = 20
        },

        // the with form again, written inline
        with Canvas { Top = 10, Left = 20 },

        Loaded += (sender, e)
        {
            Debug.WriteLine("StackPanel sp.Loaded running");
            sp.ResizeTo(0.5 seconds, Auto, 200).Begin();
        },

        LayoutUpdated += HandleLayoutUpdated,

        new TextBlock
        {
            FontSize = 18,
            Text = "Title"
        },
        new TextBlock { Text = "Paragraph 1" },
        new TextBlock { Text = "Paragraph 2" },
        new TextBlock(Text = "Paragraph 3")
    }
};

A few notes are needed here:

· Wow, this looks a lot like XAML, but much friendlier to developers who have to actually read and edit it! Yes, good observation.

· Unlike XAML, every identifier here works with the all-important Rename refactoring, go to definition, find all references to, etc. This is great for reducing the amount of work to find relationships among things and manually update related files.

· Also unlike XAML, code for event handlers can be defined here. I’m not saying you should cram all of your event handler logic here, but it could come in quite handy at times and I can’t see any reason to disable it.

· The with token is a custom operator (see Part 7) that provides access to attached properties through an initializer body. Custom extensions allow you to access these properties with a natural member-access style.

· It hasn’t been possible to use generic classes in XAML. Specifying UI in Archetype, this would be trivial, and I suspect they could be used to good effect in many ways. Of course, in doing this you’d lose support for the designers in VS and Blend, which would be awful.

· Auto is simply an alias for double.NaN.

· The -> custom operator in these expressions defines a variable and sets it to the value of the new object. The order of execution is:

1. Evaluate constructor parameters, if any are supplied.

2. Assign the object to the variable defined with ->, if supplied.

3. Set any fields or properties with assignment statements.

4. Evaluate value expressions, if supplied, and call the class’s value function with each one, if a value function has been defined.

5. Invoke any matching value function defined in class extensions.

By following this design, the example above can be translated into this C# code by the Archetype compiler:

var LayoutRoot = new Canvas()
{
    Height = double.NaN,
    Width = double.NaN
};

var sp = new StackPanel()
{
    Orientation = Orientation.Vertical,
    Height = 150.0,
    Width = double.NaN
};

LayoutRoot.Children.Add(sp);

sp.SetValue(Canvas.TopProperty, 10.0);
sp.SetValue(Canvas.LeftProperty, 20.0);

sp.Loaded += (sender, e) =>
{
    Debug.WriteLine("StackPanel sp.Loaded running");
    sp.ResizeTo(0.5.seconds(), double.NaN, 200.0).Begin();
};

sp.LayoutUpdated += HandleLayoutUpdated;

sp.Children.Add(new TextBlock() { FontSize = 18, Text = "Title" });
sp.Children.Add(new TextBlock() { Text = "Paragraph 1" });
sp.Children.Add(new TextBlock() { Text = "Paragraph 2" });
sp.Children.Add(new TextBlock() { Text = "Paragraph 3" });

var VisualTree = LayoutRoot;

Compare the two approaches. The C# code is a typical example of imperative structure building, while the Archetype code is arguably as declarative as XAML, and with many advantages over XAML for developers.

Going back to the Future example, we could rewrite this in Archetype a few different ways. I’ll present two. In the first one, value functions are used to receive the future’s evaluation function as well as any Future objects the expression depends on.

new Future<string>("bracket") -> result
{
    () => Bracket(FutureParen),
    new Future<string>("parenthesize") -> FutureParen
    {
        () => Parenthesize(FutureConcat),
        new Future<string>("concat") -> FutureConcat
        {
            () => FuturePi + " < " + FutureOmega,
            new Future<string>("pi") -> FuturePi
            {
                () => CalculatePi(10)
            },
            new Future<string>("omega") -> FutureOmega
            {
                () => CalculateOmega()
            }
        }
    }
};

The shorter approach passes an evaluation delegate in as a parameter.

new Future<string>(() => Bracket(FutureParen)) -> result
{
    new Future<string>(() => Parenthesize(FutureConcat)) -> FutureParen
    {
        new Future<string>(() => FuturePi + " < " + FutureOmega) -> FutureConcat
        {
            new Future<string>(() => CalculatePi(10)) -> FuturePi,
            new Future<string>(() => CalculateOmega()) -> FutureOmega
        }
    }
};

The name string parameter is missing from the last example. This was only for use during debugging. Now what we have is a very direct description of futures that are dependent on other futures in a dependency graph.
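To make the dependency graph concrete, here is a rough C# approximation that swaps in System.Lazy<T> for the Future class, which is not shown in this article. The Calculate*, Parenthesize and Bracket bodies are placeholders invented for the example. Declaring the leaves first avoids the null pre-declarations used in the earlier C# version, at the cost of reading bottom-up.

```csharp
using System;

public static class FutureGraphDemo
{
    // Placeholder implementations; the real functions are not shown in the article.
    static string CalculatePi(int digits) { return "3.14159"; }
    static string CalculateOmega() { return "0.56714"; }
    static string Parenthesize(string s) { return "(" + s + ")"; }
    static string Bracket(string s) { return "[" + s + "]"; }

    public static string Evaluate()
    {
        // Leaves first, so each lambda only captures variables already declared.
        var pi = new Lazy<string>(() => CalculatePi(10));
        var omega = new Lazy<string>(() => CalculateOmega());
        var concat = new Lazy<string>(() => pi.Value + " < " + omega.Value);
        var paren = new Lazy<string>(() => Parenthesize(concat.Value));
        var bracket = new Lazy<string>(() => Bracket(paren.Value));

        // Nothing runs until the root is asked for its value.
        return bracket.Value;
    }

    public static void Main()
    {
        Console.WriteLine(Evaluate());
    }
}
```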

Summary

Object construction is a crucial part of an object-oriented language, and Archetype is advanced with its options for constructing arbitrary object graphs and initializing even complicated state in a single expression. These fluent declarative syntax features are ideal for representing structures such as XAML UI, state machines, dependency graphs, and much more.

XAML is a language. The question this work has me asking is: do we really need a separate language if our general purpose language supports highly declarative syntax? It’s a provocative question without an easy answer, but it seems clear that many DSLs could emerge within a language that so richly supports composition.

With the ability to define arbitrarily complex structures in code—from declarative object graphs to rich functional expressions—it’s hard to think of a situation that would be too difficult to model and build an API or application around.

