Critical Development

Language design, framework development, UI design, robotics and more.

Archive for the ‘Development Environment’ Category

Windows Phone 7 – Platform

Posted by Dan Vanderboom on August 20, 2010

A Scorching Hot Market

The smart phone market is the hottest market in the computing world, growing at an unbelievable pace.  It surpasses the desktop “personal computer”, the PC, in finding its way into the pockets and hands of consumers who might not otherwise buy a larger computer; and in doing so, it has established itself as the ultimate personal computer.

It’s like filling a ditch with large boulders until no more will fit, and then filling the remaining space with smaller debris.  Smart phones, and all cell phones to an even larger extent, are that debris.  And in the future, we’ll fill what space is left with granules of sand.

Looking at the Windows Phone from every angle, from features to development patterns, from its role in the market to its potential in peer-to-peer and cloud computing scenarios, overall I have to admit I am very impressed and quite excited.  I also have some harsh criticism, and because of my excitement and optimism, a strong hope that these concerns will be addressed as matters of great urgency.  If Microsoft is serious about competing with Android and iPhone, they’ll have to invest heavily in doing what they’ve been so good at: giving developers what they want.

The Consumer Experience: Entertainment

Microsoft’s primary focus is on the consumer experience, and I think that focus is aptly set.  But that’s the goal, and supporting developers is the means to accomplish it.  You can’t separate one from the other.  However, you must always prioritize; and while features like background thread execution have many solid business cases, they have to wait until consumer experience features are refined and reliability guarantees are worked out.

What this means is that Microsoft is paying much more attention to Design with a capital D.  They’re taking tight control over the operating system’s shell UI in order to provide consistency across devices and carriers (taking a lesson from Android, which is painfully fragmented), and have announced some exciting new controls such as TiltContentControl, Panorama, and Pivot.
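As a rough illustration only, populating the announced Panorama control from code-behind might look something like the sketch below.  This assumes the Panorama and PanoramaItem types from the phone’s Microsoft.Phone.Controls assembly, and LayoutRoot is the Grid from the default page template; in practice this would normally be declared in XAML.

using System.Windows.Controls;
using Microsoft.Phone.Controls;

public partial class MainPage : PhoneApplicationPage
{
    public MainPage()
    {
        InitializeComponent();

        // Build a simple two-pane Panorama in code-behind.
        var panorama = new Panorama { Title = "my application" };

        panorama.Items.Add(new PanoramaItem
        {
            Header = "first item",
            Content = new TextBlock { Text = "Hello from the first pane." }
        });

        panorama.Items.Add(new PanoramaItem
        {
            Header = "second item",
            Content = new TextBlock { Text = "Hello from the second pane." }
        });

        LayoutRoot.Children.Add(panorama);
    }
}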

By focusing on the consumer, Microsoft is in essence really just reminding themselves not to think like a company for enterprise customers.  Let’s face it: Microsoft is first and foremost an enterprise products company.  But by integrating XNA and Xbox Live into the Windows Phone platform, they’re creating a phone with a ton of gamer (and therefore non-enterprise) buzz.

Entertainment is Windows Phone’s number one priority.  Don’t worry: the enterprise capabilities you want will come, whether as services in the OS or as apps that Microsoft and third parties publish.  But those enterprise scenarios will become even more valuable as the entertaining nature of the device puts it into millions and millions of hands.

Introducing Funderbolt Games

To take advantage of this explosion of interest in smart phones and gaming, and having developed mobile device software for the past five years, I’ve recently started a company called Funderbolt Games.  We’re developing games (initially targeting Windows Phone and Android) for children from about one to five years old, and will eventually publish casual games for adults as well.

I’ve been working with a fantastic artist, Shannon Lazcano, in building our first Silverlight game for Windows Phone.  It will be a simple adventure game with lots of places and activities to explore as a family of mice interact with a rabbit, a dog, a frog, some fireflies, dragonflies, and more.  Based on heaps of research into kids’ games on the iPhone, and having watched young children play most of the games in the App Store over the past two years, the games we’re creating are guaranteed to keep young kids entertained for dozens of hours.

In the next few weeks, I’ll publish more details and some screen shots.  The target for our first title is October 27th (PDC 2010).

The Shell User Interface

The Android operating system’s openness has become one of its weaknesses.  Shell UI replacements like Motorola’s Blur and HTC’s Sense are modifications of the operating system itself.  When a new version of Android is released, these manufacturers take their sweet time adapting their custom front ends, and customers end up several versions behind, much to their chagrin.

It would be much better to write shell replacements as loosely coupled applications or services, and to design the OS to make this easy.  I completely support companies innovating in the shell UI space!  We need to encourage more of this, not lock these companies out.  These experiments advance the state of the art in user experience design and provide users with more options.  If anything, the only rule coming from the platform should be that nobody shall prevent users from changing shells.

Because the smartphone is such a personal device, it makes sense that the preferred mode of interaction will depend in part on personal taste.  Some users are simply more technical than others (developers), some need to be shielded both from certain content and from messing things up (children), and there are people with various disabilities to consider as well.  One UI to rule them all can’t possibly be the right approach.

Despite this, it’s really no shock that Windows Phone 7 will have a locked-down shell UI in its first and probably second versions.  Flexibility and choice sound good in theory, but in reality they can create quite a mess when implemented without careful planning, and the Windows Phone team has a substantial challenge in figuring out how to open things up without producing or encouraging many of the same problems they see elsewhere.  I’ll be eagerly watching this drama unfold over the first year that Windows Phones are in the wild.

That being said, I have to admit I’m not a fan of the Windows Phone 7 Start Menu.  It feels disappointingly flat and imbalanced, wasting a good bit of screen real estate to the right of the tile buttons.  I’m glad they attempted something different, but I think they have a long way to go toward making it that sexy at-a-glance dashboard that would inspire any onlooker to go wide-eyed with envy.

The demo videos of Office applications are likewise drab.  Don’t get me wrong: the fonts, as everyone is quick to point out, are beautiful, and we all love the parallax scrolling effects.  But we’ve come to expect and appreciate some chrome: those color gradients and rounded corners and other interesting geometrical shapes that help to define a sense of visual structure.  As noted by several tweeple, it sometimes looks like a hearkening back to ye olden days of text-only displays.  More beautiful text, but nothing beyond that to suggest that we’ve evolved beyond the printing press.

The one line I keep hearing over and over again is that Microsoft is serious about keeping an aggressive development cycle, revving Windows Phone quickly to catch up to their competitors.  The sting of Windows Mobile’s abysmal failure is still fresh enough to serve as an excellent motivator to make the right decisions and investments, and do well by consumers this time.

A Managed World

There are a few brilliant features of Windows Phone 7.  One of them is the requirement for a hardware “Navigate Back” button.  As a Windows Mobile developer working within the constraints of small screens, I never had enough room, and sacrificing space on nearly every screen for a back button was painful.  Not only did you have to give up space, you also had to make it fit your application’s style.  Windows Forms with the Compact Framework (Rest In Peace) was not fun to work with.
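On Windows Phone 7 the hardware button does the navigation for you, and a Silverlight page only needs to step in for special cases.  Here’s a minimal sketch, assuming the OnBackKeyPress override on PhoneApplicationPage; the unsaved-changes flag is just a hypothetical illustration, not production code:

using System.ComponentModel;
using System.Windows;
using Microsoft.Phone.Controls;

public partial class EditPage : PhoneApplicationPage
{
    // Hypothetical flag set elsewhere when the user has unsaved work.
    private bool hasUnsavedChanges;

    protected override void OnBackKeyPress(CancelEventArgs e)
    {
        base.OnBackKeyPress(e);

        // The hardware button normally pops this page off the back stack for us;
        // we only step in to confirm discarding unsaved work.
        if (hasUnsavedChanges &&
            MessageBox.Show("Discard your changes?", "Confirm",
                MessageBoxButton.OKCancel) != MessageBoxResult.OK)
        {
            e.Cancel = true;
        }
    }
}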

We’re entering a new era.  The Windows Phone will only allow third-party applications to be written in managed code.  I couldn’t imagine better news!  Why?  We don’t have to concern ourselves with PInvoke or trying to interoperate with COM objects.  We also don’t have to worry about memory leaks or buffer overruns: the garbage collector and strong type system of the CLR take care of those.  Not only do I not have to worry about these problems as a .NET developer, I also don’t have to worry about my app being negatively impacted by an unmanaged app on the same device, and I don’t have to worry about these problems emerging on my own phone regardless of what apps are installed.

I’ve been plagued by my iPhone lately because of unstable apps, especially since I upgraded to iOS 4.x.  I can’t count how many times an app has died on me; it happens so damn frequently that my frustration level is rising.  The quality of Apple’s own apps is just as bad: not only does iTunes sync not work correctly on a PC, but I can’t listen to my podcasts through the iPod app without finding that I can’t play a podcast I downloaded (though I can stream it), or that playback works for a while before the audio starts to stutter and my whole phone locks up.  These are signs of a fragile platform and an immature, unmanaged execution environment.

I used to be a technology apologist, but I find myself increasingly critical when it comes to my smart phone.  I demand reliability.  Forcing applications to use managed code is an excellent way to deliver it.  Once your code is wrapped in such a layer, the wrapper itself can evolve to provide continuously improving reliability and performance.

If you have significant investments in unmanaged code for Windows Mobile devices, it’s too late for sympathy.  The .NET platform is nearly a decade old, and managed execution environments have clearly been the path forward for a long time now.  You’ve had your chance to convert your code many times over.  If you haven’t done so yet and you still think your old algorithms and workflows are valuable, it’s time to get on board and start porting.  Pressuring Microsoft to open up Windows Phone to unmanaged code is a recipe for continued instability and, ultimately, disaster for the platform.

In fact, I would consider the Windows Phone 7 OS to be one step closer to an OS like Singularity, a research operating system written in managed code.  Just as some pundits have predicted or recommended that Apple extrapolate iOS to the desktop and eventually drop OS X, we might see a merging of desktop and mobile OS technologies at Microsoft, or at least a borrowing of ideas, moving us closer to a Singularity-like OS whose purpose is improving the reliability of personal computing.

A Bazillion Useless Apps

There are a ton of crappy apps in the Apple App Store and the Android Marketplace.  I’d say they make up a pretty heavy majority.  Perhaps those platforms shouldn’t be boasting about how many apps they have, as if quality and quantity were somehow related.  I have a feeling that Microsoft’s much stronger developer platform will produce a higher signal-to-noise ratio in its own marketplace, and I’ll explain why.

Web developers spend their time wrestling with cross-browser and cross-version support, Apple developers spend their time tracking down bugs that prevent even basic functionality from working well, and Android developers spend their time trying to support a fragmented collection of phone form factors, screen sizes, and graphics processors.  I suspect Microsoft developers will find a sweet spot, able to spend the majority of their time building the actual features that have business or entertainment value.

Why?  There are only two supported screen resolutions for Windows Phone 7, and the substantial list of hardware requirements gives developers a strong common foundation they can count on (while still leaving room for innovation above and beyond that), providing the same kind of consistent platform that Apple developers enjoy.  In addition, the operating system will be updated over the air, so all connected devices will run the same version.  Without OS modifications like shell UI replacements that can delay a version’s readiness, there’s greater consistency and therefore less friction.

These factors don’t account for problems that can still result from sloppy programming and low standards, but fewer obstacles will remain to building high quality applications.  A rich ecosystem of Silverlight and XNA control libraries, frameworks, and tooling already exists.  With incredible debugging tools that further help to improve quality, my bet is that we’ll see much more focus on valuable feature development.

PDC 2010

I’m one of the lucky 1000 developers going to the Professional Developer Conference this year at Microsoft’s campus in Redmond, WA from October 27-29.  I’m looking forward to learning more about the platform and hope to get my hands on an actual device.  I’m also working on two Windows Azure projects, a big one for a consulting client and a personal one that will be accessed through a Windows Phone app, so I’m excited to catch some of the Azure sessions and meet that team as well.

If you’re also headed to PDC and are interested in meeting up while in Redmond/Seattle, leave me a comment! I’m always interested to hear and share development and technology ideas.

Posted in Compact Framework, Development Environment, iPhone, PDC10, Silverlight, Windows Phone 7 | 1 Comment »

Reimagining the IDE

Posted by Dan Vanderboom on May 31, 2010

Overview

After working in Visual Studio for the past decade, I’ve accumulated a broad spectrum of ideas on how the experience could be better.  From microscopic features like “I want to filter Intellisense member lists by member type” to recognition of larger patterns of conceptual organization and comprehension, there aren’t many corners of the IDE that couldn’t be improved with additional features—or in some cases—a redesign.

To put things in perspective, consider how the Windows Mobile platform languished for years and became stale (or “good enough”) until the iPhone changed the game and raised the bar on quality to a whole new level.  It wasn’t until fierce competition stole significant market share that Microsoft completely scrapped the Windows Mobile division and started fresh with a complete redesign called Windows Phone 7.  This is one of the smartest things Microsoft has done in a long time.

After many years of incremental evolution, it’s often necessary to rethink, reimagine, and occasionally even start from scratch in order to make the next revolutionary jump forward.

Visual Studio Focus

Integrated Development Environments have been with us for at least the past decade.  Whether you work in Visual Studio, Eclipse, NetBeans, or another tool, there is tremendous overlap in the set of panels available, the flexible layout of those panels, saved workspaces, and add-in infrastructure to make as much as possible extensible.  I’ll focus on Visual Studio for my examples and explanations since that’s the IDE I’m most familiar with, but there are parallels to other IDEs for much of what I’m going to cover.

Visual Components & Flexible Layout

Visual layout is one thing that IDEs do right.  Instead of a monolithic UI, it’s broken down into individual components such as panels, toolbars, toolboxes, main menus and context menus, code editors, designers, and more.  These components can be laid out at runtime with intuitive drag-and-drop operations that visually suggest the end result.

The panels of an IDE can be docked to any edge of another panel, they can be laid on top of another panel to create tab controls, and adjacent panels can be relatively resized with splitters that appear between panels.  After many years of refinement, it’s hard to imagine a better layout system than this.

The ability to save layouts as workspaces in Expression Blend is a particularly nice feature.  It would be nicer still if the user could define triggers for these workspaces, such as “change layout to the UI Designer workspace when the XAML or Windows Forms designers are opened”.
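To make the idea concrete, here is an entirely hypothetical sketch of what declaring such triggers might feel like; none of these types exist in Blend or Visual Studio today, and the names are mine:

using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical: a trigger that switches the IDE to a named workspace
// whenever a document with a matching extension becomes active.
public class WorkspaceTrigger
{
    public string DocumentExtension { get; set; }   // e.g. ".xaml"
    public string WorkspaceName { get; set; }       // e.g. "UI Designer"
}

public static class WorkspaceTriggers
{
    public static readonly List<WorkspaceTrigger> All = new List<WorkspaceTrigger>
    {
        new WorkspaceTrigger { DocumentExtension = ".xaml", WorkspaceName = "UI Designer" },
        new WorkspaceTrigger { DocumentExtension = ".cs",   WorkspaceName = "Code" },
    };

    // Called by the (imaginary) IDE shell whenever the active document changes.
    public static string ResolveWorkspace(string documentPath)
    {
        var match = All.FirstOrDefault(t =>
            documentPath.EndsWith(t.DocumentExtension, StringComparison.OrdinalIgnoreCase));
        return match != null ? match.WorkspaceName : "Default";
    }
}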

IDE Hosting

Visual Studio and other development tools have traditionally been desktop applications.  In Silverlight 4, however, we now have a framework sufficiently powerful to build a respectable cross-platform IDE.

With features such as off-line, out-of-browser execution, full screen mode, custom context menus, and trusted access to the local file system, it’s now possible for a great IDE to be built and run on Windows, Mac OS X, or Linux, and to allow a developer to access the IDE and their solutions from any computer with a browser (and the Silverlight plug-in).

There are already programming editors and compilers in the cloud.  In episode 562 of .NET Rocks on teaching programming to kids, their guests point out that a subset of the Small Basic IDE is available in Silverlight.  For those looking to build programming editors, ActiPro has a SyntaxEditor control in WPF that they’re currently porting to Silverlight (for which they report seeing a lot of demand).

Ideally such an IDE would be free, or would have a free version available, but for those of us who need high-end tools and professional-level feature sets, imagine how nice it would be to pay a monthly fee for access to an ever-evolving IDE service instead of having to cough up $1,100 or $5,500 (or more) every couple of years.  Not only would costs be conveniently amortized over the span of the tool’s use, but all of your personal preferences would be easily synchronized across every computer you use to work in that IDE.

With cloud computing services such as Windows Azure, it would even be possible to off-load compilation of large solutions to the cloud.  Builds that took 30 minutes could be cut down to a few minutes or less by parallelizing build tasks across multiple cores and servers.
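The mechanics of that parallelization are simple to sketch: group projects into dependency levels and build each level’s projects concurrently.  The sketch below does this with local msbuild processes purely to illustrate the shape of the idea; a real cloud build service would fan these tasks out to Azure worker roles instead of local cores.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

static class ParallelBuild
{
    // Each inner list is a dependency "level": level 0 has no project references,
    // level 1 references only level 0 projects, and so on.
    public static void Run(IEnumerable<IList<string>> projectLevels)
    {
        foreach (var level in projectLevels)
        {
            Parallel.ForEach(level, projectPath =>
            {
                var info = new ProcessStartInfo("msbuild.exe", "\"" + projectPath + "\"")
                {
                    UseShellExecute = false,
                    CreateNoWindow = true
                };

                using (var build = Process.Start(info))
                {
                    build.WaitForExit();
                    if (build.ExitCode != 0)
                        throw new InvalidOperationException("Build failed: " + projectPath);
                }
            });
        }
    }
}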

The era of cloud development tools is upon us.

Solution Explorer & The Project System

Solution Explorer is one of the most useful and important panels in Visual Studio.  It provides us with an organizational tool for all the assets in our solution, and provides a window into the project system on which core behaviors such as builds are based.  It is through the Solution Explorer that we typically add or remove files, and gain access to visual designers and the ever-present code editor.

In many ways, however, Solution Explorer and the project system it represents are built on an old and tired design that hasn’t evolved much since its introduction over ten years ago.

For example, it still isn’t possible to “add existing folder” and have that folder and all of its contents pulled into a project.  If you’ve ever had to rebuild a project file and pull in a large number of files organized in many nested folders, you have a good idea of how painful an effort this can be.
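Until the project system supports this directly, the usual workaround is to generate the item list yourself.  Here’s a small sketch (the path and the *.cs filter are illustrative assumptions) that prints an MSBuild ItemGroup you can paste into the .csproj:

using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

static class FolderImporter
{
    static void Main()
    {
        string projectDir = @"C:\Source\MyProject";
        XNamespace ns = "http://schemas.microsoft.com/developer/msbuild/2003";

        // Emit one <Compile> item per .cs file, with paths relative to the project.
        var items = Directory.GetFiles(projectDir, "*.cs", SearchOption.AllDirectories)
            .Select(path => new XElement(ns + "Compile",
                new XAttribute("Include", path.Substring(projectDir.Length + 1))));

        Console.WriteLine(new XElement(ns + "ItemGroup", items));
    }
}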

If you’ve ever tried sharing the same code across multiple incompatible platforms, between Full and Compact Framework, or between Silverlight 3 and Full Framework, you’ve likely run into kludgey workarounds like placing multiple project files in the same folder and including the same set of files, or using a tool like Project Linker.

Reference management can also be unwieldy when you have many projects and references.  How do you ensure you’re not accidentally referencing two different versions of the same assembly from two different projects?  My article on Project Reference Oddness in VS2008, which explores the mysterious and indirect ways references work, is by far one of my most popular articles.  I’m guessing that’s because so many people can relate to the complexity and confusion of managing these dependencies.

“Projects” Are Conceptually Overloaded: Violating the Single Responsibility Principle

In perhaps the most important example, consider how multiple projects are packaged for deployment, such as what happens for the sake of debugging.  Which assemblies and other files are copied to the output directory before the program is executed?  The answer, discussed in my Project Reference Oddness article, is that it depends.  Files that are added to a project as “Content” don’t even become part of the assembly: they’re just passed through as a deployment command.

So what exactly is a Visual Studio “project”?  It’s all of these things:

  • A set of source code files that will get compiled, producing an assembly.
  • A set of files that get embedded in the resulting assembly as resources.
  • A set of deployment commands for loose files.
  • A set of deployment commands for referenced assemblies.

If a Visual Studio project were a class definition, we’d say it violated the Single Responsibility Principle.  It’s trying to be too many things: both a definition for an assembly as well as a set of deployment commands.  It’s this last goal that leads to all the confusion over references and deployment.

Let’s examine the reason for this.

A deployment definition is something that can span not only multiple assemblies, but also additional loose files.  In order to debug my application, I need assemblies A, B, and C, as well as some loose files, to be copied to the output directory.  Because there is no room for the deployment definition in the hierarchy visualized by Solution Explorer, however, I must somehow encode that information within the project definitions themselves.

If assembly A references B, then Visual Studio infers that the output of B needs to be copied to A’s output directory when A is built.  Since B references C, we can infer that the output of C needs to be copied to B’s output directory when B is built.  Indirectly, then, C’s output will get dumped in A’s output directory, along with B’s output.

What you end up with is a pipeline of files that shuffles things along from C to B to A.  Hopefully, if all the reference properties are set correctly, this works as intended and the result is good.  But the logic behind all of this is an implicit black box.  There’s no transparency, so when things get complicated and something goes wrong, it can become impossible to figure it out in a reasonable amount of time (try reading through verbose build output sometime).

At one point, just before writing the article on references mentioned above, I was spending 10 hours or more a week just fighting with reference dependencies.  It was a huge mess, and a very expensive way to accomplish absolutely nothing in terms of providing value to customers.

Deployments & Assemblies

Considering our new perspective on the importance of representing deployments as first-class organizational items in solutions, let’s take a look at what that might look like in an IDE.  Focus on the top-left of the screenshot below.

[Screenshot: proposed solution hierarchy showing the Deployment view, with “Silverlight Client” and “Cloud Services” groups in the top-left panel.]

The first level of darker text (“Silverlight Client” and “Cloud Services”) is equivalent to “solution folders” in Visual Studio.  These are labels that can be nested like folders for organizational purposes.  Within each of these areas is a collection of Deployment definitions.  The expanded deployment is for the Shell of our Silverlight application.  The only child of this deployment is a location.

In a desktop application, you might have multiple deployment locations, such as $AppDir$, $AppDir$\Data, or $UserDir$\AppName, each with child nodes representing content to be deployed to those locations.  In Silverlight, however, it doesn’t make sense to deploy to a specific folder since that’s abstracted away from you.  So for this example, the destination is Shell.XAP.

You’ll notice that multiple assemblies are listed.  If this were a web application, you might have a number of loose files as well, such as default.aspx or web.config.  If such files were listed under that deployment, you could double-click one to open and edit in the editor on the right-hand side of the screen.

The nice thing about this setup is the complete transparency: if a file is listed in a deployment path, you know it will be copied to the output directory before debugging begins.  If it’s not listed, it won’t get deployed.  It’s that simple.

The next question you might have is: doesn’t this mean that I have a lot of extra work to manually add each of these assembly files?  Especially when it comes to including the necessary references, nobody wants the additional burden of having to manually drag every needed reference into a deployment definition.

This is pretty easy to deal with.  When you add a reference to an assembly, and that referenced assembly isn’t in the .NET Framework (those are accessed via the GAC and therefore don’t need to be included), the IDE can add that assembly to the deployment definition for you.  Additionally, it would be helpful if all referenced assemblies lit up (with a secondary highlight color) when a referencing assembly was selected in the list.  That way, you’d be able to quickly figure out why each assembly was included in that deployment.  And if you select an assembly that requires a missing assembly, the name of any missing assemblies should appear in a general status area.

What we end up with is a more explicit and transparent way of dealing with deployment definitions separately from assembly definitions, a clean separation of concepts, and direct control over deployment behavior.  Because deployment intent is specified explicitly, this would be a great starting point for installer technologies to plug into the IDE.

In Visual Studio, a project maps many inputs to many outputs, and confuses deployment and assembly definitions.  A Visual Studio “project” is essentially an “input” concept.  In the approach I’ve outlined here, all definitions are “output” concepts; in other words, items in the proposed solution hierarchy are defined in terms of intended results.  It’s always a good idea to “begin with the end in mind” this way.

Multiple Solution Views

In the screenshot above, you’ll notice there’s a dropdown list called Solution View.  The current view is Deployment; the other option is Assembly.  The reason I’ve included two views is because the same assembly may appear in multiple deployments.  If what you want is a list of unique assemblies, that alternative view should be available.

A New Template System

The other redesign required is around the idea of Visual Studio templates.  Instead of solution, project, and project item templates in Visual Studio, you would have four template types: solution, deployment, assembly, and file.  Consider these examples:

Deployment Template: ASP.NET Web Application

  • $AppDir$
    • Assembly: MyWebApp.dll
      • App.xaml.cs
      • App.xaml    (embedded resource)
      • Main.xaml.cs
      • Main.xaml   (embedded resource)
    • File: Default.aspx
    • File: Web.config
    • Folder: App_Data
      • File: SampleData.dat

Solution Template: Silverlight Solution

  • Deployment: Silverlight Client
    • MySLApp.XAP
      • Assembly: MyClient.dll
        • App.xaml.cs
        • App.xaml    (embedded resource)
        • Main.xaml.cs
        • Main.xaml   (embedded resource)
  • Deployment: ASP.NET Web Application
    • $AppDir$
      • Assembly: MyWebApp.dll
        • YouGetTheIdea.cs
      • Folder: ClientBin
        • MySLApp.XAP (auto-copied from Deployment above)
      • File: Default.aspx
      • File: Web.config

Summary

In this article, we explored several features in modern IDEs (Visual Studio specifically), and some of the ways in which imaginative rethinking could bring substantial improvements to the developer experience.  I have to wonder how quickly a large ship like Visual Studio (with 1.5 million lines of mostly C++ code) could turn and adapt to new ideas like this, or whether it makes sense to start fresh without all the burden of legacy.

Though I have many more ideas to share, especially regarding the build system, multiple-language name resolution and refactoring, and IDE REPL tools, I will save all of that for future articles.

Posted in Cloud Computing, Development Environment, Silverlight, User Interface Design, Visual Studio, Windows Azure | Leave a Comment »

Better Tool Support for .NET

Posted by Dan Vanderboom on September 7, 2009

Productivity Enhancing Tools

Visual Studio has come a long way since its debut in 2002.  With the imminent release of Visual Studio 2010, we’ll see a desperately needed overhaul of the archaic COM extensibility mechanisms (to support the Managed Package Framework, as well as MEF, the Managed Extensibility Framework) and a redesign of the user interface in WPF that I’ve been pushing for and predicted as inevitable quite some time ago.

For many alpha geeks, the Visual Studio environment has been extended with excellent third-party, productivity-enhancing tools such as CodeRush and Resharper.  I personally feel that the Visual Studio IDE team has been slacking in this area, providing only very weak support for refactorings, code navigation, and better Intellisense.  While I understand their desire to avoid stepping on partners’ toes, this is one area I think makes sense for them to be deeply invested in.  In fact, I think a new charter for a Developer Productivity Team is warranted (or an expansion of their team if it already exists).

It’s unfortunately a minority of .NET developers who know about and use these third-party tools, and the .NET community as a whole would without a doubt be significantly more productive if these tools were installed in the IDE from day one.  It would also help to overcome resistance from development departments in larger organizations that are wary of third-party plug-ins, due perhaps to the unstable nature of many of them.  Microsoft should consider purchasing one or both of them, or paying a licensing fee to include them in every copy of Visual Studio.  Doing so, in my opinion, would make them heroes in the eyes of the overwhelming majority of .NET developers around the world.

It’s not that I mind paying a few hundred dollars for these tools.  Far from it!  The tools pay for themselves very quickly in time saved.  The point is to make them ubiquitous: to make high-productivity coding a standard of .NET development instead of a nice add-on that is only sometimes accepted.

Consider it just from the perspective of watching speakers at conferences code up samples.  How many of them avoid using such a tool in their demonstrations simply because they don’t want to confuse their audience with an unfamiliar development interface?  How many more demonstrations could they complete in the limited time available if they felt comfortable using these tools in front of the masses?  You pay good money to attend these conferences.  Wouldn’t you like to cover significantly more ground while you’re there?  This is only likely to happen when the tool’s delivery vehicle is Visual Studio itself.  Damon Payne makes a similar case for the inclusion of the Managed Extensibility Framework in .NET Framework 4.0: build it into the core and people will accept it.

The Gorillas in the Room

CodeRush and Resharper have both received recent mention in the Hanselminutes podcast (episode 196 with Mark Miller) and in the Deep Fried Bytes podcast (episode 35 with Corey Haines).  If you haven’t heard of CodeRush, I recommend watching these videos on their use.

For secondary information on CodeRush, DXCore, and the principles with which they were designed, I recommend these episodes of DotNetRocks:

I don’t mean to be so biased toward CodeRush, but it’s the tool I’m personally familiar with, it has a broader range of functionality, and it seems to get the majority of the press coverage.  However, those who do talk about Resharper speak highly of it, so I recommend you check out both to see which one works best for you.  But above all: go check them out!

Refactor – Rename

Refactoring code is something we should all be doing constantly to avoid the accumulation of technical debt as software projects and the requirements on which they are based evolve.  There are many refactorings in Visual Studio for C#, and many more in third-party tools for several languages, but I’m going to focus here on what I consider to be the most important refactoring of them all: Rename.

Why is Rename so important?  Because it’s so commonly used, and it has such far-reaching effects.  It is frequently the case that we give poor names to identifiers before we clearly understand their role in the “finished” system, and even more frequent that an item’s role changes as the software evolves.  Failure to rename items to accurately reflect their current purpose is a recipe for code rot and greater code maintenance costs, developer confusion, and therefore buggy logic (with its associated support costs).

When I rename an identifier with a refactoring tool, all of the references to that identifier are also updated.  There might be hundreds of references.  In the days before refactoring tools, one would accomplish this with Find-and-Replace, but this is dangerous.  Even with options like “match case” and “match whole word”, it’s easy to rename the wrong identifiers, rename pieces of string literals, and so on; and if you forget to set these options, it’s worse.  You can go through each change individually, but that can take a very long time with hundreds of potential updates and is a far cry from a truly intelligent update.

Ultimately, the intelligence of the Rename refactoring provides safety and confidence for making far-reaching changes, encouraging more aggressive refactoring practices on a more regular basis.

Abolishing Magic Strings

I am intensely passionate about any tool or coding practice that encourages refactoring and better code hygiene.  One example of such a coding practice is the use of lambda expressions to select identifiers instead of using evil “magical strings”.  From my article on dynamically sorting Linq queries, the use of “magic strings” would force me to write something like this to dynamically sort a Linq query:

Customers = Customers.Order("LastName").Order("FirstName", SortDirection.Descending);

The problem here is that “LastName” and “FirstName” are oblivious to the Rename refactoring.  Using the refactoring tool might give me a false sense of security in thinking that all of my references to those two fields have been renamed, leading me to The Pit of Despair.  Instead, I can define a function and use it like the following:

public static IOrderedEnumerable<T> Order<T>(this IEnumerable<T> Source,
    Expression<Func<T, object>> Selector, SortDirection SortDirection)
{
    // Value-type members arrive wrapped in a Convert node, so unwrap it before
    // reading the member name; then delegate to the string-based Order overload.
    var body = Selector.Body is UnaryExpression
        ? ((UnaryExpression)Selector.Body).Operand : Selector.Body;
    return Order(Source, ((MemberExpression)body).Member.Name, SortDirection);
}

Customers = Customers.Order(c => c.LastName).Order(c => c.FirstName, SortDirection.Descending);

This requires a little understanding of the structure of expressions to implement, but the benefit is huge: I can now use the refactoring tool with much greater confidence that I’m not introducing subtle reference bugs into my code.  For such a simple example, the benefit is dubious, but multiply this by hundreds or thousands of magic string references, and the effort involved in refactoring quickly becomes overwhelming.
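For completeness, the string-based overload that the lambda version delegates to might look something like the following.  This is a minimal reflection-based sketch, not the exact code from the dynamic sorting article, and it assumes SortDirection is a simple Ascending/Descending enum:

public static IOrderedEnumerable<T> Order<T>(this IEnumerable<T> Source,
    string PropertyName, SortDirection SortDirection)
{
    var property = typeof(T).GetProperty(PropertyName);

    // Late-bound key selector: the "magic string" lives here, hidden from callers
    // who use the lambda-based overload above.
    Func<T, object> keySelector = item => property.GetValue(item, null);

    return SortDirection == SortDirection.Descending
        ? Source.OrderByDescending(keySelector)
        : Source.OrderBy(keySelector);
}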

Coding in this style is most valuable when it’s a solution-wide convention.  So long as you have code that strays from this design philosophy, you’ll find yourself grumbling and reaching for the inefficient and inelegant Find-and-Replace tool.  The only time it really becomes an issue, then, is when accessing libraries that you have no control over, such as Linq to Entities and the Entity Framework, which make extensive use of magic strings.  In the case of EF, this is mitigated somewhat by your ability to regenerate the code it uses.  In other libraries, it may be possible to write extension methods like the Order method shown above.

It’s my earnest hope that library and framework authors such as the .NET Framework team will seriously consider alternatives to, and an abolition of, “magic strings” and other coding practices that frustrate otherwise-powerful refactoring tools.

Refactoring Across Languages

A tool is only as valuable as it is practical.  The Rename refactoring is more valuable when coding practices don’t frustrate it, as explained above.  Another barrier to the practical use of this tool is the prevalence of multiple languages within and across projects in a Visual Studio solution.  The definition of a project as a single-language container is dubious when you consider that a C# or VB.NET project may also contain HTML, ASP.NET, XAML, or configuration XML markup.  These are all languages with their own parsers and other language services.

So what happens when identifiers are shared across languages and a Rename refactoring is executed?  It depends on the languages involved, unfortunately.

When refactoring a C# class in Visual Studio, the XAML’s x:Class value is also updated.  What we’re seeing here is cross-language refactoring, but unfortunately it only works in one direction.  There is no refactor command to update the x:Class value from the XAML editor, so manually changing it causes my C# class to become sadly out of sync.  Furthermore, this seems to be XAML specific.  If I refactor the name of an .aspx.cs class, the Inherits attribute of the Page directive in the .aspx file doesn’t update.

How often do you think someone would want to rename the code-behind class for an ASP.NET page and yet not want the Inherits attribute to change?  Probably not very often (okay, probably NEVER).  This is a matter of having sensible defaults.  When you change an identifier name in this way, the development environment does not respond in a sensible way by default, forcing the developer to do extra work and waste time.  This is a failure in UI design for the same reason that Intellisense has been such a resounding success: Intellisense anticipates our needs and works with us; the failure to keep identifiers in sync by default is diametrically opposed to this intelligence.  It represents a fragmented and inconsistent design for an IDE, hence my hope that it will be addressed in the near future.

The problem should be recognized as systemic, however, and addressed in a generalized way.  Making individual improvements in the relationships between pairs of languages has been almost adequate, but I think it would behoove us to take a step back and take a look at the future family of languages supported by the IDE, and the circumstances that will quickly be upon us with Microsoft’s Oslo platform, which enables developers to more easily build tool-supported languages (especially DSLs, Domain Specific Languages). 

Even without Oslo, we have seen a proliferation of languages: IronRuby, IronPython, F#, and the list goes on.  A refactoring tool that is hard-coded for specific languages will be unable to keep pace with the growing family of .NET and markup languages, and certainly unable to deal with the demands of every DSL that emerges in the next few years.  If instead we had a way to identify our code identifiers to the refactoring tool, and indicate how they should be bound to identifiers in other languages in other files, or even other projects or solutions, the tools would be able to make some intelligent decisions without understanding each language ahead of time.  Each language’s language service could supply this information.  For more information on Microsoft Oslo and its relationship to a world of many languages, see my article on Why Oslo Is Important.

Without this cross-language identifier binding feature, we’ll remain in refactoring hell.  I offered a feature suggestion to the Oslo team regarding this multi-master synchronization of a model across languages that was rejected, much to my dismay.  I’m not sure if the Oslo team is the right group to address this, or if it’s more appropriate for the Visual Studio IDE team, so I’m not willing to give up on this yet.

A Default of Refactor-Rename

The next idea I’d like to propose here is that the Rename refactoring is, in fact, a sensible default behavior.  In other words, when I edit an identifier in my code, I more often than not want all of the references to that identifier to change as well.  This is based on my experience of invoking the refactoring explicitly countless times, compared to the relatively few times I want to “break away” that identifier from all the code that references it.

Think about it: if you have 150 references to variable Foo, and you change Foo to FooBar, you’re going to have 150 broken references.  Are you going to create a new Foo variable to replace them?  That workflow doesn’t make any sense.  Why not just start editing the identifier and have the references update themselves implicitly?  If you want to be aware of the change, it would be trivial for the IDE to indicate the number of references that were updated behind the scenes.  Then, if for some reason you really did want to break the references, you could explicitly launch a refactoring tool to “break references”, allowing you to edit that identifier definition separately.

The challenge that comes to mind with this default behavior concerns code that spans across solutions that aren’t loaded into the IDE at the same time.  In principle, this could be dealt with by logging the refactoring somewhere accessible to all solutions involved, in a location they can all access and which gets checked into source control.  The next time the other solutions are loaded, the log is loaded and the identifiers are renamed as specified.
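A sketch of what one entry in such a log might contain (entirely hypothetical; no such mechanism exists in Visual Studio today):

using System;

// Hypothetical record of a rename, written to a shared, source-controlled log
// so that solutions loaded later can replay it against their own references.
public class RenameRecord
{
    public string SymbolKind { get; set; }     // "Type", "Method", "Property", ...
    public string OldFullName { get; set; }    // e.g. "MyCompany.Billing.Foo"
    public string NewFullName { get; set; }    // e.g. "MyCompany.Billing.FooBar"
    public DateTime TimestampUtc { get; set; }
    public string Author { get; set; }
}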

Language Property Paths

If you’ve done much development with Silverlight or WPF, you’ve probably run into the PropertyPath class when using data binding or animation.  PropertyPath objects represent a traversal path to a property such as “Company.CompanyName.Text”.  The travesty is that they’re always “magic strings”.

My argument is that the property path is such an important construct that it deserves to be a core part of language syntax instead of just a type in some UI-platform-specific library.  I created a data binding library for Windows Forms with its own property path syntax and type, and there are countless non-UI scenarios in which this construct would also be incredibly useful.

The advantage of having a language like C# understand property path syntax is that you avoid a whole class of problems that developers have used “magic strings” to solve.  The compiler can then make intelligent decisions about the correctness of paths, and errors can be identified very early in the cycle.
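Until a language understands property paths natively, the closest approximation is once again an expression tree: a small helper can walk a lambda’s member chain and produce the dotted string that PropertyPath expects, keeping the compiler and the Rename refactoring in the loop.  A minimal sketch (the helper name is mine, not a framework API):

using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class PropertyPaths
{
    // Walks a member-access chain like c => c.Company.CompanyName.Text
    // and returns "Company.CompanyName.Text".
    public static string Of<T>(Expression<Func<T, object>> selector)
    {
        var body = selector.Body;

        // Value-type members arrive wrapped in a Convert node; unwrap it.
        if (body is UnaryExpression)
            body = ((UnaryExpression)body).Operand;

        var segments = new List<string>();
        var member = body as MemberExpression;

        while (member != null)
        {
            segments.Insert(0, member.Member.Name);
            member = member.Expression as MemberExpression;
        }

        return string.Join(".", segments.ToArray());
    }
}

// Usage: new Binding(PropertyPaths.Of<Customer>(c => c.Company.CompanyName.Text)) { Source = customer1 };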

Imagine being able to pass property paths to methods or return them from functions as first-class citizens.  Instead of writing this:

Binding NameTextBinding = new Binding("Name") { Source = customer1 };

… we could write something like this, have access to the Rename refactoring, and even get Intellisense support when hitting the dot (.) operator:

Binding NameTextBinding = new Binding(@Customer.Name) { Source = customer1 };

In this code example, I use the fictitious @ operator to inform the compiler that I’m specifying a property path and not trying to reference a static property called Name on the Customer class.

With property paths in the language, we could solve our dynamic Linq sort problem cleanly, without using lambda expressions to hack around the problem:

Customers = Customers.Order(@Customer.LastName).Order(@Customer.FirstName, SortDirection.Descending);

That looks and feels right to me.  How about you?

Summary

There are many factors of developer productivity, and I’ve established refactoring as one of them.  In this article I discussed tooling and coding practices that support or frustrate refactoring.  We took a deep look into the most important refactoring we have at our disposal, Rename, and examined how to get the greatest value out of it in terms of personal habits, as well as long-term tooling vision and language innovation.  I proposed including property paths in language syntax due to its general usefulness and its ability to solve a whole class of problems that have traditionally been solved using problematic “magic strings”.

It gives me hope to see the growing popularity of Fluent Interfaces and the use of lambda expressions to provide coding conventions that can be verified by the compiler, and a growing community of bloggers (such as here and here) writing about the abolition of “magic strings” in their code.  We can only hope that Microsoft program managers, architects, and developers on the Visual Studio and .NET Framework teams are listening.

Posted in Data Binding, Data Structures, Design Patterns, Development Environment, Dynamic Programming, Functional Programming, Language Innovation, LINQ, Oslo, Silverlight, Software Architecture, User Interface Design, Visual Studio, Visual Studio Extensibility, Windows Forms | Leave a Comment »

Oslo: Misconceptions and Fallacies

Posted by Dan Vanderboom on February 1, 2009

In the many conversations and debates I’ve had about Oslo recently–in person, on the phone, through email, on blogs, and in the Oslo forum–I’ve encountered a good amount of resistance to the goals of Oslo.  Some of this is due to misconception and general confusion, some is due to an attachment to one’s current methodology, and some I believe is simply due to a fear of anything new and unknown.  In the course of these conversations, I’ve run across a common set of thoughts or themes which I have attempted to represent faithfully here.

My first article, Why Oslo is Important, attempted to elucidate the high-level collection of concepts, technologies, and tools flying under the Oslo banner as it exists today and how I imagine it evolving in the future.  Though I’ve received a lot of interest and appreciation, it also managed to spark some feedback from those who were still confused or concerned, leading me to believe that I had failed to deliver a fully satisfying explanation.

Understandably, Oslo and its target domain are too large to explain or digest in a single article, even a long one.  It’s also too early in the development cycle to be very specific.  So it’s reasonable to suggest that one can’t be totally satisfied until substantial examples and reference applications are built using the Oslo tools.  Fair enough.

This article isn’t going to provide that reference application.  It has the more modest goal of trying to dispel some of the common misconceptions and fallacies that I’ve encountered, and my responses to them.  In future articles, as I design and develop my newest software system, I’ll be documenting and publishing the how and why of my use of Oslo tools and technologies to provide more specific evidence of its usefulness.  I’ll also be providing references to much of the good work that is being done to provide solutions to various problems.

As always, I encourage you to participate and provide feedback: to me, and especially to the Oslo team.  The more brain power we can bring to bear on this problem in the community, the better off the Oslo team will be, and the faster Oslo will evolve to become precisely the set of tools we need to improve our overall development experience, and developer productivity in particular.

“Oslo doesn’t solve any problems that can’t already be solved with existing tools or technologies.”

When the first steam-powered tractors were sold to farmers in the 1860s, traditional ox and horse farmers might have said the same thing.  “This tractor won’t do anything that my ox-plowing method can’t already do.”  This is true, but it’s not an effective argument against the use of the new technology, which was a much faster and more cost-effective method of farming.  The same farmer could harvest much more of his crop in the same amount of time, and as the new technology matured and gas-powered engines became available (in the 1880s), so did the benefit increase.  The same goes for any high level language above assembly language, the use of a relational database over a loose collection of files, display of text on a CRT instead of punched tape to communicate with a user, or any other great technological leap forward that “doesn’t accomplish anything new”.

Then again, it all depends on what kind of problems you’re talking about.  If you get specific enough, I’m sure you’ll find plenty in Oslo that’s new, even this early in its development, such as a shared repository for interoperable models and the ability to define new parsers that provide tooling support such as keyword colorization.  Sometimes it’s these little details that act as the glue to pull components together and create substantial value through the synergy that results.  Visual Studio and Intellisense weren’t strictly needed (you can still use Notepad and csc.exe), but Intellisense can quickly answer dozens of questions a day without your having to jump out of context and spend a lot of time looking through disconnected documentation.

“We don’t need to know about Oslo or model-driven development because the applications I build are small and specific, or otherwise don’t need to be so general and flexible.”

This may or may not be true for you, but saying that the industry doesn’t need to advance because you don’t perceive a direct benefit to your own development isn’t valid.  The reason your applications are able to be so simple is the wealth of tools, languages, platforms, frameworks, and libraries that your applications leverage.  Standing on the shoulders of giants, you might say.

Many of these systems and components can benefit tremendously from a model-driven approach, and if it improves productivity for Microsoft and other third-party developers, that means they’ll be able to spend less time on plumbing and more time building the framework features you care about.  It’s also likely to make all of those APIs cleaner and more consistent, resulting in easier discoverability and fewer headaches for you, the API consumer.  As the .NET Framework and other frameworks and libraries grow ever larger, this will be critical to keeping things organized and under control.

“Oslo is going to force me to model all kinds of things that really aren’t needed in my software.”

The existence of Oslo will not force you to model any characteristics that you aren’t already modeling through other means.  What it will do is provide more options and tools for modeling your software more effectively and more productively.  It will also significantly ease the burden of creating more heavily model-driven software if you decide that’s right for your application or service.

For more information and a clearer definition of what a model is, see this article on the MSDN Oslo Development Center.

“Oslo will impose a workflow on me that doesn’t make sense for my methodology or business.”

Where Oslo fits into your specific workflow will be ultimately determined by you.  This isn’t any different from Entity Framework.  In v1 of EF, the tooling supports the reading of database structure and the generation of entity classes, but there is work being done to support a workflow going in the other direction: that of starting with classes and generating the database.  The Entity Framework itself doesn’t actually care in which direction you want to work; the issue is primarily one of tool support.  Other initiatives such as adding support for POCO indicate that the EF team is listening to feedback from the community and making the necessary changes to achieve broad support of their framework.  I would expect the same from the Oslo team.

Early releases of Oslo will have similar limitations; currently it seems that M can only be used to generate a database structure from MSchema, and that database structure can be read by Entity Framework to generate your entity classes.  Because Microsoft has such a broad audience to satisfy, other workflows will have to be accommodated, such as starting with class files and generating M files and database schemas.  In fact, I’ve submitted feedback to Microsoft’s Connect site to ensure this kind of multi-master synchronization of model representations is considered.

“Putting everything in a database is overkill for my application, so Oslo isn’t relevant to me.”

While the Repository is an important aspect of Oslo, it isn’t required.  Command line tools exist to transform textual input (specified as MGraph, or in your own custom-defined format using a Domain Specific Language) into MGraph output.  There is a separate step to convert this into SQL, or to optionally inject this data directly into the Oslo Repository.

If you don’t want to use the Repository, there are already multiple methods available for instantiating objects directly from this text data, whether it’s read from a file, embedded as a resource, or sent as data over a network.  Josh Williams (SpankyJ) has published an example showing how to convert DSL text into XAML, and instantiate the object graph using an MGraphXamlReader, and Torkel Ödegaard of Coding Instinct wrote an article demonstrating how to write a generic deserializer without using XAML.

Model formats such as CSDL, MSL, and SSDL for EF, or configuration data currently specified for WCF and WF, will all very likely be expressed in some DSL specified with M (there has been talk about these efforts already).  Since applications without database access will need these technologies, it will be impossible to force developers to read this model data from a SQL Server database.

“We already have XML, XSD, and XSLT, so there’s no benefit to having yet another language to specify the same things.”

XSD is used to define formats and languages (such as XAML), and XML is used as a poor man’s one-size-fits-all meta-format for specifying hierarchical data.  While XML is friendly enough to open in text editors, it’s designed more for tools than for human eyes.

Having different languages and formats to represent different kinds of data actually eases human comprehension and authoring ability.  As Chris Anderson said in his Oslo session at PDC, when you’re looking at XML in an editor, what stands out are not what’s important to your domain, but rather what’s important to XML: elements and attributes.

People are using XML to define their DSLs and formats, not because XML is the best representation, but because writing parsers for new formats and languages is just too hard.  Customers had been asking Microsoft for the ability to write these DSLs easily, so it was out of conversations and customer feedback that Microsoft decided to expose these services.

So it’s not a matter of absolutely needing M and the ability to define new languages because of some inability to get work done without them.  Rather, it’s about reducing the amount of mental work required to author our models and increasing our productivity as a result.  It’s also about having powerful transformational tools available to convert all formats and languages into a common representation so that the models can all interoperate despite their differences, in the same way that .NET languages all compile to a common CIL/MSIL language so that many different programming languages can interoperate.  Without this ability, we’d have a different community of developers for each language instead of one broad group of “.NET developers” who can all share and benefit from each other’s knowledge and libraries.  This has been recognized as such an important advantage that there are efforts underway to compile languages other than Java to JVM bytecode.

The larger the community, the larger our collective pool of knowledge, and the greater reuse we actually achieve.

“Oslo is too general and abstract to be useful to real developers building real systems.”

The idea that generalization can get out of control for a specific problem is valid, in the same way that a problem can be over-analyzed.  But that doesn’t mean that we should stigmatize all general-purpose software, or that we should ignore the growing trend for enterprise software systems to require greater flexibility, user customization support, extensibility, and so on.

The fact is that life on Earth evolves towards greater complexity, and as supporting hardware resources increase and business demands grow, so does software.  Taming that complexity will require rethinking how we approach every aspect of software design and development, including how to model it.

The software development industry is stratified into many layers, from platform development to one-off, command-line utilities.  Some organizations write software to support millions of users, while others deploy specialized applications in-house, but most of us fall somewhere in between.  Oslo seems to be most applicable to enterprise software offered to many customers, including cloud services, but there are subsets of Oslo that will have an impact on a great majority of .NET developers sooner or later.

There’s a lot of thought and work that goes on in our world (and billions of dollars spent) on “pure research” in the sciences (including computer science) that isn’t directly applicable to every John Doe, but without which we wouldn’t have things like nuclear power plants, microwaves, radio, television, satellite communication, or many pharmaceuticals.  The Nobel laureates of the world who have spent their lives studying something so abstract and remote from everyday life have contributed massively to the technological progress of our world, and quite often contribute to a better, more sanitary, healthy, and productive society.  Despite the risks and dangers each technology enables, we somehow still make steady progress in terms of reducing chaos and violence.

Without abstract and general technology like general purpose language compilers, which can specify any logic we dream up for any type of application we care to build, we’d be back in the stone age.  The Internet itself is based on communication standards that are so general, they are applicable to any application protocol or service traffic we can devise.

So before dismissing software (or any technology) due to its abstract or general nature, think about where we’d be without them.  Someone has to approach the colossal, abstract, general problems with enough foresight to deliver solutions before they’re too desperately needed; and who better than a huge organization like Microsoft with deep pockets?

Ironically, our ability to define Domain Specific Languages with Oslo gives us the converse power: the ability to define languages and formats that are extremely specific to our purposes and problem domains, and that therefore enable us to write our specifications without the ceremony and noise that accompany a general purpose language.  This allows us to specify our intentions more easily by saying only what we need to say to get the point across.  So the only reason Oslo must be so general is to provide that interoperability and translation layer across a set of specific formats that we define… without us having to work so hard for it.

For different reasons, it reminds me of generics, another general and abstract tool: it’s a complicated feature to implement in a language, but it provides great expressive power.  Generics also don’t provide anything we couldn’t manage to do before in other ways, but I certainly wouldn’t want to go back to programming without them!  In fact, you might say it’s an effective modeling tool.

Posted in Development Environment, Dynamic Programming, Language Innovation, Metaprogramming, Oslo, Problem Modeling | 7 Comments »

Alienware M17: Ninja Laptop

Posted by Dan Vanderboom on January 27, 2009

My new laptop, an Alienware M17, arrived earlier this morning.  It’s almost fully loaded, sans dual video cards and dual hard drives (after changing my mind last minute).  First impressions?  In stunning matte black, with its ribbed Skull Cap cover design, a back-lit keyboard, and a soft fingerprint-proof and scratch-resistant surface, it’s absolutely gorgeous!  With the keys glowing red, it makes me want to do my programming in the dark.

See for yourself, though I have to say, it’s even sexier in person.

[photos of the Alienware M17]

The only thing that confused me was the pair of mouse buttons, which aren’t separated by any space or visual cue.  When I first saw it, I thought it was some kind of touch-sensitive slider bar.  Then I was afraid they’d given me some kind of Mac mouse, but once I figured out they were separate areas to press for left and right buttons, I was enormously relieved.

I’ve wanted an Alienware ever since I first saw their high-end configurations and sleek designs, and now that they’re owned by Dell, they have the same warranty options for hassle-free, next-day on-site service.  As many problems as I’ve had with Dell hardware, there’s nothing like the peace of mind of knowing that it’ll be taken care of immediately.

The shopping experience was almost perfect.  One minor flaw: their website shows order tracking before it gets shipped out, and after reaching a certain phase of the process (order confirmation, billing, pre-production, etc.), it kept going back to phase 1, Order Confirmation.  I watched it jump several times from being almost ready to ship, back to order confirmation, and had to call to confirm that it was their tracking system and not my order that was messed up.

It was shipped through FedEx, and I missed the delivery by twenty minutes.  On a Saturday.  For some reason, FedEx doesn’t deliver on Sunday or Monday, at least not to my house.  I called to see if I could meet the truck to pick it up, and the dispatcher promised to send the message out to the truck, but I never got a call back.  Not a big deal to wait a few extra days, but you can imagine my excitement, and then my frustration.  To make matters worse, FedEx’s online package tracking sucks.  It’s not real time.  By the time they tried delivering it, I had just seen it show up as leaving its previous stopping point (in another state).  I thought these carriers knew exactly where each package was at all times!  If so, this information does not make it to their website in a timely fashion.

At 3.06 GHz, with 4 GB of DDR3 1064 MHz RAM, and an ATI Mobility Radeon video card with 512 MB RAM (for a software engineer, not a gamer), this machine hit 5.6 on the Windows Vista performance index.  This is even better than the 5.3 that my Bad Ass Development Rig scored, although it’s not a fair comparison (and the Vista performance index isn’t a real measurement of performance anyway).

After building my desktop, I learned that it would cost me $200 or so to publish the results of the PCMark performance tests online.  So if you’re curious to know what my desktop or this laptop scored, feel free to leave a comment (and your email, which isn’t shared), and I’ll be happy to share that privately.

This machine seems to be all about the nice little touches, not unlike the subtle details of a luxury automobile: the soft black finish of the case, a plethora of ports (USB, Firewire, Coaxial, SATA, HDMI, etc.), a 2 megapixel camera built into the lid that can pivot to aim higher or lower, the touch sensitive media control bar at the top of the keyboard, the keyboard’s smooth feel, and so on.

I was expecting it to be extremely heavy, and by laptop standards I’m sure it is (with its 17 inch monitor), but as I hefted the package into the house, I was surprised by how light it felt, so it’s still extremely mobile.  The power brick, on the other hand, is truly a monster, but will be stuffed lovingly anyway into my backpack wherever I go.  It will have to go with me, since my expected battery life is only two hours.

So if you have $3,300 burning a hole in your pocket and need a blazing fast mobile monster of a machine, I highly recommend the Alienware M17.  If not, they do have cheaper configurations starting at around $1,800.

Posted in Development Environment, Hardware, Personal | 6 Comments »

Why Oslo is Important

Posted by Dan Vanderboom on January 17, 2009

Contrary to common misunderstanding and speculation, the point of Oslo is not to put programming in the hands of business analysts who want to write their own business rules.  Do I think some of that will happen?  Architects and engineers will try everything they can imagine.  Some of them will succeed in specific niches or scenarios, but it won’t replace application or system design, and it will probably be very limited for the foreseeable future.  Oslo is more about dramatically improving the productivity of designers and developers by generalizing common solution patterns and generating more adaptable tools.

PDC Keynote

Much of the confusion around Oslo occurs for two reasons:

  1. Oslo is designed at a higher level of abstraction than most systems today, so its scope is broad and it will have an impact on virtually every product, solution and service across Microsoft.  It’s difficult to get your head around something that big.
  2. Because of its abstract nature, core concepts are defined in terms that are heavily overloaded, like "Model", "Repository", and "Language".  Once you’ve picked up the lingo and can translate Oslo terminology into language you’re already familiar with, both the concept and magnitude of it will become obvious.

Oslo isn’t something completely new; in fact, Oslo borrows from a lot of previous research and even existing model-driven development tools.  Oslo focuses existing technologies and techniques into a coherent and mature vision of development, combining all parts into a more powerful whole, and promises to deliver a supremely adaptable and efficient platform to develop on.

What Is Oslo?

Oslo is a software factory for generating first-class, tool-supported languages out of your declarative specifications.

A factory is a highly organized production facility
that produces members of a product line
using standardized parts, tools and production processes.

-from a review of Software Factories

The product line is analogous to Oslo’s parsers, transform tools, and IDE plugins for new data models and languages (both textual and visual) that you define.  The standardized parts are Oslo’s library components; the tools are the M languages and the Quadrant/Intellipad application; and the processes are shaped by the flow of data through the Oslo tool chain (see the diagram near the end of this article).

With Oslo, you build the custom tools you need to rapidly build or generate software systems.  It’s all about using the right tool for the job, and having a say in how those tools are shaped to obtain the greatest leverage.

As stated at the home page of softwarefactories.com:

We see a capacity crisis looming. The industry continues to hand-stitch applications distributed over multiple platforms housed by multiple businesses located around the planet, automating business processes like health insurance claim processing and international currency arbitrage, using strings, integers and line by line conditional logic. Most developers build every application as though it is the first of its kind anywhere.

In other words, there’s already a huge shortage of experienced, highly-qualified professionals capable of ensuring the success of these increasingly complex systems, and with the need (and complexity) growing exponentially, our current development practices increasingly fall short of the total demand.

Books like Greenfield’s Software Factories have been advocating building at a higher level of abstraction for years, and my initial reaction was to see it as a natural, evolutionary milestone for a highly mature software system.  However, it’s an awful lot of focused development effort to attain such a level of maturity, and not many organizations are able to pull it off given the state of our current development platforms.

It’s therefore fortuitous that Microsoft teams have taken up the challenge of building these abilities into their .NET platform.  After all, that’s where it really belongs: in the framework.

Unexpected Awesomeness

Oslo of course contains a lot of expected awesomeness, but where it will probably have the most impact in terms of developer productivity is with new first-class languages and language tools.  Why?  It first helps to understand the world of data formats and languages.

We’ve had an explosion of data formats–these mini Domain Specific Languages, if you will (especially in the form of complex configuration files).  As systems evolve and scale, and the ways we can configure and compose our application’s behavior continue to grow, at what point do we perceive that configuration graph as the rich language that it becomes?  Or when our user interfaces evolve from Monolithic to Modular to Composite to Granular Composite (or User Composable), at what point does that persistent object graph become our UX DSL (as with XAML in WPF)?

Sometimes we set our standards too low, or are slow to raise them when the time has come to do so.  With XML we get extensibility in defining languages and we think, "If we can parse it, then we can build a tool over it."  I don’t know about you, but I’d much rather work with rich client software–some kind of designer–over a textual data format any day.

But you know how things go: some company like Microsoft builds a whole bunch of cool stuff, driven off some XML configuration, or they unleash something like XAML on which WPF, WF, and more are built.  XAML is great for tools to read and write, and although XML and XAML are textual and not binary and therefore human readable in a text editor (the original intention behind that term), it’s simply not as easy to read as C# or VB.NET.  That’s why we aren’t all rushing to program everything in XAML.

Companies like Microsoft, building from the bottom up, release their platforms well in advance of the thick client user experiences that make them enjoyable to use and which encourage mass adoption.  Their models, frameworks, and applications are so large now that they’re released in massively differentiated stages, producing a technology adoption gap.

By giving that language a syntax other than XML, however, we can approach it in the same way we approach our program logic: in the most human readable and aesthetically-pleasant way we can devise, resembling our programming languages of choice.

Sometimes, the density of data and its structure in our model is such that a visual editor fails to represent that model well.  Source code is a case in point.  You could create a visual designer to visualize flow control, branching logic, and even complex expression building (like the iTunes Smart Playlist), but code in text format is more appropriate in this kind of scenario, and ends up being more efficient with the existing tooling available.  Especially with an IDE like Visual Studio, we’re working with human-millennia of effort that have gone into the great code editing tools we use today.  Oslo respects this need for choice by offering support for building both visual and textual DSLs, and recognizes the fluent definition of new formats and languages as the bridge to the next quantum leap in productivity.

If we had an easy way of defining languages in formats that we developers felt comfortable working with–as we’re comfortable with our general purpose languages and their rich tool support–then we’d be much more productive in the transition between a technology first being released and later having rich tool support over it.  WPF has taken quite a while to be adopted as much as it has, partly due to tool availability and maturity.  Before Expression Blend or Cider designers were released and hand-coding XAML was the only way, those who braved the angle brackets struggled with it.  As I play with Silverlight, I realize how much must still be done in XAML, and how we still struggle.  It’s simply not as nice to work with as my C# code.  Not as rich, and not as strongly tool-supported.

That’s one place Oslo provides value.  With the ability to define new textual and visual DSLs, rigorous verification and validation in a rich set of tools, the promise of Intellisense, colorization of keywords, operators, constants, and more, the Oslo architects recognize the ability to enhance our development experience in a language-agnostic way, raising the level of abstraction because, as they say, the way to solve any technical problem is to approach it at one higher level of indirection.  Unfortunately, this makes Oslo so generalized and abstract that it’s difficult to grasp and therefore to appreciate its immensity.  Once you can take a step back and see how it fits in holistically, you’ll see that it has the potential to dramatically transform the landscape of software development.

Currently, it’s a lot of work to implement all the language services in Visual Studio to give them as rich an experience as we’ve come to expect with C#, VB.NET, and others.  This is a serious impediment to doing this kind of work, so solving the problem at the level of Oslo drastically lowers the barrier to entry for implementing tool-supported languages.  The Oslo bits I’ve seen and played with are very early in the lifecycle for this massive scope of technology, but the more I think about its potential, the more impressed I am with the fundamental concept.  As Chris Anderson explained in his PDC session on MGrammar, MGrammar was an implementation detail, but sometime around June 2007, that feature team realized just how much customers wanted direct access to it and decided to release MGrammar to the world.

Modeling & The Repository

That’s all well and good for DSLs and language enthusiasts/geeks, but primarily perhaps, Oslo is about the creation, exploration, relation, and execution of models in an interoperable way.  In other words, all of the models that are currently used to describe a software system, or an entire IT environment, are either not encoded formally enough to verify or execute, or they’re encoded or stored in proprietary ways that don’t allow interoperability with other models.  A diagram in Visio or PowerPoint documenting network topology, for example, knows nothing about the component architecture or deployment model of the software systems installed and running on that network.

When people usually talk about models, they imagine high-level architecture documents, overviews used to visually summarize work that is much more granular in nature.  These models aren’t detailed, and they normally aren’t kept up to date and in sync with the current design as changes are made.  But modeling in Oslo is not an attempt to make these visual models contain all of the necessary detail, or to develop software with visual tools exclusively.  Oslo simply provides the tools, both graphical and textual, to define and relate many models.  It will be up to the development community to decide how all these tools are ultimately used, which parts of our systems will be specified in a mix of general purpose, domain specific, and visual languages.  Ultimately, Oslo will provide the material and glue to fill the gaps between the high and low level specifications, and unite them into a common, connected, and much more useful set of data.

To grasp what Oslo modeling is really all about requires that we expand our definition of "model", to see the models expressed in our configuration and XAML files, in our applications’ database schemas, in our entity classes, and so on.  As software grows in complexity and becomes more composable, we can use various languages to model its behavior, store that in the repository for runtime execution, inspection, or reuse by other systems.
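
For instance, the same Person model that appears later in this article as an MSchema type routinely lives in C# as an entity class, in a database as a table, and in XAML as a data-bound template.  A hypothetical sketch of just the entity-class view:

// One of several representations of the same underlying model.
class Person
{
    public string LastName { get; set; }
    public string FirstName { get; set; }
}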

This funny and clever Oslo video (reminiscent of The Hitchhiker’s Guide to the Galaxy) explains modeling in the broader sense alluded to here.

If we had some universal container for the storage of all different kinds of models, and a standardized way of relating entities across models, we’d be able to do things like impact analysis, where we could see the effect on software systems if someone were to alter the network it was running on; or powerful data mining on the IT execution environment of a business.

Many different tools, with different audiences, will be able to connect into this repository to manipulate aspects of the models that they understand and have access to.  This is just the tip of the iceberg.  We already model so much of what we do in the IT and software worlds, and as we begin adopting business process middleware and orchestration software like BizTalk, there’s a huge amount of value in those models converging and connecting.  That’s where the Oslo Repository comes in.

Oslo provides interoperability among models in the same way that SOA provides interoperability among services.  Not unlike the interoperability we have now among many different languages all sharing the same CLR specification.

Bridging data models across repositories or in a shared repository is a major step forward.  With Windows Azure and Microsoft’s commitment to their online services platform (and considering the momentum of the SaaS movement with Amazon, Google, and others), shared storage and data sets are the future.  (Check out SQL Data Services if you haven’t already, and watch for some exciting announcements coming later this year!)

The Dichotomy of Data vs. Metadata

Jeff Pinkston from the Oslo team aptly reflects the attitude of the group when he scoffs at the categorical difference between data and metadata.  In terms of storing and querying it, serializing and communicating it, and everything else that matters in enterprise software, data is data and there’s no reason not to treat it the same when it comes to architecting a system.  We have our primary models and our secondary models, our shared models and our protected models, but they’re still just models that shape our software’s behavior, and they share all of the same characteristics when it comes to manipulation and access.  It’s their ultimate effect that differs.

It’s worth noting, I think, the line that’s been drawn between code and data in some programming languages and not in others (C# vs. LISP).  A division has been made for the sake of security rather than necessity.  Machine instruction codes are represented in the same sort of binary data and realized in the same digital circuitry as traditional user data.  It’s tempting to keep things locked down and divided, but as languages evolve to become more late bound and dynamic (and as the tools evolve to make this feasible), there will be more need for the manipulation of expression trees and ASTs.  I strongly suspect the lines will blur until they disappear.
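
C# 3.0 already hints at this blurring: the same lambda syntax that defines executable logic can be captured as an expression tree and inspected like any other data structure.  A minimal sketch (the names here are mine, purely illustrative):

using System;
using System.Linq.Expressions;

class ExpressionAsData
{
    static void Main()
    {
        // The same syntax that defines executable logic...
        Expression<Func<int, int>> square = x => x * x;

        // ...is also an object graph we can inspect, transform, or persist.
        var body = (BinaryExpression)square.Body;
        Console.WriteLine(body.NodeType);          // Multiply
        Console.WriteLine(body.Left);              // x

        // Only when we ask for it does the tree become executable code again.
        Console.WriteLine(square.Compile()(6));    // 36
    }
}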

Schema and Object Instance Languages

In order to define models, we need a tool.  In Oslo, this is a textual language called MSchema and an editor called Intellipad.  I personally think it’s odd to talk people’s ears off about "model, model, model", and then to use the synonym "schema" to name the language, but all of these names could change before they’re shipped, for all we know.

This is a simple example of an MSchema document:

module MyModel
{
    type Person
    {
        LastName : Text;
        FirstName : Text;
    }

    People : Person*;
}

By running this through the "M Compiler", a SQL script is generated that will create the appropriate database objects.  Intellipad is able to verify the correctness of your schema, and what’s really nice is that you don’t even have to specify data types when you start sketching out your model.  Defaults are assumed, and you can get more specific as your model evolves.

MGraph is a language for defining instances of objects, constrained by an MSchema and similar in format.  So MSchema is to MGraph what XSD is to XML.

In this article, Lars Corneliussen explains Microsoft’s vision to make MGraph as common as XML is today.  Take a look at his article to see a side-by-side comparison of the same object represented as XML (POX), JSON, and MGraph, and decide for yourself which you like best (or see below).

MSchema and MGraph are easier and more efficient to read and write than XML.  Their message format resembles typical structured programming languages, and developers are already familiar with these formats.  XML is a fine format for a tool; it’s human readable but not human-friendly.  A C-style language, on the other hand, is much more human-friendly than all of the angle brackets and the redundancy (and verbosity) of tag text.  That narrows down our choice to JSON and MGraph.

In JSON, the property/field/attribute names are delimited by quotation marks, suggesting that the whole structure is a dumb property bag.

{
    "LastName" : "Vanderboom",
    "FirstName" : "Dan"
}

MGraph has a very similar syntax, but its attribute property names are recognized and validated by the parser generated from MSchema, so the quotation marks are unnecessary.  It ends up looking more natural, and a little more concise.

{
    LastName : "Vanderboom",
    FirstName : "Dan"
}

Because MGraph is just a message format, and Microsoft’s service offerings already support multiple message formats (SOAP/POX/JSON/etc.), it wouldn’t disrupt any of their architecture to add an MGraph adapter, and I’ll be shocked if I don’t hear about one in their next release.

Meta-Languages and MGrammar

In the same way that Oslo includes a meta-model because it allows us to define models, it also includes a meta-language because it allows us to define languages (as YACC and ANTLR have done).  However, just as Pinkston doesn’t think data and metadata should be treated differently, it makes sense to think of a language that defines languages as just another language.  There is something Zen about that, where the tools somehow seem to bend back upon themselves like one of Escher’s drawings.

[image: Escher’s Drawing Hands]

Here is an example language defined by MGrammar in a great article on MSDN called MGrammar in a Nutshell:

module SongSample
{
    language Song
    {
        // Notes
        token Rest = "-";
        token Note = "A".."G";
        token Sharp = "#";
        token Flat = "b";
        token RestOrNote = Rest | Note (Sharp | Flat)?;

        syntax Bar = RestOrNote RestOrNote RestOrNote RestOrNote;
        syntax List(element)
          = e:element => [e]
          | es:List(element) e:element => [valuesof(es), e];

        // One or more bars (recursive technique)
        syntax Bars = bs:List(Bar) => Bars[valuesof(bs)];
        syntax ASong = Music bs:Bars => Song[Bars[valuesof(bs)]];
        syntax Songs = ss:List(ASong) => Songs[valuesof(ss)];

        // Main rule
        syntax Main = Album ss:Songs => Album[ss];

        // Keywords
        syntax Music = "Music";
        syntax Album = "Album";

        // Ignore whitespace
        syntax LF = "\u000A";
        syntax CR = "\u000D";
        syntax Space = "\u0020";

        interleave Whitespace = LF | CR | Space;
    }
}

This is a pretty straightforward way to define a language and generate a parser.  Aside from the obvious keywords to define syntax rules and token patterns (with an alternative and more readable format for regular expressions), the => projection operator allows you to shape the MGraph output according to your needs.

I created two simple languages with MGrammar on the plane trip back to Milwaukee from the PDC in November.  The majority of my time was spent fussing with the editor, Intellipad, and for the last half hour I found it very easy to create a language on the fly, extending and changing it through experimentation quickly and easily.  Projections, which are functional expressions in MGrammar used to shape MGraph output, are the most challenging part.  There are a number of techniques that shape the output graph, so it will be good to see how this is approached in future reference examples.

Just before I wrote this, Mike Weinhardt at Microsoft quietly announced that a gallery of example grammars for MGrammar is being put together, pointing to sample grammars for various languages in addition to grammars that the community develops, and it should be available by the end of this month.  These examples, demonstrating how to define languages and write sensible projections, and coming from the developers who are putting MGrammar together, will be an invaluable tool for teaching you how to use common patterns (just as 101 LINQ Samples did for LINQ).

As Doug Purdy explained on .NET Rocks: "People who are building a domain specific language, and they don’t want to understand how to build a parser, or they’re not language designers.  Actually, they are language designers.  They design a language, but they actually don’t do the whole thing.  They don’t build a parser.  What they do, they just leverage the XML parser.  And what we’re trying to do is provide a toolset for folks where they don’t have to resort to XML in order to do DSLs."

From the same episode, Don Box said of the DSL session at PDC: "I’ve never seen a session with more geek porn in it."

Don: "It’s like crack for developers.  It’s kind of addictive; it takes over your life."

Doug: "If you want the power of Anders in your hand…"

The Tool Chain

Now that we have a better sense of what’s included in Oslo in terms of languages, editors, and the shared repository, we can look at the relationships among the other pieces, which are manifested in the CTP as a set of command-line tools.  In the future, these will integrate into an IDE, most likely Visual Studio.  (I’d expect Intellipad and Quadrant to merge with Visual Studio, but there’s no guarantee this will happen.)

When you create your model with MSchema, you’ll use m to validate that model and generate a SQL script to create a SQL Server 2008 database schema (yes, it only works right now with SQL Server 2008).  You’ll also use the m command to validate your object graph (written in MGraph) against your schema, and translate that into a set of SQL commands to perform inserts and updates against tables.

With enough models, there’ll be huge value in adding yours to the repository.  If you don’t mind writing MGraph or you generate it automatically with something like an MGraphSerializer class in your code, this may be all you need.

If, on the other hand, you decide you could really benefit by defining your own textual language to use instead of MGraph, you can use MGrammar to define a new language.  This language gets compiled by the mg compiler to create your parser, and the mgx command translates code in your new language into an MGraph, which can then be pulled into your database using m.

This diagram depicts the process:

[diagram of the Oslo tool chain]

Other than these command-line tools, Quadrant is the highly extensible visual tool for exploring models graphically, and Intellipad is a different face on the same shell for defining DSLs with MGrammar and writing DSL code, as well as writing and verifying MSchema and MGraph code.

We should see fairly soon the convergence of these three languages (MGraph, MSchema, and MGrammar) into a single M language.  This makes sense, since what you want to project in your DSL should be something within your model, verified by your schema.  This may ultimately make these projections much easier to write.

We’ll also see this tool chain absorbed into multiple development environments, eventually with rich binding across multiple representations of our model, although this will take longer in Visual Studio.

Languages and Nested Languages

I looked at some MService examples, and I can understand Damon’s concern: although it’s nice to have "operation" as a keyword in a service-oriented language, with more keywords giving you the ability to specify aspects of each endpoint and the communication patterns required, enclosing the business logic within that service language is probably not a good idea.  I took this from Dennis van der Stelt’s blog:

service Service
{
  operation PhotoUpload(stream : Stream) : Text
  {
    .PostUriTemplate = "upload";

    index : Text = invoke DateTime.Now.Ticks.ToString();
    filename : Text = "d:\\demo\\photo\\" + index + ".jpg";
    invoke MService.ServiceHelper.StoreInFile(stream, filename);

    return index;
  }
}

Why not?  You’re defining a general purpose language within the curly braces, one capable of defining variables, assigning values, referencing .NET objects, and calling methods.  But why do you want to learn a new language to write services when the language you’re using right now is already supremely capable of that?  Don’t you already know a good syntax for invoking methods (other than "invoke %method%")?  If instead you simply referenced an assembly, type, and method from an MService script, you could externally turn any .NET method with serializable parameters and return value into a service operation by feeding it this kind of file, without having to recompile, and without having to reinvent the wheel.
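
To make that alternative concrete, here’s a rough C# sketch of a host that binds an externally-specified operation purely through reflection.  The assembly path, type name, and method name are hypothetical placeholders for values that would come out of a declaration file:

using System;
using System.Reflection;

class OperationBinder
{
    // The three strings below would come from an MService-style declaration
    // file rather than being hard-coded; they're hypothetical placeholders.
    static object InvokeConfiguredOperation(
        string assemblyPath, string typeName, string methodName, params object[] args)
    {
        Assembly assembly = Assembly.LoadFrom(assemblyPath);
        Type type = assembly.GetType(typeName, true);    // throw if the type isn't found
        MethodInfo method = type.GetMethod(methodName);
        object instance = method.IsStatic ? null : Activator.CreateInstance(type);
        return method.Invoke(instance, args);
    }
}

The host stays completely general; swapping in a different operation is a matter of editing the declaration, not recompiling anything.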

The possible exception would be if MGrammar adds the ability (as discussed by speakers at the PDC) to support multiple layers of enclosing languages within other languages.  In other words, you could use MService to define operations and their attributes using its own syntax, and within the curly braces that follow, use the C# or VB.NET parsers to process the logic with the comprehension of a separate language.  There are some neat possibilities here, but I expect the development community to be conservative and hesitant about mixing layers of semantics, as there is an awful lot of room for confusion and complexity.  It may be better to leave different language blocks in separate files or containers, and to allow them to reference each other as .NET assemblies and XML files reference each other today.

However, I wouldn’t get too hung up on the early versions of these new languages, or any one language specifically.  The useful, sensible ones that take real developer needs into account and provide the most value will be adopted, and many more will quickly fall into disuse.  But the overall pattern will be for the emergence of an amazing amount of leverage in terms of improving human comprehension and taking advantage of our ability to manipulate structured, symbolic object graphs to build and verify software systems.

Resources

After a few months of research and many hours of writing, I don’t feel like I’ve even scratched the surface.  But instead of giving you an absolutely comprehensive picture, I’m going to stop here and continue in future articles.  In the meantime, check out the following resources.

For an overview of the development paradigm, look for information on language-oriented programming, including an article I wrote that alludes to how "we will have to raise the level of abstraction to a point that may be hard for us to imagine with our existing tools and languages" due to the "precipitous growth of software complexity".  The "community of abstractions" is the model in Oslo-speak.

For Microsoft specific content: there were some great sessions at the PDC (watch the recorded videos).  It was covered (with much confusion) on the .NET Rocks! podcast (here and here) as well as on Software Engineering Radio; and there are lots of bloggers talking about their initial experiences with it, such as Shawn Wildermuth, Lars Corneliussen, and of course Chris Sells and Jeff Pinkston.  The clearest and most coherent explanation I’ve heard was from an interview with Ron Jacobs and David Chappell (Ron gave the keynote at MSDN Dev Con, hosted the ARCast podcast for years).  MSDN has at least 29 videos on the Oslo Developer Center, where there’s a good amount of information, including a FAQ.  There’s also the online guide for MGrammar, MGrammar in a Nutshell, and the Oslo team blog.

If you’re interested in creating DSLs, make sure to keep a look out for details about the upcoming DSL Developers Conference, which is tentatively planned for April 16-17, immediately following the Lang.NET conference (on general purpose languages) on April 14-16.  I’m hoping to be at both this year.  And in case you haven’t heard, Microsoft is planning another PDC Conference for 2009, the first time ever these conferences have run for two consecutive years!  There will no doubt be much more Oslo news and conference material to cover it at the PDC in November.

Pluralsight, an instructor-led training company, now teaches a two-day "Oslo" Fundamentals course (and Don Box’s blog is hosted there).

The best way to learn about Oslo, however, is to dive in and use it.  That’s what I’m doing with my newest system, which needs to be modeled from scratch.  So if you haven’t done so already, download the Oslo SDK (link updated to January 2009 SDK) and introduce yourself to the future of modeling and development!

[Click here for the next article in this Oslo series, on common misconceptions and fallacies about Oslo.]

Posted in Data Structures, Development Environment, Distributed Architecture, Language Extensions, Language Innovation, Metaprogramming, Oslo, Problem Modeling, Service Oriented Architecture, Software Architecture, SQL Data Services, Visual Studio, Windows Azure | 44 Comments »

Visual Studio Projects on Network Shares

Posted by Dan Vanderboom on November 10, 2008

In setting up virtual machines for development, I’ve repeatedly run into trust issues accessing solutions on network shares.  Many blogs advise using the .NET 1.1 Configuration tool, which is no longer shipped with Visual Studio.  You can still get it by installing the old .NET Framework 1.1 SDK first, and then going through a series of installations to bring your machine up to date with the remaining versions and toolsets.  I went through the process once, and it’s very undesirable, especially if you build or rebuild development machines more often than you’d like to admit.

So in my latest round of setups, I came across Robert McLaws’ article on the proper caspol syntax for establishing Full Trust for a specific network share, based on this Microsoft article whose title is overly specific.  I’ll reiterate that command here for your convenience:

Drive:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\caspol.exe -m -ag 1 -url "file:////\\computername\sharename\*" FullTrust -exclusive on

To Robert’s point, who would have thought to include four forward slashes?

Be aware that you’ll get an access error in Vista with UAC on, unless you run with elevated privileges.

I’ve done this on Windows Vista 32bit and it seems to be working great.  Even better, I don’t need to use a VMWare Virtual Disk (which itself has some kind of trust or compatibility issue with Visual Studio, due to being VMFS instead of NTFS), or a Physical Disk, which prevents snapshots unless you first disconnect the disk.  I talked about these VM setup issues in this article.

Posted in Development Environment, Visual Studio | 3 Comments »

Observations on the Evolution of Software Development

Posted by Dan Vanderboom on September 18, 2008

Neoteny in the Growth of Software Flexibility and Power

Neoteny is a biological phenomenon of an organism’s development observed across multiple generations of a species.  According to Wikipedia, neoteny is “the retention, by adults in a species, of traits previously seen only in juveniles”, and accounts for many evolutionary shifts, including the human brain’s ability to remain elastic and malleable later in life than those of our distant ancestors.

So how does this relate to software?  Software is a great deal like an organic species.  The species emerged (not long ago), incubated in a more or less fragile state for a number of decades, and continues to evolve today.  Each software application or system built is a new member of the species, and over the generations they have become more robust, intelligent, and useful.  We’ve even formed a symbiotic relationship with software.

Consider the fact that software running on computers was at one time compiled to machine language code for a specific processor.  With the invention of platform-independent instruction sets and their associated runtimes performing just-in-time compilation (Java’s JVM and .NET Framework’s CLR), we’ve delayed the actual production of machine language code until it’s actually needed on the target machine.  The compiler produces a slightly more abstract representation of the program logic, and an extra translation step at installation or runtime is needed to complete the process to make the software usable.

With the growing popularity of dynamic languages such as Lisp, Python, and the .NET Framework’s upcoming release of its Dynamic Language Runtime (DLR), we’re taking another step of neoteny.  Instead of a compiler generating instruction byte codes, a “compiler for any dynamic language implemented on top of the DLR has to generate DLR abstract trees, and hand it over to the DLR libraries” (per Wikipedia).  These abstract syntax trees (AST), normally an intermediate artifact created deep within the bowels of a traditional compiler (and eventually discarded), are now persisted as compiler output.

Traits previously seen only in juveniles… now retained by adults.  Not too much of a metaphorical stretch!  The question is: how far can we go?  And I think the answer depends on the ability of hardware to support the additional “just in time” processing that needs to occur, executing more of the compiler’s tail-end tasks within the execution runtime itself, providing programming languages with greater flexibility and power until the compilation stages we currently execute at design-time almost entirely disappear (to be replaced, perhaps, by new pre-processing tasks.)

I remember my Turbo Pascal compiler running on a 33 MHz processor with 1 MB of RAM, and now my cell phone runs at 620 MHz (with a graphics accelerator) and has gigabytes of memory and storage.  And yet with the state of things today, the inclusion of language-specific compilers within the runtime is still quite infeasible.  In the .NET Framework, there are too many potential languages that people might attempt to include in such a runtime: C#, F#, VB, Boo, IronPython, etc.  Trying to cram all of those compilers into a universal runtime that would fit (and perform well) on a cell phone or other mobile device isn’t yet feasible, which is why we have technologies with approaches like System.Reflection.Emit (on the full .NET Framework), and Mono.Cecil (which works on Compact Framework as well).  These work at the platform-independent CIL level, and so can interpret and generate programs generically, interact with each others’ components, and so on.  One metaprogramming mechanism can therefore be reused across all .NET languages, and this metalinguistic programming trend is being discussed on the C# and other language design teams.
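
As a concrete (if trivial) taste of that CIL-level metaprogramming on the full .NET Framework, here’s a minimal System.Reflection.Emit sketch that generates and executes an add function at runtime:

using System;
using System.Reflection.Emit;

class EmitDemo
{
    static void Main()
    {
        // Define a free-standing method: int Add(int a, int b)
        var add = new DynamicMethod("Add", typeof(int), new[] { typeof(int), typeof(int) });

        ILGenerator il = add.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);   // push a
        il.Emit(OpCodes.Ldarg_1);   // push b
        il.Emit(OpCodes.Add);       // add them
        il.Emit(OpCodes.Ret);       // return the result

        var addFunc = (Func<int, int, int>)add.CreateDelegate(typeof(Func<int, int, int>));
        Console.WriteLine(addFunc(2, 3));   // 5
    }
}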

I’ve just started using Mono.Cecil, chosen because it is cross-platform friendly (and open source).  The API isn’t very intuitive, but because the source is available, and because extension methods can go a long way to making it more accessible, it’s a great option.  The documentation is sparse, and assembly generation has some performance issues, but it’s a work-in-progress with tremendous potential.  If you’re doing any kind of static analysis or have any need to dynamically generate and consume types and assemblies (to get around language limitations, for example), I’d encourage you to check it out.  A comparison of Mono.Cecil to System.Reflection can be found here.  Another library called LinFu, which performs lots of mind-bending magic and actually uses Mono.Cecil, is also worth exploring.
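
For a flavor of the Cecil API, walking an assembly’s types and methods looks something like the sketch below.  Note that the API has shifted between releases, so the exact calls shown (taken from a later version of Mono.Cecil) are an approximation:

using System;
using Mono.Cecil;

class CecilWalk
{
    static void Main()
    {
        // Static analysis without loading the assembly into the runtime.
        AssemblyDefinition assembly = AssemblyDefinition.ReadAssembly("MyLibrary.dll");

        foreach (TypeDefinition type in assembly.MainModule.Types)
            foreach (MethodDefinition method in type.Methods)
                Console.WriteLine("{0}.{1}", type.FullName, method.Name);
    }
}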

VB10 will supposedly be moving to the DLR to become a truly dynamic language, which considering their history of support for late binding, makes a lot of sense.  With a dynamic language person on the C# 4.0 team (Jim Hugunin from IronPython), one wonders if C# won’t eventually go the same route, while keeping its strongly-typed feel and IDE feedback mechanisms.  You might laugh at the idea of C# supporting late binding (dynamic lookup), but this is being planned regardless of the language being static or dynamic.
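
For the curious, late binding in C# ends up looking roughly like this with the dynamic support planned for C# 4.0 (syntax shown as it eventually shipped, so treat it as a forward-looking sketch):

class LateBindingDemo
{
    static void Main()
    {
        dynamic value = "hello";
        System.Console.WriteLine(value.Length);   // member lookup deferred to runtime; prints 5

        value = new[] { 1, 2, 3 };
        System.Console.WriteLine(value.Length);   // same call site, now bound to int[]; prints 3
    }
}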

As the DLR evolves, performance optimizations are being discovered and implemented that may close the gap between pre-compiled and dynamically interpreted languages.  Combine this with manageable concurrent execution, and the advantages we normally attribute to static languages may soon disappear altogether.

The Precipitous Growth of Software System Complexity

We’re truly on the cusp of a precipitous period of growth for software complexity, as an exploding array of devices and diverse platforms around the world connect in an ever-more immersive Internet.  Taking full advantage of parallel and distributed computing environments by solving the challenges of concurrency and coordination, as well as following the trend toward increased integration among software components, is pushing software complexity into new orders of magnitude.  The strategies we come up with for organizing these systems will have to take several key factors into consideration, and we will have to raise the level of abstraction to a point that may be hard for us to imagine with our existing tools and languages.

One aspect that’s clear is the rise of declarative or intention-based syntax, whether represented as XML, Domain Specific Languages (DSLs), attribute decoration, or a suite of new visual modeling editors.  This is in part a consequence of raising the abstraction level, as lower-level libraries are entrusted to solve common problems and take advantage of common opportunities.
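
Attribute decoration is the form most .NET developers have already internalized; a tiny example (the attribute and its consumer here are invented purely for illustration):

using System;

// Declarative metadata: the class states *what* it is; some framework
// decides *how* and *when* to act on that statement.
[AttributeUsage(AttributeTargets.Class)]
class ExportServiceAttribute : Attribute
{
    public string Name { get; set; }
}

[ExportService(Name = "OrderProcessing")]
class OrderProcessor
{
    // implementation elided
}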

Another is the use of Inversion of Control (IoC) containers and dependency injection in component based architectures, thereby standardizing the lifecycle of the application and its components, and providing a common environment or ecosystem for all of its components, as well as introducing a common protocol for component location, creation, access, and disposal.  This level of consistency is valuable for sharing a common understanding of how to troubleshoot software components.  The more predictable a component’s interaction with the rest of the system, the easier it is to debug and modify; conversely, the more unique it and its communication system is, the more disparity there is among components, and the more difficult to understand and modify without introducing errors.  If software is a species and applications are individuals, then components are the cells of a system.
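
Stripped of any particular container, the injection half of that pattern takes only a few lines of C# (all of the names below are hypothetical); what an IoC container standardizes is how object graphs like this get located, constructed, and disposed:

public interface IMessageChannel
{
    void Send(string message);
}

public class SmtpChannel : IMessageChannel
{
    public void Send(string message) { /* send via SMTP */ }
}

public class OrderNotifier
{
    private readonly IMessageChannel _channel;

    // The dependency arrives through the constructor; the component never
    // news-up a concrete channel, so any implementation can be swapped in.
    public OrderNotifier(IMessageChannel channel)
    {
        _channel = channel;
    }

    public void NotifyShipped(string orderId)
    {
        _channel.Send("Order " + orderId + " shipped.");
    }
}

The composition root (or the container) decides which IMessageChannel the notifier receives; the component itself remains ignorant of that choice.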

Even the introduction of functional programming languages into the mainstream over the past couple years is due, in part, to the ability of those languages to provide more declarative support, more syntactic flexibility, and new ways of dealing with concurrency and coordination issues (such as immutable values) and light-weight, ad hoc data structures (tuples).

Balancing the Forces of Coupling, Cohesion, and Modularity

On a fundamental level, the more that components are independent, the less coupled and the more modular and flexible they are.  But the more they can communicate with and are allowed to benefit from each other, the more interdependent they become.  This adds to cohesiveness and synergy, but also stronger coupling to a community of abstractions.

A composition of services has layers and segments of interdependence, and while there are dependencies, these should be dependencies on abstractions (interfaces and not implementations).  Since there will be at least one implementation of each service, and the extensibility exists to build others as needed, dependency is only a liability when the means for fulfilling it are not extensible.  Both sides of a contract need to be fulfilled regardless; service-oriented or component-based designs merely provide a mechanism for each side to implement and fulfill its part of the contract, and ideally the system also provides a discovery mechanism for the service provider to publish its availability for other components to discover and consume it.

If you think about software components as a hierarchy or tree of services, with services of one layer depending on more root services, it’s easy to see how this simplifies the perpetual task of adding new and revising existing functionality.  You’re essentially editing an outline, and you have opportunities to move services around, reorganize dependencies easily, and have many of the details of the software’s complexity absorbed into this easy-to-use outline structure (and its supporting infrastructure).  Systems of arbitrary complexity become feasible, and then relatively routine.  There’s a somewhat steep learning curve to get to this point, but once you’ve crossed it, your opportunities extend endlessly for no additional mental cost.  At least not in terms of how to compose your system out of individual parts.

Absorbing Complexity into Frameworks

The final thing I want to mention is that a rise in overall complexity doesn’t mean that the job of software developers necessarily has to become more difficult than it is currently.  With the proper design of components that abstract away the complexity into reusable frameworks with intuitive interfaces, developers at the business logic level don’t need to be aware of the inner complexity, in the same way that software developers are largely absolved of the responsibility of thinking about the processor’s inner workings.  As we build our technology stack higher and higher, like the famed Tower of Babel, we must make sure that it’s organized and structured in a way to support that upward growth and the load imposed upon it… so it doesn’t come crashing down.

The requirements for building components tomorrow will not be the same as they were yesterday.  As illustrated in this account of the effort involved in a feature change at Microsoft, in the future, we will also want to consider issues such as tool-assisted refactorability (and patterns that frustrate this, such as “magic strings”), and due to an explosion of component libraries, discoverability of types, members, and their use.
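
As a small illustration of the refactorability point, compare a magic-string member reference with an expression-based one that rename refactorings can follow (the Meta helper below is a hypothetical sketch):

using System;
using System.Linq.Expressions;

class Customer
{
    public string LastName { get; set; }
}

static class Meta
{
    // Refactor-hostile: renaming LastName silently breaks this string.
    public const string LastNameColumn = "LastName";

    // Refactor-friendly: the member is referenced through the type system,
    // so tools can find it, rename it, and flag misuse at compile time.
    public static string MemberName<T, TValue>(Expression<Func<T, TValue>> member)
    {
        return ((MemberExpression)member.Body).Member.Name;
    }
}

// usage: Meta.MemberName((Customer c) => c.LastName)   // "LastName"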

A processor can handle any complexity of instruction and data flow.  The trick is in organizing all of this in a way that other developers can understand and work with.

Posted in Compact Framework, Component Based Engineering, Concurrency, Design Patterns, Development Environment, Distributed Architecture, Functional Programming, Mobile Devices, Object Oriented Design, Problem Modeling, Reflection, Service Oriented Architecture, Software Architecture, Visual Studio | 1 Comment »

Bad Ass Development Rig

Posted by Dan Vanderboom on August 23, 2008

[The powerful workstation described in this article is now for sale on eBay! Click here to see!]

The Need For Speed

I’m not a gamer, I’m a developer.  When I’m on my computer for eight to ten hours a day, I’m typically not rendering graphics, but rather writing, compiling, and testing code.  The writing part hardly requires any resources, but compiling code completely pegs out one of the processors on my dual core laptop (a 2.4 GHz Dell Latitude D830).  Parallel compilers exist, but C# in Visual Studio is not one of them, and by the sound of things, won’t be for quite some time.  This means that if I’m going to see a significant performance increase of this critical task, I’m going to need the fastest processor I can get (and overclock).

Compiling code is also disk intensive, especially toward the end of a build when output files are written to disk.  I ran some benchmarks of C# builds (in Visual Studio) of SharpDevelop.  I chose this code base because it’s fairly large, similar to my own solutions, and it’s open source so others can repeat our tests.  We tracked utilization of individual processors, disk I/O, etc.

Why am I so hell bent on compiling code as fast as possible?  Good question.

[photo: motherboard]

Micro Development Cycles

Software development consists of nested cycles.  There are organizational cycles that envelop project cycles that envelop several-week development sprints, and at the smallest level, it really all boils down to executing many micro development cycles every day.  You start with a goal such as fixing a bug or implementing a feature, do some generally-informal design in your head, plan out your work (again, typically in your head), write code for a few minutes, compile and fix until the build succeeds, deploy if necessary, test the changes, and repeat this sequence anywhere from 20 to 50 or more times in a productive day.  If you do test driven development, you write your tests before the functional code itself, but the cycle is otherwise essentially the same.

[diagram: the micro development cycle]

Some of these steps take longer than others, and some of them, like designing or thinking about what code to write and where that logic belongs, are creative in nature and can’t be rushed.  But when we work on larger solutions in Visual Studio (and other tools), the time for tools to perform critical processing (compiling code in this case) can lead to Twiddling Thumb Syndrome (TTS).  This is not only an unfortunate affliction, it’s also one that can cause Tourette-like symptoms, including swearing at one’s computer, at one’s software tools, and banging on things to entertain oneself while waiting for things to finish.  Sometimes, depending on your projects’ interdependencies and other details, build times can shoot up to several minutes (or worse) per build.  For a long time, I was getting build times in the 5-7 minute range, and they grow as solutions become larger.  Repeat this just 20 times (during which your computer is totally unresponsive) and you’ll quickly get the idea that you’re wasting lots of valuable time (two hours a day!).

Clearly this is unacceptable.  Even if my builds only took a minute, all of the aggregated time spent waiting for progress bars of all kinds (not just compiling) can add up to a significant chunk of wasted time.  In Scott Hanselman’s Ultimate Developer Rig article, which played a part in motivating me to build my own Ultimate Developer Rig, Scott hits the nail on the head:

I don’t want to have time to THINK about what it’s doing while I wait. I wait, in aggregate, at least 15 minutes a day, in a thousand tiny cuts of 10 seconds each, for my computer to finish doing something. Not compile-somethings, but I clicked-a-button-and-nothing-happened-oh-it-was-hung-somethings. Unacceptable. 15 minutes a day is 21.6 hours a year – or three full days – wasted.

I think Scott is being too conservative in his estimate.  It’s easy to waste at least 20-30 minutes a day waiting for general sluggishness, and considerably more when waiting for builds of large solutions.  If you have a computer that’s a few years old, it’s probably worse.  Thirty minutes a day is about 125 hours per year (over 3 weeks), and an hour a day is 6 weeks per year.

Flow = Mental Continuity

Look at it from another perspective.  Even if wasted time isn’t an issue, there’s still a matter of maintaining continuity of thought (and execution).  When we have a plan and are ready to act on it, but are held back behind some bottleneck in the process, we risk losing the fluid flow or mental momentum that we’ve built up.  Often, I have a sequence of pretty good ideas or things I’d like to try, but I end up waiting so long for the first step to finish, that by the time the computer is ready, I’ve lost track of my direction or next step.  This isn’t as much of a problem with long-term planning because those goals and steps tend to be written down, perhaps tracked in some kind of Scrum tool.  But when we’re talking about micro development cycles, a post-it note can be too formal (though when I’m waiting a lot, I do use these).  If we could get near-immediate feedback on our coding choices and reduce the wait time between the execution of tasks, we could maintain this flow better and longer, and our work would benefit from the increased mental continuity.

One analogy is that of reading a programming book.  Some of them are 800-1000 pages or more.  When you read one slowly, say a chapter every other week, it takes so long to read that by the time you finish chapter 10, you have a really hard time remembering what chapter 2 was all about.  But if you focus more and read through the same book in a week, then chapter 2 will still be fresh in your mind when you get to chapter 10, and you’ll be much better able to relate and connect ideas between them.  The whole book sticks in your memory better because all of its content is more cohesive in your mind.

Cost Justification

Scott created a nice computer for the price range he was shooting for, but for my own purposes, I wanted to go with something more extreme.  When I started playing with the numbers, I asked myself what the monthly cost would be for a top-of-the-line, $5,000 to $6,000 power machine.  Spread over 3 years, it comes to only $166 per month.  If you consider the proportion of this cost to the salary of a developer, figure out how much all of our unnecessary wasted time is worth, and realize that this is the primary and constantly-used hardware tool of an engineer, I think it’s very easy to justify.  This isn’t some elliptical trainer that’ll get used for two weeks and then spend the next five years in the garage or the storage shed.  This beast of burden will be put to serious work every day, and will make it easier and more pleasant to get work done.  In an age where we don’t even blink an eye at spending $1,000 on comfortable and ergonomic Herman Miller chairs, I think we’re long overdue for software engineers to start equipping themselves with appropriately-powerful computer hardware.

Compare the cost of a great workstation with the tools of other trades (carpentry, plumbing, automotive repair, etc.) and you’ll find that software development shops like to cut corners and go cheap on hardware because it’s possible to do so, not because it makes the most sense and delivers the greatest possible value.  If you’re in a warehouse and need a forklift, there’s no two ways about it.  But computers are commodities, and though they come in all shapes, sizes, and levels of power, the software you need will normally run on the slowest and most sluggish among them.

Welcome to My Home Office

Since my office is going to appear in the background of many of the pictures that follow, I thought I'd give a quick tour.  This is my brain-dump wall, where many of my ideas first take form.

[Photo: the brain-dump wall]

And around the corner from this room is the greatest Jack Daniel's bar in the world, built by Christian Trauth (with a little bit of help from me).

[Photo: the Jack Daniel's bar]

Bad Ass Components

I decided to take a field trip one day and drove from the Milwaukee area, where we live, down to Fry's in Chicago.  It was my first time at a Fry's.  If you've never been to one, just imagine Disney World for computer geeks: they're absolutely huge (about 70,000 square feet of computer parts and other electronics).  I bought almost everything I needed there, having ordered a few parts online beforehand.

Here’s what I picked up:

Intel D5400XS “SkullTrail” Motherboard – $575
Intel Core 2 Extreme Processor (QX9775) – $1510 (Tom's Hardware Review)
  • 3.20 GHz (without overclocking)
  • 1600 MHz FSB
  • 12 MB L2 Cache
ThermalTake Bigwater 760is – $170
  • 2U Bay Drives Liquid Cooling System
Adaptec 5805 RAID Controller – $550
  • 8-Lane PCI Express
  • 512 MB DDR2 Cache
  • Battery Backup
3 Western Digital Velociraptor Hard Drives – $875
  • 900 GB Total
  • 10,000 rpm
  • SATA
8 GB (4 x 2 GB) of PC2-6400 RAM – $400
  • 800 MHz
  • ECC
  • Fully Buffered
GeForce 9800 GTX Video Card – $250
  • PCI Express 2.0
  • SLI Ready
  • 512 MB DDR3
Coolermaster Case – CMStacker 830 SE – $350
  • 1000 Watt Power Supply
  • Lots of Fan Slots
  • Very Modular

Total Damage – $4720

This doesn't include extra fans (I still need to purchase about 11 of them), or the things I already have: a pair of 24-inch monitors, a Logitech G15 gaming keyboard (nice for the extra programmable keys), a mouse, a CD/DVD burner, a media card reader, etc.  (When I calculate the cost at $166 per month over 3 years, it's based on a total price tag of $6,000.)

Building a Bad Ass Development Rig

In the first picture are boxes for the case (and included 1000 Watt power supply), motherboard, video card, memory, and liquid cooling system.  The next two pictures show the motherboard mounted on the motherboard tray, which slides easily into the back of the case.  Notice how huge the video card is (on the right).  It takes up two slots for width, though it doesn’t plug into two slots (I’m not really much of a gamer, so no SLI for me).  The smaller card in the picture on the right is the Adaptec RAID controller.  I chose the slots that I did to maximize airflow; when I first put the graphics card in, it was partially obstructing a fan on the motherboard, so I moved it to the edge.  This blocked a connector on the motherboard, so I ended up moving it again.  Finding the right setup was a matter of trial and error.

[Photos: component boxes; the motherboard mounted on its tray; the video card and RAID controller installed]

Below you can see all the power cables hanging from the case.  They’re wrapped in a strong mesh that keeps the cables bundled together for improved airflow.  On the right, you can see a swinging door with dust filters and empty spaces for four fans (up to 150mm, not included with the case).  Notice the fan on the motherboard tray, and there’s a slot for another one in the roof of the case that you can’t see.  In addition to the fans, the sides, bottom, top, and front all let air pass through for maximum airflow.  The drives on the right are Western Digital Velociraptors: 300 GB and 10,000 rpm.  When set up in a RAID 0 (striping) configuration, they should provide wicked fast disk access, which is important because I’ll be running multiple virtual machines at a time.

[Photos: bundled power cables; the filtered fan door; the Velociraptor drives]

Next you can see the modular drive cage, which is much easier to load with drives when it's removed from the case (a couple of screws on each side).  It's nice that it has a fan on the front.  Overall, I'm very impressed with the case and all the attention to detail that's gone into its design.  It was extremely easy to work with and reach everything.  It's been several years since I've built a desktop computer, and I remember a lot more frustration trying to reach inside of cases.  Notice that when I put the drive cage back in, I installed it higher up.  I couldn't put it any higher because power cables were in the way, but I need a place for a CD/DVD burner and maybe a media reader anyway.  I moved it up because I'll be installing a liquid cooling unit, and its radiator takes up two drive-height units (2U).  If there's any chance of that thing leaking (it better not!), I certainly don't want water dripping down onto my hard drives.

[Photos: the modular drive cage]

Now I’m starting to wire everything up: power and reset switches, power LEDs, USB and Firewire ports, power to the motherboard and video card, power to the hard drives, and the interface cables from the RAID controller to the hard drives.  I start twist-tying the slack on cables and stuff the unused power cables into the ceiling of the case, where they stow away nicely (with more twist ties).  The right-most picture shows some of the other stuff included with the case: an IDE cable bound inside a tubular plastic sheath (for better airflow), SATA cables that I didn’t need because I used the ones that came with the RAID controller, a fan mount for one of the 11 heat sinks on the motherboard (fan not included), and a fun Do-Not-Disturb-style doorknob sign (included with the motherboard) that says “Warning: Noobs Beware.  You will be Pwned.”  And indeed you will be!

[Photos: wiring the case; the accessories included with the case and motherboard]

It's finally time for some liquid cooling action.  With a SkullTrail motherboard designed for overclocking two LGA771 processors, and a single QX9775 Core 2 Extreme quad-core 3.2 GHz processor (to start with), you better believe I'll be overclocking this bad ass machine to its limit!  I've heard rumors that 4.5 GHz is very manageable, and I'm hoping to pull off upwards of 5 GHz, but we'll see how it goes (with a second processor installed, that would total 40 GHz across 8 parallel cores).  So far, the liquid cooling tops all other components in documentation: one user guide and one maintenance guide.  And don't forget that you can't take a sip of the cooling fluid, or eat any of the rock candy packets that come with the hard drives.  I know it's tempting.

[Photo: the liquid cooling kit]

I Hit a Snag

The liquid cooling unit doesn't fit.  It's close: the motherboard is just near enough that the unit would have to hang an inch and a half out of the front of the case, and I briefly debated living with that.  Not good enough.  Back to the drawing board!

[Photo: the cooling unit protruding from the front of the case]

The only way the cooling unit would fit flush in the case was if we cut out one of the aluminum support beams along the top of the case and inserted the cooling unit in the top drive bays.  This would put the liquid above my hard drives, which I was trying to avoid, but I didn't have any choice at this point.  So we jumped in my friend's car and stopped at his place to pick up a Dremel.  Ten minutes later we were in the garage, case stripped down and upside down, cutting away.  You can see the end result in the photo on the right, which turned out very nice.

[Photos: cutting the support beam out of the case]

Finally, the liquid cooling system fits flush in the case.  We noticed that the cooling system had a fan speed control rheostat connected to it on a wire, and thought it would be nice if we could expose that through the case somehow, so we drilled a hole and fed it through the top (near the power and reset buttons).  I found a knob that fit on it from a robotics kit I purchased a few months ago, and it even matched the color of the case.  Bonus!  You can see the new knob in the picture below on the right.

[Photos: the cooling system mounted flush; the new fan speed knob]

Almost ready to boot up!  I’m waiting for the processor to arrive, and expect it any day now.  As soon as that comes in, I’ll be writing the next article in this Bad Ass Development Rig series, and we’ll see how much we can get this bad mamma jamma overclocked (without making it unstable, of course).  After that, I’ll be setting up the virtual machine system, all of my development environments, and then we’ll do some serious benchmarking.

Posted in Development Environment, Hardware, Personal | 16 Comments »

Misadventures in Pursuit of an Immutable Development Virtual Machine

Posted by Dan Vanderboom on May 23, 2008

Problem

Every three to six months, I end up having to rebuild my development computer.  This one machine is not only for development, but also acts as my communications hub (e-mail client, instant messenger, news and blog aggregator), media center, guitar effects and music recording studio, and whatever other roles are needed or desired at the time.  On top of that, I'm constantly installing and testing beta software, technical previews, and other unstable sneak peeks.  After several months of this kind of pounding, it's a wonder the whole system doesn't grind to a complete halt.

This is an expensive and tedious operation, not to mention the time lost to poor performance leading up to it.  It normally takes me a day and a half, sometimes two days, to rebuild my machine with all of the tools I use on a regular or semi-regular basis.  Drivers, SDKs, and applications have to be installed in the correct order; product keys have to be entered, software activated over the Internet and set up the way I want; wireless network access and VPN connections have to be configured; backups have to be made and application data restored once everything is reinstalled; and there is never a good time for all of the downtime.  A developer's environment can be a very complicated place!

If it's not error messages and corruption, it's performance problems that hurt and annoy the most.  There's a profound difference in speed between a freshly built machine and one that's been clogged with half a year or more of miscellaneous gunk.  In Visual Studio, for example, it can mean the difference between a 30-second build and three or four minutes of mindless disk thrashing.

Using an immutable development machine means that any viruses you pick up, any registry or file corruption that occurs (any problems that arise in the state of the machine, really) never get saved, and therefore disappear the next time you start it up.  It is important, however, that everything within the environment is set up just the way you want it.  If you create your image without ever opening Visual Studio, for example, you'll be prompted to choose a settings profile and then have to wait for Visual Studio to configure itself for the first time, every single time you start the environment.

Still, if you invest a little today in establishing a solid environment, the benefits and savings over the next year or two can be well worth the effort.  As I discovered over the past week and a half, there are a number of pitfalls and dangers involved.  If you’ve considered setting up something similar, I hope the lessons I’ve learned will be able to save you much of the trouble that I went through.

Solution

After listening to Security Now! and several other podcasts over the past couple of years about virtual machines and how they're being used in software testing to ensure a consistent starting state, I began thinking of how nice it would be if I could guarantee that my development environment would always remain the same.  If I could get all of my tools installed cleanly to maximize performance and stability, and then freeze the environment that way, while still being able to change the state of my source code files and other development assets, I might never have to rebuild my computer again.  At least, not until it's time to make important software upgrades, but then it would be a controlled update, from which I could easily roll back if necessary.

But how can this be done?  How can I create an immutable development environment and still be able to update my projects and save changes?  By storing mutable state on a separate drive, physical or virtual, that isn't affected by virtual machine snapshots.  It turns out not to be so simple, but it's achievable nonetheless, and I'll explain why and how.
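As an aside, for the virtual-disk flavor of this idea, VMWare has a built-in mechanism: a virtual disk whose mode is set to "independent" is excluded from snapshots, and "persistent" means writes go straight to the disk and survive a revert.  Here's a minimal sketch of the relevant .vmx entries; the device node scsi0:1 and the file name data.vmdk are placeholders, not my actual configuration:

    scsi0:1.present = "TRUE"
    scsi0:1.fileName = "data.vmdk"
    scsi0:1.mode = "independent-persistent"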

If the perfect environment is to be found, it has to meet several more criteria.  First, it has to support, or at least not prevent, the use of multiple monitors.  Second, I'd like to synchronize as much application data as possible across multiple computers.  Third, as I do a lot of mobile device development, I need access to USB and other ports for connecting to devices.

Implementation

For data synchronization across machines, I've been using Microsoft's Mesh.com, which is based on FeedSync and led by Ray Ozzie.  Based on my testing over the past two weeks, it actually works very well.  Though it's missing many of the features you would expect from a mature synchronization platform and toolset, for the purposes of the goals explained in this article it's a pretty good choice, and it has already saved me a lot of time I would otherwise have spent transferring files to external drives and USB keys, or e-mailing myself files and trying to get around size and content limitations.  If this is the first time you've heard of Mesh, make a point of learning more about it, and sign up for the technical preview to give it a test drive!  (It's free.)

Originally I wanted to use Virtual PC to create my development virtual machine; however, as of the 2007 version it still does not support USB devices, so I had to rule it out immediately.  I downloaded a demo of VMWare's Workstation instead, which does support USB and provides a very nice interface and set of tools for manipulating VMs.

The diagram below illustrates the basic system that I’ve created.  You’ll notice (at the bottom) that I have multiple development environments.  I can set these environments up for different companies or software products that each have unique needs or toolsets, and because they’re portable (unlike normal disk images), I can share them with other developers to get them up and running as quickly as possible.

[Diagram: Ideal Development Environment]

Partitioning of the hard drive, or the use of multiple hard drives, is not strictly necessary.  However, as I’m working with a laptop and have only a single disk, I decided to partition it due to the many problems I had setting up all of the software involved.  I rebuilt the machine so many times, it became a real hassle to restore the application data over and over again, so putting it on a separate partition allowed me to reformat and rebuild the primary partition without constantly destroying this data.

My primary partition contains the host operating system, which is Windows XP Professional SP3 in my case.  (If you’re using Vista, be aware that Mesh will not work with UAC (user account control) disabled, which I find both odd and irritating.)  The host OS acts as my communication workstation where I read e-mail, chat over messenger, read blogs and listen to podcasts, surf the Internet and save bookmarks, etc.  I always want access to these functions regardless of the development environment I happen to have fired up at the time.

Mesh is installed only on the host operating system.  Installing it on each virtual machine would mean keeping multiple copies of the same data on the same physical machine, and clearly that isn't desirable.  I considered using VMWare's ESXi server, which allows you to run virtual machines on the bare metal instead of requiring a host operating system, but since I always want my communications (and now synchronization) software running, their Workstation product seemed like an adequate choice.  Which is great, because it's only $189 at the time I'm writing this, as opposed to $495 for ESXi Server.

With the everyday software taken care of in the host OS, the development virtual machines can be set up unencumbered by it, simplifying the VM images and reducing the number of friction points and potential problems that can arise from all of this software interacting on the same machine.  That alone makes this kind of setup worth considering.

Setting up my development VM was actually easier than installing the host OS since I didn’t have to worry about drivers.  VMWare Workstation is very pleasant to use, and as long as the host OS isn’t performing any resource intensive operations (it is normally idle), the virtual machine actually runs at “near native speed” as suggested by VMWare’s website.  The performance hit feels similar to using disk encryption software such as TrueCrypt.  With a 2.4 GHz dual-core laptop, it’s acceptable even by my standards.  I’m planning to start using a quad-core desktop soon, and ultimately that will be a better hardware platform for this setup.

Hiccup in the Plan (Part 1)

The first problem I ran into was in attempting to transfer a virtual machine from one computer to another.  Wanting to set up my new super-environment on a separate computer from my normal day-to-day development machine, I set up VMWare on one computer, and when I thought I had the basics completed, I attempted to transfer the virtual machine to my external hard drive.  Because the virtual disk was stored in files larger than the 4 GB file size limit of FAT32 (the format of my external drive), I was immediately stopped in my tracks.  I tried copying the files over the local network from one computer to the other, but after 30 minutes of copying, Windows kindly informed me that a failure had occurred and the file could not, in fact, be found.  (Ugh.)  Lesson learned: VMWare has an option, when creating a new virtual machine, to break up virtual disks into 2 GB files.  From that point on, I decided not only to use this option, but also to simply build the virtual machines on the actual target computer, just in case.
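For what it's worth, if you've already created a monolithic virtual disk, VMWare Workstation also includes a command-line tool, vmware-vdiskmanager, that can convert it into 2 GB chunks after the fact.  A sketch, with placeholder file names (disk type 1 means a growable disk split into 2 GB files):

    vmware-vdiskmanager -r bigdisk.vmdk -t 1 bigdisk-split.vmdk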

The next trick with VMWare was allowing the virtual machine to share a disk with the host operating system.  My first route was to set up a shared folder.  This is a nice feature that lets you select a folder in your host OS to make visible to the virtual machine, and I thought it would be perfect.  However, Visual Studio (or rather, the .NET runtime's code access security policy) saw it as a non-local location and therefore didn't fully trust it.  In order to get Visual Studio to trust it, you have to change the machine-level security configuration inside the VM.  There are two ways of doing this: the .NET Configuration tool (mscorcfg.msc), which has a nice user interface, and the command-line caspol.exe tool, with lots of confusing options and syntax to get right.
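For reference, the generally documented caspol approach looks something like the lines below.  This is a sketch rather than the exact command I ran: Z: stands in for wherever the shared folder is mapped inside the VM, and caspol.exe lives in the .NET framework directory (for example %WINDIR%\Microsoft.NET\Framework\v2.0.50727):

    rem List the machine-level code groups first, to confirm their labels:
    caspol.exe -m -lg

    rem Grant FullTrust to everything under the mapped share (1.2 is typically
    rem the LocalIntranet zone group; adjust to whatever -lg shows on your machine):
    caspol.exe -m -ag 1.2 -url "file://Z:/*" FullTrust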

Navigating to Administrative Tools, I was stumped to find that I didn't have this nice GUI tool any more.  I've fully converted everything to Visual Studio 2008 and no longer work in 2005, so the last time I built my machine, I ran only the VS2008 install DVD.  I learned the hard way that Microsoft, for reasons unknown, no longer includes this tool with Visual Studio 2008: if I wanted it, I would have to install (first) the .NET 2.0 Redistributable, (second) the .NET 2.0 SDK, and (finally) Visual Studio 2008.  This would mean rebuilding the VM… again.  I tried caspol.exe briefly and finally gave up (the example I found in the forums didn't work), telling myself that I really did want the GUI tool anyway, and that if I was going to set up the perfect development environment, it was worth one more rebuild to get it right.

Several blue screens of death, some puzzling file corruption, and two reinstallations later, I was thinking that the prescribed solution I was attempting wasn’t all it was cracked up to be after all.  Whoever suggested it was messing with me, and was probably crazy anyway.  I did eventually get these components installed and working by simply repeating the procedure, and after using the configuration tool, Visual Studio did seem pretty happy to open the solutions and build them for me.

Until I opened my other solution and tried to build it, that is.  I keep custom controls and designers in a separate solution because of a post-build task that runs and registers the control designers (which is itself an infuriating requirement, but that's for another article).  Whenever I tried building these projects, I would get an error that access was denied to create some kind of resource file.  I looked at the properties of the shared folder and saw that the file system claimed to be HGFS rather than NTFS.  HGFS is VMWare's host-guest file system, a layer that tunnels access to the real underlying storage format; I don't know much else about it, but I wouldn't be surprised if it had something to do with my problem.  Visual Studio does some finicky things, and VMWare certainly does its share of hocus pocus.  Figuring out the possible interaction between them was going to be beyond my voodoo abilities and resources, so I had to find another way around this shared disk situation if I planned on developing custom controls in this environment.

Hiccup in the Plan (Part 2)

My Dell Latitude D830 is four months old.  I requested a Vostro but was flatly refused by the company I work for, who shall remain nameless.  Supposedly the Latitudes are a "known quantity", will have fewer problems, and are therefore better for support reasons.  Regardless, the D830 is for the most part a good, fast machine.  This particular one, however, became a monster this past week while I was trying to get the new setup working, costing me a full week of lost time and a great deal of frustration.  Every time I thought I had isolated the cause, some other misbehavior appeared to confuse matters.  Each step of troubleshooting and repair seemed reasonable at the time, and yet as new symptoms emerged, the dangling carrot moved just beyond my reach: my own modern reenactment of Sisyphus's repeated torment.

[Screenshot: Mesh corruption]

Not only was I getting Blue Screens of Death several times a day, but CHKDSK would start up before Windows and all kinds of disk problems would be discovered, such as orphaned files and bad indexes.  Furthermore, the same things were happening with the virtual disks, and VMWare reported fragmentation on those disks immediately after installing the operating system and a few tools.  There were folders and files I couldn’t rename or delete, and running the Dell diagnostics software turned up nothing at all.

Since we had a second D830 laptop on hand, the Dell tech finally suggested swapping hard drives.  After installing my whole environment, plus the VMs, about a dozen times, I really didn't feel like going through it yet again, but it seemed like a reasonable course of action, and so I went through the process once more.  I got almost all the way through without a problem, finally rebooted my VM (with a smile) and waited for it to come back up, only to see CHKDSK run and find many pages of problems once again.

Warning: The great thing about Mesh is that you can make changes to your files, such as source code, recompile, and all of those changes shoot up into the cloud in a magical dance of synchronization, and those changes get pushed down to all the other computers in your mesh.  What’s scary, though, is when you have a hard drive with physical defects that corrupts your files.  Those corruptions also get pushed up to the cloud, and this magically corrupts the data on all of the computers in your mesh.  So be aware of this!

The Value of Offline Backups

Make backups.  Check your code into version control.  Mesh is a great tool for synchronizing data, and initially I thought this would be sufficient for backups of at least some of my data, but it falls short in several ways.

First, Mesh doesn't keep versions of your files.  Once you make a change and connect to the Internet, everything gets updated immediately.  If data is accidentally deleted or corrupted, those operations replicate to the cloud and to everywhere else in your mesh.  Versioned backups, as snapshots in time, are the only way to ensure that you can recover historical state if things go awry, as they did for me.

Second, Mesh is great for synchronizing smaller, discrete files that either aren’t supplemented with metadata, or whose metadata also exists in files within the same folder structure and which also gets synchronized.  By the latter, I mean systems such as Visual Studio projects and files: the files are referenced by project files, and project files referenced by solution files, but these are all small, discrete files themselves that can also be seamlessly synchronized.  When I add a file to a project and save, Mesh will update the added file as well as the project file.

Application data that doesn't synchronize well includes any kind of monolithic data store, such as a SQL Server database or an Outlook (.pst) data file.  Every time you got an e-mail and your .pst file changed, the whole file would be sent up to Mesh, and if your e-mail files get as large as mine, that would be a problem.  Hopefully, in the future, plug-ins will be developed that can intelligently synchronize this kind of data through Mesh as well.

I’m using and highly recommend Acronis TrueImage for backups.  It really is a modern, first-rate backup solution.

Conclusion

In the end, Dell came and replaced my motherboard, hard drive, and DVD-RW drive (separate problem!), and I was able to get back to building my immutable development environment.  Instead of using shared folders, VMWare lets you add a hard drive to the VM that is actually a physical disk, or a partition of one.  Unfortunately, VMWare doesn't let you take a snapshot of a virtual machine that has such a physical disk mounted.  I don't know why, and I'm sure there's a reason, but the situation does suck.  The way I've gotten around it is to finish setting up my environment without the additional disk mounted, take a snapshot, and then add the physical disk.  I'll run with it set up that way for a day or two, allowing state to change, and then I'll remove the physical disk from the virtual machine, revert to the latest snapshot, and add the physical disk back in to start it up again.  This back-and-forth juggling of detaching and attaching the physical disk is less than ideal, but ultimately not as bad as the alternative of not having an immutable environment, and I haven't had the last word quite yet.
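Some of this tedium can be scripted with the vmrun utility that ships with VMWare Workstation, at least for the snapshot-and-revert half of the cycle.  A sketch follows; the VM path and snapshot name are placeholders, and attaching or detaching the physical disk still happens in the Workstation UI (or by editing the .vmx):

    rem Take the baseline snapshot once, right after the clean setup (physical disk detached):
    vmrun -T ws snapshot "C:\VMs\DevEnv\DevEnv.vmx" CleanBaseline

    rem Later, to reset the environment: shut the VM down, revert, and start it fresh.
    vmrun -T ws stop "C:\VMs\DevEnv\DevEnv.vmx" soft
    vmrun -T ws revertToSnapshot "C:\VMs\DevEnv\DevEnv.vmx" CleanBaseline
    vmrun -T ws start "C:\VMs\DevEnv\DevEnv.vmx"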

I’ll continue to research and experiment with different options, and will work with VMWare (and perhaps Xen) to come up with the best possible arrangement.  And what I learn I will continue to share with you.

Posted in Custom Controls, Development Environment, Uncategorized, Virtualization, Visual Studio | 5 Comments »