Critical Development

Language design, framework development, UI design, robotics and more.

Archive for the ‘Cloud Computing’ Category

Reimagining the IDE

Posted by Dan Vanderboom on May 31, 2010

Overview

After working in Visual Studio for the past decade, I’ve accumulated a broad spectrum of ideas on how the experience could be better.  From microscopic features like “I want to filter Intellisense member lists by member type” to the recognition of larger patterns of conceptual organization and comprehension, there aren’t many corners of the IDE that couldn’t be improved with additional features or, in some cases, a redesign.

To put things in perspective, consider how the Windows Mobile platform languished for years and became stale (or merely “good enough”) until the iPhone changed the game and raised the bar on quality to a whole new level.  It wasn’t until fierce competition stole significant market share that Microsoft scrapped the Windows Mobile platform entirely and started fresh with a complete redesign called Windows Phone 7.  This is one of the smartest things Microsoft has done in a long time.

After many years of incremental evolution, it’s often necessary to rethink, reimagine, and occasionally even start from scratch in order to make the next revolutionary jump forward.

Visual Studio Focus

Integrated Development Environments have been with us for at least the past decade.  Whether you work in Visual Studio, Eclipse, NetBeans, or another tool, there is tremendous overlap in the set of panels available, the flexible layout of those panels, saved workspaces, and add-in infrastructure to make as much as possible extensible.  I’ll focus on Visual Studio for my examples and explanations since that’s the IDE I’m most familiar with, but there are parallels to other IDEs for much of what I’m going to cover.

Visual Components & Flexible Layout

Visual layout is one thing that IDEs do right.  Instead of a monolithic UI, it’s broken down into individual components such as panels, toolbars, toolboxes, main menus and context menus, code editors, designers, and more.  These components can be laid out at runtime with intuitive drag-and-drop operations that visually suggest the end result.

The panels of an IDE can be docked to any edge of another panel, they can be laid on top of another panel to create tab controls, and adjacent panels can be relatively resized with splitters that appear between panels.  After many years of refinement, it’s hard to imagine a better layout system than this.

The ability to save layouts as workspaces in Expression Blend is a particularly nice feature.  It would be nicer still if the user could define triggers for these workspaces, such as “change layout to the UI Designer workspace when the XAML or Windows Forms designers are opened”.
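To make the idea concrete, here is a minimal sketch of what declaring such a trigger might look like.  Everything in it is hypothetical: neither Blend nor Visual Studio exposes a WorkspaceTrigger or OpenedDocument type today, so the names and shape of the API are invented purely for illustration.

using System;
using System.Collections.Generic;

// Hypothetical sketch only; these types don't exist in any current IDE API.
// A trigger names a saved workspace and a condition that activates it.
public class OpenedDocument
{
    public string FileName { get; set; }
    public bool IsDesignerView { get; set; }
}

public class WorkspaceTrigger
{
    public string WorkspaceName { get; set; }
    public Func<OpenedDocument, bool> Condition { get; set; }
}

public static class ExampleTriggers
{
    // Switch to the "UI Designer" workspace whenever a XAML or Windows Forms
    // designer surface is opened.
    public static readonly List<WorkspaceTrigger> All = new List<WorkspaceTrigger>
    {
        new WorkspaceTrigger
        {
            WorkspaceName = "UI Designer",
            Condition = doc => doc.IsDesignerView &&
                (doc.FileName.EndsWith(".xaml") || doc.FileName.EndsWith(".Designer.cs"))
        }
    };
}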

IDE Hosting

Visual Studio and other development tools have traditionally been desktop applications.  In Silverlight 4, however, we now have a framework sufficiently powerful to build a respectable cross-platform IDE.

With features such as off-line, out-of-browser execution, full screen mode, custom context menus, and trusted access to the local file system, it’s now possible for a great IDE to be built and run on Windows, Mac OS X, or Linux, and to allow a developer to access the IDE and their solutions from any computer with a browser (and the Silverlight plug-in).

There are already programming editors and compilers in the cloud.  In episode 562 of .NET Rocks on teaching programming to kids, their guests point out that a subset of the Small Basic IDE is available in Silverlight.  For those looking to build programming editors, ActiPro has a SyntaxEditor control in WPF that they’re currently porting to Silverlight (for which they report seeing a lot of demand).

Ideally such an IDE would be free, or would have a free version available, but for those of us who need high-end tools and professional-level feature sets, imagine how nice it would be to pay a monthly fee for access to an ever-evolving IDE service instead of having to cough up $1,100 or $5,500 (or more) every couple of years.  Not only would costs be conveniently amortized over the span of the tool’s use, but all of your personal preferences would be easily synchronized across all computers that you use to work in that IDE.

With cloud computing services such as Windows Azure, it would even be possible to off-load compilation of large solutions to the cloud.  Builds that took 30 minutes could be cut down to a few minutes or less by parallelizing build tasks across multiple cores and servers.
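As a rough illustration of how that parallelism could work, projects that don’t depend on each other can compile concurrently, level by level.  The grouping into dependency levels and the shelling out to msbuild are stand-ins, not a real cloud build API:

using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Sketch only: dependencyLevels groups projects so that everything in one level
// depends only on earlier levels.  Levels build in order; projects within a
// level build in parallel, locally or fanned out to cloud worker instances.
public static class ParallelBuild
{
    public static void Build(IEnumerable<IList<string>> dependencyLevels)
    {
        foreach (var level in dependencyLevels)
        {
            Parallel.ForEach(level, BuildProject);
        }
    }

    static void BuildProject(string projectPath)
    {
        // Locally this just shells out to msbuild; in a cloud scenario it would
        // instead queue the compile on a worker instance (for example, via a
        // Windows Azure queue) and wait for the result.
        var msbuild = Process.Start("msbuild", "\"" + projectPath + "\"");
        msbuild.WaitForExit();
    }
}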

The era of cloud development tools is upon us.

Solution Explorer & The Project System

Solution Explorer is one of the most useful and important panels in Visual Studio.  It provides us with an organizational tool for all the assets in our solution, and provides a window into the project system on which core behaviors such as builds are based.  It is through the Solution Explorer that we typically add or remove files, and gain access to visual designers and the ever-present code editor.

In many ways, however, Solution Explorer and the project system it represents are built on an old and tired design that hasn’t evolved much since its introduction over ten years ago.

For example, it still isn’t possible to “add existing folder” and have that folder and all of its contents pulled into a project.  If you’ve ever had to rebuild a project file and pull in a large number of files organized in many nested folders, you have a good idea of how painful an effort this can be.
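Until the IDE supports this directly, one workaround is to script the project file itself: a .csproj is just MSBuild XML, and each source file is a <Compile> item.  The sketch below uses example paths of my own invention and should be run against a backed-up copy of the project file.

using System;
using System.IO;
using System.Xml.Linq;

// Workaround sketch: add every .cs file under an existing folder to a project
// as <Compile> items.  Paths are examples only; back up the .csproj first.
class AddExistingFolder
{
    static void Main()
    {
        string projectPath = @"C:\Source\MyApp\MyApp.csproj";   // example path
        string folderToAdd = @"C:\Source\MyApp\Imported";       // example path
        string projectDir = Path.GetDirectoryName(projectPath);

        XNamespace ns = "http://schemas.microsoft.com/developer/msbuild/2003";
        XDocument project = XDocument.Load(projectPath);
        XElement itemGroup = new XElement(ns + "ItemGroup");

        foreach (string file in Directory.GetFiles(folderToAdd, "*.cs", SearchOption.AllDirectories))
        {
            // Item Include paths are stored relative to the project file.
            string relative = file.Substring(projectDir.Length + 1);
            itemGroup.Add(new XElement(ns + "Compile", new XAttribute("Include", relative)));
        }

        project.Root.Add(itemGroup);
        project.Save(projectPath);
    }
}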

If you’ve ever tried sharing the same code across multiple incompatible platforms, between Full and Compact Framework, or between Silverlight 3 and Full Framework, you’ve likely run into kludgey workarounds like placing multiple project files in the same folder and including the same set of files, or using a tool like Project Linker.

Reference management can also be unwieldy when you have many projects and references.  How do you ensure you’re not accidentally referencing two different versions of the same assembly from two different projects?  My article on Project Reference Oddness in VS2008, which explores the mysterious and indirect ways references work, is by far one of my most popular articles.  I’m guessing that’s because so many people can relate to the complexity and confusion of managing these dependencies.
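A small audit script can at least surface the problem.  The sketch below (the solution folder path is an example) walks every project file and reports any assembly that is referenced with more than one version:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Sketch: report assemblies referenced with more than one version across all
// projects under a solution folder.  The folder path is an example.
class ReferenceAudit
{
    static void Main()
    {
        XNamespace ns = "http://schemas.microsoft.com/developer/msbuild/2003";
        var versionsByAssembly = new Dictionary<string, HashSet<string>>();

        foreach (string csproj in Directory.GetFiles(@"C:\Source\MySolution", "*.csproj", SearchOption.AllDirectories))
        {
            foreach (var reference in XDocument.Load(csproj).Descendants(ns + "Reference"))
            {
                // Include looks like "Some.Assembly, Version=1.2.0.0, Culture=neutral, ..."
                string[] parts = ((string)reference.Attribute("Include")).Split(',');
                string name = parts[0].Trim();
                string version = parts.Skip(1).Select(p => p.Trim())
                    .FirstOrDefault(p => p.StartsWith("Version=")) ?? "(unspecified)";

                if (!versionsByAssembly.ContainsKey(name))
                    versionsByAssembly[name] = new HashSet<string>();
                versionsByAssembly[name].Add(version);
            }
        }

        foreach (var pair in versionsByAssembly.Where(p => p.Value.Count > 1))
            Console.WriteLine("{0} is referenced as: {1}", pair.Key, string.Join(", ", pair.Value));
    }
}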

“Projects” Are Conceptually Overloaded: Violating the Single Responsibility Principle

In perhaps the most important example, consider how multiple projects are packaged for deployment, such as what happens when you start a debugging session.  Which assemblies and other files are copied to the output directory before the program is executed?  The answer, discussed in my Project Reference Oddness article, is that it depends.  Files that are added to a project as “Content” don’t even become part of the assembly: they’re just passed through as a deployment command.

So what exactly is a Visual Studio “project”?  It’s all of these things:

  • A set of source code files that will get compiled, producing an assembly.
  • A set of files that get embedded in the resulting assembly as resources.
  • A set of deployment commands for loose files.
  • A set of deployment commands for referenced assemblies.

If a Visual Studio project were a class definition, we’d say it violated the Single Responsibility Principle.  It’s trying to be too many things: both a definition for an assembly as well as a set of deployment commands.  It’s this last goal that leads to all the confusion over references and deployment.

Let’s examine the reason for this.

A deployment definition is something that can span not only multiple assemblies, but also additional loose files.  In order to debug my application, I need assemblies A, B, and C, as well as some loose files, to be copied to the output directory.  Because there is no room for the deployment definition in the hierarchy visualized by Solution Explorer, however, I must somehow encode that information within the project definitions themselves.

If assembly A references B, then Visual Studio infers that the output of B needs to be copied to A’s output directory when A is built.  Since B references C, we can infer that the output of C needs to be copied to B’s output directory when B is built.  Indirectly, then, C’s output will get dumped in A’s output directory, along with B’s output.

What you end up with is a pipeline of files that shuffles things along from C to B to A.  Hopefully, if all the reference properties are set correctly, this works as intended and the result is good.  But the logic behind all of this is an implicit black box.  There’s no transparency, so when things get complicated and something goes wrong, it can become impossible to figure it out in a reasonable amount of time (try reading through verbose build output sometime).
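To see the implicit rule spelled out, here is a small sketch that computes the transitive reference closure, which is exactly the set of outputs that land in A’s output directory when Copy Local is in effect.  The reference map is hand-written to mirror the A, B, C example above:

using System;
using System.Collections.Generic;

// Sketch: with Copy Local behavior, the files landing in a project's output
// folder are the outputs of its transitive reference closure.
class CopyLocalClosure
{
    static void Main()
    {
        var references = new Dictionary<string, string[]>
        {
            { "A", new[] { "B" } },
            { "B", new[] { "C" } },
            { "C", new string[0] }
        };

        var copied = new HashSet<string>();
        var pending = new Stack<string>(references["A"]);
        while (pending.Count > 0)
        {
            string project = pending.Pop();
            if (copied.Add(project))
                foreach (string dependency in references[project])
                    pending.Push(dependency);
        }

        // Prints B and C (in some order): both outputs get dumped in A's output directory.
        Console.WriteLine(string.Join(", ", copied));
    }
}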

At one point, just before writing the article on references mentioned above, I was spending 10 hours or more a week just fighting with reference dependencies.  It was a huge mess, and a very expensive way to accomplish absolutely nothing in terms of providing value to customers.

Deployments & Assemblies

Considering our new perspective on the importance of representing deployments as first-class organizational items in solutions, let’s take a look at what that might look like in an IDE.  Focus on the top-left of the screenshot below.

[Screenshot: a proposed Solution Explorer in its Deployment solution view, with “Silverlight Client” and “Cloud Services” areas at the top-left and a code editor on the right]

The first-level items in darker text (“Silverlight Client” and “Cloud Services”) are equivalent to “solution folders” in Visual Studio.  They’re labels that can be nested like folders for organizational purposes.  Within each of these areas is a collection of Deployment definitions.  The expanded deployment is for the Shell of our Silverlight application.  The only child of this deployment is a location.

In a desktop application, you might have multiple deployment locations, such as $AppDir$, $AppDir$\Data, or $UserDir$\AppName, each with child nodes representing content to be deployed to those locations.  In Silverlight, however, it doesn’t make sense to deploy to a specific folder since that’s abstracted away from you.  So for this example, the destination is Shell.XAP.

You’ll notice that multiple assemblies are listed.  If this were a web application, you might have a number of loose files as well, such as default.aspx or web.config.  If such files were listed under that deployment, you could double-click one to open and edit it in the editor on the right-hand side of the screen.

The nice thing about this setup is the complete transparency: if a file is listed in a deployment path, you know it will be copied to the output directory before debugging begins.  If it’s not listed, it won’t get deployed.  It’s that simple.

The next question you might have is: doesn’t this mean that I have a lot of extra work to manually add each of these assembly files?  Especially when it comes to including the necessary references, nobody wants the additional burden of having to manually drag every needed reference into a deployment definition.

This is pretty easy to deal with.  When you add a reference to an assembly, and that referenced assembly isn’t in the .NET Framework (those are accessed via the GAC and therefore don’t need to be included), the IDE can add that assembly to the deployment definition for you.  Additionally, it would be helpful if all referenced assemblies lit up (with a secondary highlight color) when a referencing assembly was selected in the list.  That way, you’d be able to quickly figure out why each assembly was included in that deployment.  And if you select an assembly that requires a missing assembly, the name of any missing assemblies should appear in a general status area.

What we end up with is a more explicit and transparent way of dealing with deployment definitions separately from assembly definitions, a clean separation of concepts, and direct control over deployment behavior.  Because deployment intent is specified explicitly, this would be a great starting point for installer technologies to plug into the IDE.
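To make the proposed separation concrete, here is a rough object model for it.  All of these type and property names are invented for the sketch; nothing like this exists in Visual Studio today.  The essential point is that a Deployment owns locations and the items copied to them, and it merely references assembly definitions rather than absorbing their responsibilities:

using System.Collections.Generic;

// Illustrative model of the proposed solution hierarchy; all names are made up.
public class AssemblyDefinition
{
    public string Name;                      // e.g. "MyClient.dll"
    public List<string> SourceFiles = new List<string>();
    public List<string> EmbeddedResources = new List<string>();
}

public abstract class DeploymentItem { }

public class AssemblyItem : DeploymentItem
{
    public AssemblyDefinition Assembly;      // a reference to an assembly definition
}

public class LooseFileItem : DeploymentItem
{
    public string SourcePath;                // e.g. "web.config"
}

public class DeploymentLocation
{
    public string Path;                      // e.g. "$AppDir$" or "Shell.XAP"
    public List<DeploymentItem> Items = new List<DeploymentItem>();
}

public class Deployment
{
    public string Name;                      // e.g. "Silverlight Client"
    public List<DeploymentLocation> Locations = new List<DeploymentLocation>();
}

public class Solution
{
    public List<Deployment> Deployments = new List<Deployment>();
    public List<AssemblyDefinition> Assemblies = new List<AssemblyDefinition>();
}

With a model like this, the convenience described above (automatically adding a non-Framework referenced assembly to the deployment) amounts to appending an AssemblyItem to the appropriate DeploymentLocation.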

In Visual Studio, a project maps many inputs to many outputs, and confuses deployment and assembly definitions.  A Visual Studio “project” is essentially an “input” concept.  In the approach I’ve outlined here, all definitions are “output” concepts; in other words, items in the proposed solution hierarchy are defined in terms of intended results.  It’s always a good idea to “begin with the end in mind” this way.

Multiple Solution Views

In the screenshot above, you’ll notice there’s a dropdown list called Solution View.  The current view is Deployment; the other option is Assembly.  The reason I’ve included two views is that the same assembly may appear in multiple deployments.  If what you want is a list of unique assemblies, that alternative view should be available.

A New Template System

The other redesign that’s required involves Visual Studio’s templates.  Instead of the solution, project, and project item templates we have today, you would have four template types: solution, deployment, assembly, and file.  Consider these examples:

Deployment Template: ASP.NET Web Application

  • $AppDir$
    • Assembly: MyWebApp.dll
      • App.xaml.cs
      • App.xaml    (embedded resource)
      • Main.xaml.cs
      • Main.xaml   (embedded resource)
    • File: Default.aspx
    • File: Web.config
    • Folder: App_Data
      • File: SampleData.dat

Solution Template: Silverlight Solution

  • Deployment: Silverlight Client
    • MySLApp.XAP
      • Assembly: MyClient.dll
        • App.xaml.cs
        • App.xaml    (embedded resource)
        • Main.xaml.cs
        • Main.xaml   (embedded resource)
  • Deployment: ASP.NET Web Application
    • $AppDir$
      • Assembly: MyWebApp.dll
        • YouGetTheIdea.cs
      • Folder: ClientBin
        • MySLApp.XAP (auto-copied from Deployment above)
      • File: Default.aspx
      • File: Web.config

Summary

In this article, we explored several features in modern IDEs (Visual Studio specifically), and some of the ways in which imaginative rethinking could bring substantial improvements to the developer experience.  I have to wonder how quickly a large ship like Visual Studio (with 1.5 million lines of mostly C++ code) could turn and adapt to new ideas like this, or whether it makes sense to start fresh without all the burden of legacy.

Though I have many more ideas to share, especially regarding the build system, multiple-language name resolution and refactoring, and IDE REPL tools, I will save all of that for future articles.

Posted in Cloud Computing, Development Environment, Silverlight, User Interface Design, Visual Studio, Windows Azure

Cloud Slam ‘09 Conference

Posted by Dan Vanderboom on April 14, 2009

If you’re interested in Cloud Computing, you should consider signing up for Cloud Slam, a very inexpensive five-day virtual conference.  You can attend from the comfort of your home (or local wine or coffee shop), and have access to about 100 hours of sessions for only $52.  It runs from April 20 to 24.

Speakers include such industry leaders as Stephen Herrod, CTO of VMware; Simon Crosby, CTO of Citrix Systems; Werner Vogels, CTO of Amazon.com; and many more.

I can’t say I’ll see you there, but I’m definitely looking forward to it.  It should be a great source of information for what industry leaders are thinking and where cloud computing is headed.

Posted in Cloud Computing, Conferences

Windows Azure: Blobs and Blocks

Posted by Dan Vanderboom on February 21, 2009

I’ve been busy building a new cloud-based service for the past few weeks, using Windows Azure on the back end and Silverlight for the client.  One of the requirements of my service is to allow users to upload files to a highly scalable Internet storage system.  I’m experimenting with Azure’s blob storage for this, and I have a need to upload these blobs (Binary Large OBjects) in separate blocks.  There are two reasons I can think of for doing this:

  1. Although blobs can be as large as 2 GB in the current technical preview, the largest blob you can put in one operation is 4 MB.  If your file is larger, you have to store separate blocks, and then put a block list to assemble them together and commit them as a blob.
  2. If you want different users to upload different portions of a file, each user will have to upload individual blocks, and you’ll have to put the block list when all blocks are present.  This is something like a reverse BitTorrent or other P2P protocol.

My service needs to deal with separate blocks for the second reason, though the first is likely to be much more common.

Although there’s a good deal of information about blocks and blobs in the REST API for Azure Storage Services, piecing together code to make REST calls with all the appropriate headers (including authentication signatures) isn’t very fun.  Where is the .NET library to make it easy?

There is one, in fact.  If you’ve downloaded and installed the Azure SDK (Jan 2009), you’ll find a samples.zip file that needs to be unzipped and its solutions built.  In particular, you’ll need the StorageClient solution.  In it, you’ll find that you can save and load blobs (as well as use queues and table storage), but there’s nothing in the API that suggests it supports putting individual blocks, let alone putting block lists to combine all of those blocks into a blob.  The raw state of this API is unfortunate, but the Azure platform is in an early tech preview stage, so we can expect vast improvements in the future.

Until then, however, I dug into it and discovered that there actually was code to put blocks and commit block lists, but it wasn’t exposed in the API (in BlobContainerRest.PutLargeBlobImpl).  Rather, it was called only when the blob being put exceeded the 4 MB limit.  Taking this code and hacking it a bit, I extended the StorageClient library to provide this needed functionality.

First, add these abstract method definitions to the BlobContainer class (in BlobStorage.cs):

public abstract bool PutBlobBlockList(BlobProperties blobProperties, 
    IEnumerable<string> BlockIDs, bool overwrite, string eTag);

public abstract bool PutBlobBlock(BlobProperties blobProperties, string BlockID, 
    Stream stream, long BlockSize, bool overwrite, string eTag);

Next, you’ll need to add the implementations to the BlobContainerRest class (in RestBlobStorage.cs):

public override bool PutBlobBlock(BlobProperties blobProperties, string BlockID, 
    Stream stream, long BlockSize, bool overwrite, string eTag)
{
    // comp=block marks the request as a block upload; the block ID is passed
    // base64-encoded in the query string.
    NameValueCollection nvc = new NameValueCollection();
    nvc.Add(QueryParams.QueryParamComp, CompConstants.Block);
    nvc.Add(QueryParams.QueryParamBlockId, 
        Convert.ToBase64String(Encoding.Unicode.GetBytes(BlockID)));
    return UploadData(blobProperties, stream, BlockSize, overwrite, eTag, nvc);
}

public override bool PutBlobBlockList(BlobProperties blobProperties, 
    IEnumerable<string> BlockIDs, bool overwrite, string eTag)
{
    bool retval = false;

    using (MemoryStream buffer = new MemoryStream())
    {
        // Build the <BlockList> XML document naming each block (base64-encoded)
        // in the order the blocks should be assembled into the blob.
        XmlTextWriter writer = new XmlTextWriter(buffer, Encoding.UTF8);
        writer.WriteStartDocument();
        writer.WriteStartElement(XmlElementNames.BlockList);
        foreach (string id in BlockIDs)
        {
            writer.WriteElementString(XmlElementNames.Block, 
                Convert.ToBase64String(Encoding.Unicode.GetBytes(id)));
        }
        writer.WriteEndElement();
        writer.WriteEndDocument();
        writer.Flush();
        buffer.Position = 0; //Rewind

        // comp=blocklist commits the listed blocks as the final blob.
        NameValueCollection nvc = new NameValueCollection();
        nvc.Add(QueryParams.QueryParamComp, CompConstants.BlockList);

        retval = UploadData(blobProperties, buffer, buffer.Length, overwrite, eTag, nvc);
    }

    return retval;
}

In order to test this, I added two buttons to an ASP.NET page, one to upload the blocks and put the block list, and a second to read the blob back to verify the write operations worked:

protected void btnUploadBlobBlocks_Click(object sender, EventArgs e)
{
    var account = new StorageAccountInfo(new Uri("http://127.0.0.1:10000/"), null, "devstoreaccount1", 
        "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==");
    var storage = BlobStorage.Create(account);
    var container = storage.GetBlobContainer("testfiles");

    if (!container.DoesContainerExist())
        container.CreateContainer();

    var properties = new BlobProperties("TestBlob");

    // put block 0

    var ms = new MemoryStream();
    using (StreamWriter sw = new StreamWriter(ms))
    {
        sw.Write("This is block 0.");
        sw.Flush();
        ms.Position = 0;

        var PutBlock0Success = container.PutBlobBlock(properties, "block 0", ms, ms.Length, true, null);
    }

    // put block 1

    ms = new MemoryStream();
    using (StreamWriter sw = new StreamWriter(ms))
    {
        sw.WriteLine("... and this is block 1.");
        sw.Flush();
        ms.Position = 0;

        var PutBlock1Success = container.PutBlobBlock(properties, "block 1", ms, ms.Length, true, null);
    }

    // put block list

    List<string> BlockIDs = new List<string>();
    BlockIDs.Add("block 0");
    BlockIDs.Add("block 1");

    var PutBlockListSuccess = container.PutBlobBlockList(properties, BlockIDs, true, null);
}

protected void btnTestReadBlob_Click(object sender, EventArgs e)
{
    var account = new StorageAccountInfo(new Uri("http://127.0.0.1:10000/"), null, "devstoreaccount1",
        "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==");
    var storage = BlobStorage.Create(account);
    var container = storage.GetBlobContainer("testfiles");

    MemoryStream ms = new MemoryStream();
    BlobContents contents = new BlobContents(ms);
    container.GetBlob("TestBlob", contents, false);
    ms.Position = 0;

    using (var sr = new StreamReader(ms))
    {
        string x = sr.ReadToEnd();
        sr.Close();
    }
}

It’s nothing fancy, but if you put a breakpoint on the final sr.Close() call, you’ll see that the value of x contains both blocks of data, equal to “This is block 0…. and this is block 1.”
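The more common scenario from reason #1 above (a file larger than 4 MB) can be handled with the same two methods.  Here’s a rough sketch using the PutBlobBlock and PutBlobBlockList extensions added above; the method name and chunking logic are just one way to do it, and it assumes the containing page already has the usual System.IO and System.Collections.Generic usings:

// Sketch: split a large stream into 4 MB blocks, put each block, then commit
// the block list in order.  Uses the PutBlobBlock/PutBlobBlockList methods
// added to StorageClient above.
protected void UploadLargeBlob(BlobContainer container, string blobName, Stream source)
{
    const int BlockSize = 4 * 1024 * 1024;
    var properties = new BlobProperties(blobName);
    var blockIds = new List<string>();
    var buffer = new byte[BlockSize];
    int blockNumber = 0;
    int bytesRead;

    // Read may return fewer bytes than requested; for this sketch each read
    // becomes one block.
    while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
    {
        string blockId = string.Format("block {0:D6}", blockNumber++);
        blockIds.Add(blockId);

        using (var blockStream = new MemoryStream(buffer, 0, bytesRead))
        {
            container.PutBlobBlock(properties, blockId, blockStream, bytesRead, true, null);
        }
    }

    // Committing the block list assembles the blocks, in order, into the blob.
    container.PutBlobBlockList(properties, blockIds, true, null);
}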

Posted in Cloud Computing, Design Patterns, Windows Azure