Why Oslo is Important
Posted by Dan Vanderboom on January 17, 2009
Contrary to common misunderstanding and speculation, the point of Oslo is not to put programming in the hands of business analysts who want to write their own business rules. Do I think some of that will happen? Architects and engineers will try everything they can imagine. Some of them will succeed in specific niches or scenarios, but it won’t replace application or system design, and it will probably be very limited for the foreseeable future. Oslo is more about dramatically improving the productivity of designers and developers by generalizing common solution patterns and generating more adaptable tools.
Much of the confusion around Oslo occurs for two reasons:
- Oslo is designed at a higher level of abstraction than most systems today, so its scope is broad and it will have an impact on virtually every product, solution and service across Microsoft. It’s difficult to get your head around something that big.
- Because of its abstract nature, core concepts are defined in terms that are heavily overloaded, like "Model", "Repository", and "Language". Once you’ve picked up the lingo and can translate Oslo terminology into language you’re already familiar with, both the concept and magnitude of it will become obvious.
Oslo isn’t something completely new; in fact, Oslo borrows from a lot of previous research and even existing model-driven development tools. Oslo focuses existing technologies and techniques into a coherent and mature vision of development, combining all parts into a more powerful whole, and promises to deliver a supremely adaptable and efficient platform to develop on.
What Is Oslo?
Oslo is a software factory for generating first-class, tool-supported languages out of your declarative specifications.
A factory is a highly organized production facility
that produces members of a product line
using standardized parts, tools and production processes.
-from a review of Software Factories
The product line is analogous to Oslo’s parsers, transform tools, and IDE plugins for new data models and languages (both textual and visual) that you define. The standardized parts are Oslo’s library components; the tools are the M languages and the Quadrant/Intellipad application; and the processes are shaped by the flow of data through the Oslo tool chain (see the diagram near the end of this article).
With Oslo, you build the custom tools you need to rapidly build or generate software systems. It’s all about using the right tool for the job, and having a say in how those tools are shaped to obtain the greatest leverage.
As stated at the home page of softwarefactories.com:
We see a capacity crisis looming. The industry continues to hand-stitch applications distributed over multiple platforms housed by multiple businesses located around the planet, automating business processes like health insurance claim processing and international currency arbitrage, using strings, integers and line by line conditional logic. Most developers build every application as though it is the first of its kind anywhere.
In other words, there’s already a huge shortage of experienced, highly-qualified professionals capable of ensuring the success of these increasingly complex systems, and with the need (and complexity) growing exponentially, our current development practices increasingly fall short of the total demand.
Books like Greenfield’s Software Factories have been advocating building at a higher level of abstraction for years, and my initial reaction was to see it as a natural, evolutionary milestone for a highly mature software system. However, it’s an awful lot of focused development effort to attain such a level of maturity, and not many organizations are able to pull it off given the state of our current development platforms.
It’s therefore fortuitous that Microsoft teams have taken up the challenge of building these abilities into their .NET platform. After all, that’s where it really belongs: in the framework.
Oslo of course contains a lot of expected awesomeness, but where it will probably have the most impact in terms of developer productivity is with new first-class languages and language tools. Why? It first helps to understand the world of data formats and languages.
We’ve had an explosion of data formats–these mini Domain Specific Languages, if you will (especially in the form of complex configuration files). As systems evolve and scale, and the ways we can configure and compose our application’s behavior continue to grow, at what point do we perceive that configuration graph as the rich language that it becomes? Or when our user interfaces evolve from Monolithic to Modular to Composite to Granular Composite (or User Composable), at what point does that persistent object graph become our UX DSL (as with XAML in WPF)?
Sometimes we set our standards too low, or are slow to raise them when the time has come to do so. With XML we get extensibility in defining languages and we think, "If we can parse it, then we can build a tool over it." I don’t know about you, but I’d much rather work with rich client software–some kind of designer–over a textual data format any day.
But you know how things go: some company like Microsoft builds a whole bunch of cool stuff, driven off some XML configuration, or they unleash something like XAML on which WPF, WF, and more are built. XAML is great for tools to read and write, and although XML and XAML are textual and not binary and therefore human readable in a text editor (the original intention behind that term), they’re simply not as easy to read as C# or VB.NET. That’s why we aren’t all rushing to program everything in XAML.
Companies like Microsoft, building from the bottom up, release their platforms well in advance of the thick client user experiences that make them enjoyable to use and that encourage mass adoption. Their models, frameworks, and applications are so large now that they’re released in massively differentiated stages, producing a technology adoption gap.
By giving that language a syntax other than XML, however, we can approach it in the same way we approach our program logic: in the most human readable and aesthetically-pleasant way we can devise, resembling our programming languages of choice.
Sometimes, the density of data and its structure in our model is such that a visual editor fails to represent that model well. Source code is a case in point. You could create a visual designer to visualize flow control, branching logic, and even complex expression building (like the iTunes Smart Playlist), but code in text format is more appropriate in this kind of scenario, and ends up being more efficient with the existing tooling available. Especially with an IDE like Visual Studio, we’re working with human-millennia of effort that have gone into the great code editing tools we use today. Oslo respects this need for choice by offering support for building both visual and textual DSLs, and recognizes the fluent definition of new formats and languages as the bridge to the next quantum leap in productivity.
If we had an easy way of defining languages in formats that we developers felt comfortable working with–as we’re comfortable with our general purpose languages and their rich tool support–then we’d be much more productive in the transition between a technology first being released and later having rich tool support over it. WPF has taken quite a while to be adopted as much as it has, partly due to tool availability and maturity. Before Expression Blend or Cider designers were released and hand-coding XAML was the only way, those who braved the angle brackets struggled with it. As I play with Silverlight, I realize how much must still be done in XAML, and how we still struggle. It’s simply not as nice to work with as my C# code. Not as rich, and not as strongly tool-supported.
That’s one place Oslo provides value. With the ability to define new textual and visual DSLs, rigorous verification and validation in a rich set of tools, the promise of Intellisense, colorization of keywords, operators, constants, and more, the Oslo architects recognize the ability to enhance our development experience in a language-agnostic way, raising the level of abstraction because, as they say, the way to solve any technical problem is to approach it at one higher level of indirection. Unfortunately, this makes Oslo so generalized and abstract that it’s difficult to grasp and therefore to appreciate its immensity. Once you can take a step back and see how it fits in holistically, you’ll see that it has the potential to dramatically transform the landscape of software development.
Currently, it’s a lot of work to implement all the language services in Visual Studio to give them as rich an experience as we’ve come to expect with C#, VB.NET, and others. This is a serious impediment to doing this kind of work, so solving the problem at the level of Oslo drastically lowers the barrier to entry for implementing tool-supported languages. The Oslo bits I’ve seen and played with are very early in the lifecycle for this massive scope of technology, but the more I think about its potential, the more impressed I am with the fundamental concept. As Chris Anderson explained in his PDC session on MGrammar, MGrammar was an implementation detail, but sometime around June 2007, that feature team realized just how much customers wanted direct access to it and decided to release MGrammar to the world.
Modeling & The Repository
That’s all well and good for DSLs and language enthusiasts/geeks, but primarily perhaps, Oslo is about the creation, exploration, relation, and execution of models in an interoperable way. In other words, all of the models that are currently used to describe a software system, or an entire IT environment, are either not encoded formally enough to verify or execute, or they’re encoded or stored in proprietary ways that don’t allow interoperability with other models. A diagram in Visio or PowerPoint documenting network topology, for example, knows nothing about the component architecture or deployment model of the software systems installed and running on that network.
When people usually talk about models, they imagine high-level architecture documents, overviews used to visually summarize work that is much more granular in nature. These models aren’t detailed, and they normally aren’t kept up to date and in sync with the current design as changes are made. But modeling in Oslo is not an attempt to make these visual models contain all of the necessary detail, or to develop software with visual tools exclusively. Oslo simply provides the tools, both graphical and textual, to define and relate many models. It will be up to the development community to decide how all these tools are ultimately used, which parts of our systems will be specified in a mix of general purpose, domain specific, and visual languages. Ultimately, Oslo will provide the material and glue to fill the gaps between the high and low level specifications, and unite them into a common, connected, and much more useful set of data.
To grasp what Oslo modeling is really all about requires that we expand our definition of "model", to see the models expressed in our configuration and XAML files, in our applications’ database schemas, in our entity classes, and so on. As software grows in complexity and becomes more composable, we can use various languages to model its behavior, store that in the repository for runtime execution, inspection, or reuse by other systems.
If we had some universal container for the storage of all different kinds of models, and a standardized way of relating entities across models, we’d be able to do things like impact analysis, where we could see the effect on software systems if someone were to alter the network it was running on; or powerful data mining on the IT execution environment of a business.
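To make the impact-analysis idea a little more concrete, here is a minimal Python sketch. It assumes a toy repository in which entities from different models (network, deployment, application) are related by "depends on" edges. The entity names and the impacted_by helper are hypothetical illustrations, not part of Oslo.

```python
from collections import deque

# Hypothetical cross-model relationships: each key is depended on by the
# entities in its list. Prefixes name the model each entity belongs to.
DEPENDENTS = {
    "network:switch-3": ["deployment:web-server-1", "deployment:db-server-1"],
    "deployment:web-server-1": ["app:storefront"],
    "deployment:db-server-1": ["app:storefront", "app:reporting"],
}


def impacted_by(entity, dependents=DEPENDENTS):
    """Return every entity reachable from `entity` via dependency edges, sorted."""
    seen, queue = set(), deque([entity])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


print(impacted_by("network:switch-3"))
```

Because the relationships cross model boundaries, a change to one network device surfaces both the servers deployed on it and the applications running on those servers. That is only possible when the models live in a store that can relate them.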
Many different tools, with different audiences, will be able to connect into this repository to manipulate aspects of the models that they understand and have access to. This is just the tip of the iceberg. We already model so much of what we do in the IT and software worlds, and as we begin adopting business process middleware and orchestration software like BizTalk, there’s a huge amount of value in those models converging and connecting. That’s where the Oslo Repository comes in.
Oslo provides interoperability among models in the same way that SOA provides interoperability among services. Not unlike the interoperability we have now among many different languages all sharing the same CLR specification.
Bridging data models across repositories or in a shared repository is a major step forward. With Windows Azure and Microsoft’s commitment to their online services platform (and considering the momentum of the SaaS movement with Amazon, Google, and others), shared storage and data sets are the future. (Check out SQL Data Services if you haven’t already, and watch for some exciting announcements coming later this year!)
The Dichotomy of Data vs. Metadata
Jeff Pinkston from the Oslo team aptly reflects the attitude of the group when he scoffs at the categorical difference between data and metadata. In terms of storing and querying it, serializing and communicating it, and everything else that matters in enterprise software, data is data and there’s no reason not to treat it the same when it comes to architecting a system. We have our primary models and our secondary models, our shared models and our protected models, but they’re still just models that shape our software’s behavior, and they share all of the same characteristics when it comes to manipulation and access. It’s their ultimate effect that differs.
It’s worth noting, I think, the line that’s been drawn between code and data in some programming languages and not in others (C# vs. LISP). A division has been made for the sake of security rather than necessity. Machine instruction codes are represented in the same sort of binary data and realized in the same digital circuitry as traditional user data. It’s tempting to keep things locked down and divided, but as languages evolve to become more late bound and dynamic (and as the tools evolve to make this feasible), there will be more need for the manipulation of expression trees and ASTs. I strongly suspect the lines will blur until they disappear.
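As a small illustration of code treated as data, here is a sketch in Python rather than C# or LISP, using the standard ast module. The AddToMultiply class and rewrite_and_eval function are names I made up for the example. It parses an expression into a tree, rewrites the tree, and then executes the result.

```python
import ast


class AddToMultiply(ast.NodeTransformer):
    """Rewrite every `a + b` in the tree into `a * b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


def rewrite_and_eval(source):
    tree = ast.parse(source, mode="eval")   # code becomes data (an AST)
    tree = AddToMultiply().visit(tree)      # the data is manipulated
    ast.fix_missing_locations(tree)
    return eval(compile(tree, "<rewritten>", "eval"))  # data becomes code again


print(rewrite_and_eval("2 + 3"))  # 6, because the addition was rewritten to 2 * 3
```

The program never distinguishes between "the expression" and "a description of the expression". The tree is ordinary data until the moment it is compiled.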
Schema and Object Instance Languages
In order to define models, we need a tool. In Oslo, this is a textual language called MSchema and an editor called Intellipad. I personally think it’s odd to talk people’s ears off about "model, model, model", and then to use the synonym "schema" to name the language, but all of these names could change before they’re shipped for all we know.
This is a simple example of an MSchema document:
type Person {
    LastName : Text;
    FirstName : Text;
}
People : Person*;
By running this through the "M Compiler", a SQL script is generated that will create the appropriate database objects. Intellipad is able to verify the correctness of your schema, and what’s really nice is that you don’t even have to specify data types when you start sketching out your model. Defaults are assumed, and you can get more specific as your model evolves.
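To give a feel for the schema-to-SQL step, here is a rough Python sketch. It is not the M compiler and does not reproduce its actual output. The dictionary representation of the schema, the type mapping, and the create_table_sql function are all assumptions made for illustration.

```python
# Hypothetical in-memory form of the Person schema: field name -> M type name.
PERSON_SCHEMA = {"LastName": "Text", "FirstName": "Text"}

# Assumed mapping from M-style type names to SQL Server column types.
SQL_TYPES = {"Text": "nvarchar(max)", "Integer32": "int"}


def create_table_sql(table, schema):
    """Build a CREATE TABLE statement for one collection of a given type."""
    columns = [
        f"    [{name}] {SQL_TYPES[m_type]} NOT NULL"
        for name, m_type in schema.items()
    ]
    return f"CREATE TABLE [{table}] (\n" + ",\n".join(columns) + "\n);"


print(create_table_sql("People", PERSON_SCHEMA))
```

The point is only that a declarative model carries enough information to derive the storage for it mechanically. The real tool also handles identity, constraints, and relationships, which this sketch ignores.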
MGraph is a language for defining instances of objects, constrained by an MSchema and similar in format. So MSchema is to MGraph what XSD is to XML.
In this article, Lars Corneliussen explains Microsoft’s vision to make MGraph as common as XML is today. Take a look at his article to see a side-by-side comparison of the same object represented as XML (POX), JSON, and MGraph, and decide for yourself which you like best (or see below).
MSchema and MGraph are easier and more efficient to read and write than XML. Their message format resembles typical structured programming languages, and developers are already familiar with these formats. XML is a fine format for a tool; it’s human readable but not human-friendly. A C-style language, on the other hand, is much more human-friendly than all of the angle brackets and the redundancy (and verbosity) of tag text. That narrows down our choice to JSON and MGraph.
In JSON, the property/field/attribute names are delimited by quotation marks, suggesting that the whole structure is a dumb property bag.
{
    "LastName" : "Vanderboom",
    "FirstName" : "Dan"
}
MGraph has a very similar syntax, but its attribute property names are recognized and validated by the parser generated from MSchema, so the quotation marks are unnecessary. It ends up looking more natural, and a little more concise.
{
    LastName : "Vanderboom",
    FirstName : "Dan"
}
Because MGraph is just a message format, and Microsoft’s service offerings already support multiple message formats (SOAP/POX/JSON/etc.), it wouldn’t disrupt any of their architecture to add an MGraph adapter, and I’ll be shocked if I don’t hear about one in their next release.
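Since the two formats differ mainly in how field names are written, an adapter could be quite small. The following Python sketch renders the same record both ways. The to_mgraph_like function is hypothetical, handles only flat records, and imitates the MGraph snippet shown earlier rather than implementing the real format.

```python
import json


def to_mgraph_like(record):
    """Render a flat dict with unquoted field names and JSON-style values."""
    fields = [f"    {name} : {json.dumps(value)}" for name, value in record.items()]
    return "{\n" + ",\n".join(fields) + "\n}"


person = {"LastName": "Vanderboom", "FirstName": "Dan"}

print(json.dumps(person))      # field names are quoted strings
print(to_mgraph_like(person))  # field names are bare identifiers
```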
Meta-Languages and MGrammar
In the same way that Oslo includes a meta-model because it allows us to define models, it also includes a meta-language because it allows us to define languages (as YACC and ANTLR have done). However, just as Pinkston doesn’t think data and metadata should be treated differently, it makes sense to think of a language that defines languages as just another language. There is something Zen about that, where the tools somehow seem to bend back upon themselves like one of Escher’s drawings.
Here is an example language defined by MGrammar in a great article on MSDN called MGrammar in a Nutshell:
token Rest = "-";
token Note = "A".."G";
token Sharp = "#";
token Flat = "b";
token RestOrNote = Rest | Note (Sharp | Flat)?;
syntax Bar = RestOrNote RestOrNote RestOrNote RestOrNote;
syntax List(element)
  = e:element => [e]
  | es:List(element) e:element => [valuesof(es), e];
// One or more bars (recursive technique)
syntax Bars = bs:List(Bar) => Bars[valuesof(bs)];
syntax ASong = Music bs:Bars => Song[Bars[valuesof(bs)]];
syntax Songs = ss:List(ASong) => Songs[valuesof(ss)];
// Main rule
syntax Main = Album ss:Songs => Album[ss];
syntax Music = "Music";
syntax Album = "Album";
// Ignore whitespace
syntax LF = "\u000A";
syntax CR = "\u000D";
syntax Space = "\u0020";
interleave Whitespace = LF | CR | Space;
This is a pretty straightforward way to define a language and generate a parser. Aside from the obvious keywords to define syntax rules and token patterns (with an alternative and more readable format for regular expressions), the => projection operator allows you to shape the MGraph output according to your needs.
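To show roughly what a parser for this grammar has to do, here is a hand-written Python equivalent. It is a sketch under my own assumptions, not what mg generates: the function names are made up, and nested dicts and lists stand in for the projected MGraph.

```python
import re

# Keywords are tried before notes so "Album" is not read as the note "A".
# Only LF, CR, and space are skipped, matching the interleave rule.
TOKEN = re.compile(r"[ \r\n]*(Album|Music|-|[A-G][#b]?)")


def tokenize(text):
    text = text.rstrip(" \r\n")
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_album(text):
    """Main = Album Songs; each song is 'Music' followed by bars of four notes."""
    tokens = tokenize(text)
    if not tokens or tokens[0] != "Album":
        raise SyntaxError("expected 'Album'")
    songs, i = [], 1
    while i < len(tokens):
        if tokens[i] != "Music":
            raise SyntaxError("expected 'Music'")
        i += 1
        notes = []
        while i < len(tokens) and tokens[i] not in ("Album", "Music"):
            notes.append(tokens[i])
            i += 1
        if not notes or len(notes) % 4:
            raise SyntaxError("each bar needs exactly four notes or rests")
        bars = [notes[j:j + 4] for j in range(0, len(notes), 4)]
        songs.append({"Song": {"Bars": bars}})
    if not songs:
        raise SyntaxError("expected at least one song")
    return {"Album": {"Songs": songs}}


print(parse_album("Album Music - - - - A C# Bb -"))
```

Even for a language this small, the hand-written version is longer than the grammar and mixes tokenizing, structure, and output shaping together. The grammar keeps those concerns declarative, which is the appeal.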
I created two simple languages with MGrammar on the plane trip back to Milwaukee from the PDC in November. The majority of my time was spent fussing with the editor, Intellipad, and for the last half hour I found it very easy to create a language on the fly, extending and changing it through experimentation quickly and easily. Projections, which are functional expressions in MGrammar used to shape MGraph output, are the most challenging part. There are a number of techniques that shape the output graph, so it will be good to see how this is approached in future reference examples.
Surreptitiously announced just before I wrote this, Mike Weinhardt at Microsoft indicated that a gallery of example grammars for MGrammar is being put together, to point to the sample grammars for various languages in addition to grammars that the community develops, and it should be available by the end of this month. These examples demonstrating how to define languages and write sensible projections, coming from the developers who are putting MGrammar together, will be an invaluable tool for teaching you how to use common patterns (just as 101 LINQ Samples did for LINQ).
As Doug Purdy explained on .NET Rocks: "People who are building a domain specific language, and they don’t want to understand how to build a parser, or they’re not language designers. Actually, they are language designers. They design a language, but they actually don’t do the whole thing. They don’t build a parser. What they do, they just leverage the XML parser. And what we’re trying to do is provide a toolset for folks where they don’t have to resort to XML in order to do DSLs."
From the same episode, Don Box said of the DSL session at PDC: "I’ve never seen a session with more geek porn in it."
Don: "It’s like crack for developers. It’s kind of addictive; it takes over your life."
Doug: "If you want the power of Anders in your hand…"
The Tool Chain
Now that we have a better sense of what’s included in Oslo in terms of languages, editors, and the shared repository, we can look at the relationship among the other pieces, which are manifested in the CTP as a set of command-line tools. In the future, these will integrate into an IDE, most likely Visual Studio. (I’d expect Intellipad and Quadrant to merge with Visual Studio, but there’s no guarantee this will happen.)
When you create your model with MSchema, you’ll use m to validate that model and generate a SQL script to create a SQL Server 2008 database schema (yes, it only works right now with SQL Server 2008). You’ll also use the m command to validate your object graph (written in MGraph) against your schema, and translate that into a set of SQL commands to perform inserts and updates against tables.
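The second half of that step, checking an instance against the schema and turning it into SQL, can be sketched in a few lines of Python. This is an illustration under my own assumptions, not the behavior of the m tool: insert_sql is a hypothetical function, and the schema is reduced to a list of required field names.

```python
def insert_sql(table, schema_fields, instance):
    """Validate one instance against the schema's field names, then emit an INSERT."""
    missing = [f for f in schema_fields if f not in instance]
    unexpected = [f for f in instance if f not in schema_fields]
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")
    columns = ", ".join(f"[{f}]" for f in schema_fields)
    # Double any single quotes so each value is a valid SQL string literal.
    values = ", ".join(
        "N'" + str(instance[f]).replace("'", "''") + "'" for f in schema_fields
    )
    return f"INSERT INTO [{table}] ({columns}) VALUES ({values});"


print(insert_sql("People", ["LastName", "FirstName"],
                 {"LastName": "Vanderboom", "FirstName": "Dan"}))
```

Production code would use parameterized commands rather than building literals. The sketch builds text only because that mirrors a tool whose output is a SQL script.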
With enough models, there’ll be huge value in adding yours to the repository. If you don’t mind writing MGraph or you generate it automatically with something like an MGraphSerializer class in your code, this may be all you need.
If, on the other hand, you decide you could really benefit by defining your own textual language to use instead of MGraph, you can use MGrammar to define a new language. This language gets compiled by the mg compiler to create your parser, and the mgx command translates code in your new language into an MGraph, which can then be pulled into your database using m.
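This diagram depicts the process: [Figure: the Oslo tool chain. MGrammar definitions are compiled by mg into a parser; mgx uses that parser to translate DSL source into MGraph; m validates MSchema and MGraph and generates SQL for the SQL Server 2008 repository.]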
Other than these command-line tools, Quadrant is the highly extensible visual tool for exploring models graphically, and Intellipad is a different face on the same shell for defining DSLs with MGrammar and writing DSL code, as well as writing and verifying MSchema and MGraph code.
We should see fairly soon the convergence of these three languages (MGraph, MSchema, and MGrammar) into a single M language. This makes sense, since what you want to project in your DSL should be something within your model, verified by your schema. This may ultimately make these projections much easier to write.
We’ll also see this tool chain absorbed into multiple development environments, eventually with rich binding across multiple representations of our model, although this will take longer in Visual Studio.
Languages and Nested Languages
I looked at some MService examples, and I can understand Damon’s concern: although it’s nice to have "operation" as a keyword in a service-oriented language, with more keywords giving you the ability to specify aspects of each endpoint and the communication patterns required, enclosing the business logic within that service language is probably not a good idea. I took this example from Dennis van der Stelt’s blog:
operation PhotoUpload(stream : Stream) : Text
{
    .PostUriTemplate = "upload";
    index : Text = invoke DateTime.Now.Ticks.ToString();
    filename : Text = "d:\\demo\\photo\\" + index + ".jpg";
    invoke MService.ServiceHelper.StoreInFile(stream, filename);
}
Why not? You’re defining a general purpose language within the curly braces, one capable of defining variables, assigning values, referencing .NET objects, and calling methods. But why do you want to learn a new language to write services when the language you’re using right now is already supremely capable of that? Don’t you already know a good syntax for invoking methods (other than "invoke %method%")? If instead you simply referenced an assembly, type, and method from an MService script, you could externally turn any .NET method with serializable parameters and return value into a service operation by feeding it this kind of file, without having to recompile, and without having to reinvent the wheel.
The possible exception would be if MGrammar adds the ability (as discussed by speakers at the PDC) to support multiple layers of enclosing languages within other languages. In other words, you could use MService to define operations and their attributes using its own syntax, and within the curly braces that follow, use the C# or VB.NET parsers to process the logic with the comprehension of a separate language. There are some neat possibilities here, but I expect the development community to be conservative and hesitant about mixing layers of semantics, as there is an awful lot of room for confusion and complexity. It may be better to leave different language blocks in separate files or containers, and to allow them to reference each other as .NET assemblies and XML files reference each other today.
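Here is a small Python analogue of that idea, as a sketch rather than anything MService does. A declarative table names existing functions, and a dispatcher resolves them at runtime. The OPERATIONS table and both helper functions are hypothetical. In .NET the same lookup would go through reflection instead of importlib.

```python
import importlib

# Hypothetical service description: operation name -> "module:function" target.
# Nothing here contains logic; it only points at code that already exists.
OPERATIONS = {
    "sqrt": "math:sqrt",
    "basename": "posixpath:basename",
}


def resolve(target):
    """Look up a callable from a 'module:function' string at runtime."""
    module_name, _, func_name = target.partition(":")
    return getattr(importlib.import_module(module_name), func_name)


def call_operation(name, *args):
    """Dispatch a named operation to the existing function it refers to."""
    return resolve(OPERATIONS[name])(*args)


print(call_operation("sqrt", 9))                           # 3.0
print(call_operation("basename", "/demo/photo/1234.jpg"))  # 1234.jpg
```

Adding an operation means adding one line to the table. The business logic stays in the general purpose language, and the service description stays purely declarative.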
However, I wouldn’t get too hung up on the early versions of these new languages, or any one language specifically. The useful, sensible ones that take real developer needs into account and provide the most value will be adopted, and many more will quickly fall into disuse. But the overall pattern will be for the emergence of an amazing amount of leverage in terms of improving human comprehension and taking advantage of our ability to manipulate structured, symbolic object graphs to build and verify software systems.
After a few months of research and many hours of writing, I don’t feel like I’ve even scratched the surface. But instead of giving you an absolutely comprehensive picture, I’m going to stop here and continue in future articles. In the meantime, check out the following resources.
For an overview of the development paradigm, look for information on language-oriented programming, including an article I wrote that alludes to how "we will have to raise the level of abstraction to a point that may be hard for us to imagine with our existing tools and languages" due to the "precipitous growth of software complexity". The "community of abstractions" is the model in Oslo-speak.
For Microsoft specific content: there were some great sessions at the PDC (watch the recorded videos). It was covered (with much confusion) on the .NET Rocks! podcast (here and here) as well as on Software Engineering Radio; and there are lots of bloggers talking about their initial experiences with it, such as Shawn Wildermuth, Lars Corneliussen, and of course Chris Sells and Jeff Pinkston. The clearest and most coherent explanation I’ve heard was from an interview with Ron Jacobs and David Chappell (Ron gave the keynote at MSDN Dev Con, hosted the ARCast podcast for years). MSDN has at least 29 videos on the Oslo Developer Center, where there’s a good amount of information, including a FAQ. There’s also the online guide for MGrammar, MGrammar in a Nutshell, and the Oslo team blog.
If you’re interested in creating DSLs, make sure to keep a look out for details about the upcoming DSL Developers Conference, which is tentatively planned for April 16-17, immediately following the Lang.NET conference (on general purpose languages) on April 14-16. I’m hoping to be at both this year. And in case you haven’t heard, Microsoft is planning another PDC Conference for 2009, the first time ever these conferences have run for two consecutive years! There will no doubt be much more Oslo news and conference material to cover it at the PDC in November.
The best way to learn about Oslo, however, is to dive in and use it. That’s what I’m doing with my newest system, which needs to be modeled from scratch. So if you haven’t done so already, download the Oslo SDK (link updated to January 2009 SDK) and introduce yourself to the future of modeling and development!
This entry was posted on January 17, 2009 at 5:00 pm and is filed under Data Structures, Development Environment, Distributed Architecture, Language Extensions, Language Innovation, Metaprogramming, Oslo, Problem Modeling, Service Oriented Architecture, Software Architecture, SQL Data Services, Visual Studio, Windows Azure.