…using terminology and examples from the Concurrency & Coordination Runtime in C#
Port-based messaging has been in use for a long time. In the realm of low-level electronics, components have always operated in parallel, hardware interface ports are designed around standards, and messages are posted to those ports, queuing up somewhere until the receiver is ready to process them. This pattern has worked extremely well in multiple domains for a long time, and its key characteristic is the decoupling of data flow from control flow.
In sequential programs, one statement runs after another each time it’s run, and the behavior is predictable and repeatable. Concurrency is difficult because you have to consider all possible interleavings of multiple simultaneous tasks, so the overall behavior is nondeterministic. Depending on the relative timings of concurrent tasks, you could get different results each time if you’re not careful to set the appropriate locks on shared resources. Port-based messaging architectures isolate islands of state across different execution contexts, and connect them with safe, atomic messages delivered through pre-defined ports.
The posting of a message to a port, as shown in Figure 1, is followed by some handler method receiving and processing the message. What’s not evident in the diagram, however, is that while data flows into the port, the post itself is a non-blocking call. The sender continues on with other work, taking only the time needed to enqueue the message.
Queuing is important because, even with large thread pools, we can’t guarantee that a receiver will be listening at the very moment a message arrives. Letting messages queue up means the receiver doesn’t have to block a thread to wait. Instead, the data waits, and a thread checks for messages on the port when it can.
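Since the CCR itself is a C# library, a minimal Python sketch can stand in for the idea: a port is essentially a thread-safe queue, and posting is a non-blocking enqueue. The `Port` class and its methods here are hypothetical analogues, not the CCR’s actual API.

```python
import queue

# A minimal "port": a thread-safe queue that decouples the sender from the
# receiver. Port, post, and receive are illustrative stand-ins for the CCR's
# C# types, used only to show the shape of the pattern.
class Port:
    def __init__(self):
        self._queue = queue.Queue()

    def post(self, message):
        # Non-blocking: the sender only enqueues the message and moves on.
        self._queue.put(message)

    def receive(self):
        # A receiver drains messages whenever a thread becomes available.
        return self._queue.get()

port = Port()
port.post("hello")   # returns immediately; no receiver need be listening yet
port.post("world")

# Later, a handler picks the messages up in arrival order.
print([port.receive(), port.receive()])   # ['hello', 'world']
```

The sender never learns or cares when (or on which thread) the messages are eventually handled; that is the decoupling of data flow from control flow in miniature.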
What is a Port?
So what exactly is a port? A port is a communication endpoint, but not in the sense of “a web service on a physical server”. Think much more fine-grained than that, even more fine-grained than methods. With sequential programming, we commonly use try-catch blocks and handle both the exceptional and non-exceptional results of operations within a single method. In port-based programming, you post a message to a port, which results in some handler method running on the receiving end, and different results can be sent back on different callback ports depending on the type of message. Instead of calling a method that returns to you when it completes, port-based programming is about always moving forward, and unwinding a call stack has very little meaning here.
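The shape of this pattern, with outcomes posted to separate success and failure ports rather than returned up a call stack, can be sketched as follows. The `Port` class and `divide_handler` are hypothetical Python stand-ins for the CCR’s C# types:

```python
import queue

# Hypothetical port, as before: a thread-safe queue with a non-blocking post.
class Port:
    def __init__(self):
        self._q = queue.Queue()
    def post(self, msg):
        self._q.put(msg)
    def receive(self):
        return self._q.get()

def divide_handler(message, result_port, fault_port):
    # Instead of returning a value or throwing to a caller, the handler
    # posts its outcome forward on one of two callback ports.
    numerator, denominator = message
    try:
        result_port.post(numerator / denominator)
    except ZeroDivisionError as error:
        fault_port.post(error)   # the failure flows forward on its own port

result_port, fault_port = Port(), Port()
divide_handler((10, 2), result_port, fault_port)
divide_handler((1, 0), result_port, fault_port)
print(result_port.receive())                  # 5.0
print(type(fault_port.receive()).__name__)    # ZeroDivisionError
```

Note that nothing “returns” to the original caller; success and failure each continue forward along their own path, which is what makes stack unwinding meaningless here.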
We can see in the sequence diagram above (Figure 3) a collection of services that communicate with and depend on each other. Starting from the top, the left-most service posts a message to port 1, and then goes on to do other work or surrenders its thread back to the dispatcher for other tasks that are waiting to run. A registered method on port 1 runs, and the logic there needs another service to complete its task, so it posts a message on port 2 and likewise continues processing without waiting. The path of solid blue arrow lines traces the message path for normal execution. If anything goes wrong, an exception can be posted to a different callback port, shown with a red outline in the diagram.
This diagram shows one possible composition of services and data flow. Port sets, which are simply collections of related ports, are shown as callback receivers in pairs, but they can consist of any number of ports with any mixture of message types, depending on the needs of the system being coordinated. In this example, if anything goes wrong in the handler methods at ports 2, 5, or 6, an exception message will be routed to port 6, where another handler method can compensate for or report on the error. Also note that while this system may have to process data at port 4 during startup before the logic at ports 5, 7, and 8 can run, once it gets going there could be activity at many ports concurrently (not just one port per service).
Arbiters, Dispatchers, and DispatcherQueues
Now it’s time to peel away some of the layers of simplification presented so far. (It may help to have a beer or glass of wine at this point.)
An arbiter is a rule (or set of rules) about when and how to process messages for a specific port (or set of ports). (It is helpful to think of an arbiter as a data-flow or message-flow coordinator.) Should messages be pulled off the queue as soon as they arrive? Should the software wait until 5 messages have arrived before processing them all as a group? Should messages be checked according to a timer firing every 20 seconds? Should logic be run only when two ports have messages waiting (a join)? What logic should be run when one of these conditions occurs? Can method handlers on three specific ports run concurrently until a message arrives on a fourth port, whose handler must run exclusively, and when done the other three can run again (an interleave)? These are just a few of the many coordination patterns that can be expressed with different types of arbiters (and hierarchically nested arbiters, which are ingenious).
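One of the rules above, waiting until a batch of messages has accumulated before running a handler, can be sketched like this. `BatchArbiter` is a hypothetical Python analogue; the CCR expresses such rules declaratively as composable C# arbiters rather than hand-written classes:

```python
# Sketch of a single arbiter rule: buffer incoming messages and fire the
# handler only once a full batch of N has accumulated. Illustrative only,
# not the CCR's API.
class BatchArbiter:
    def __init__(self, batch_size, handler):
        self.batch_size = batch_size
        self.handler = handler
        self._pending = []

    def on_message(self, message):
        # Buffer each message; run the handler only when the batch is full.
        self._pending.append(message)
        if len(self._pending) >= self.batch_size:
            batch, self._pending = self._pending, []
            self.handler(batch)

batches = []
arbiter = BatchArbiter(5, batches.append)
for i in range(12):
    arbiter.on_message(i)

# Two full batches fired; messages 10 and 11 are still pending.
print(batches)   # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
```

The other patterns in the list (timers, joins, interleaves) are the same idea with different firing conditions, which is why arbiters compose so naturally.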
Figure 4 illustrates that an arbiter is associated with a port to monitor and a method to execute under the right conditions. The logic of the arbiter, depending on its type and configuration, determines whether to handle the message. It gets its thread from a thread dispatcher, which contains a thread pool. (Not the same as System.Threading.ThreadPool, though, as there can only be one of those per process.)
The next diagram (Figure 5) could represent a join coordination. An arbiter waits for messages on two ports and, depending on how it’s defined, may process messages from one port repeatedly; but as soon as it receives a message on the second port (which may be an exception port, for example), the whole arbiter might tear itself down so that no more handling on those ports will occur. As you are probably starting to see, composition and attachment of arbiters are key to controlling specific patterns of coordination in arbitrarily powerful and flexible ways.
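That teardown behavior can be sketched with a hypothetical `JoinArbiter` that handles work messages repeatedly until anything arrives on its second (exception) port, after which it detaches itself; again, this is an illustrative Python analogue, not the CCR’s C# API:

```python
# Sketch of the two-port pattern described above: process messages from the
# work port repeatedly, but tear the arbiter down as soon as a message
# arrives on the exception port.
class JoinArbiter:
    def __init__(self, work_handler):
        self.work_handler = work_handler
        self.active = True

    def on_work(self, message):
        if self.active:
            self.work_handler(message)

    def on_exception(self, error):
        # Teardown: the arbiter detaches itself, so later work messages
        # on these ports are no longer handled.
        self.active = False

handled = []
arbiter = JoinArbiter(handled.append)
arbiter.on_work("a")
arbiter.on_work("b")
arbiter.on_exception(RuntimeError("boom"))
arbiter.on_work("c")   # ignored: the arbiter has torn itself down
print(handled)         # ['a', 'b']
```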
In the Concurrency & Coordination Runtime, we can attach and detach these arbiters at runtime; we don’t have to define them statically at compile time. There has been some criticism of this approach because dynamic arbitration rules are much more difficult to verify with formal analysis, and are therefore difficult to optimize compilation and thread management for, but the advantages of this flexibility are enormous, and the performance (see this paper by Chrysanthakopoulos and Singh) has been very impressive compared to conventional multithreading approaches. Ultimately, it’s not about whether we can guarantee 100% that nothing will go wrong using only the mathematical models currently in our repertoire, but whether we can be productive with these techniques to release software that meets acceptable quality standards across a broad range of application domains.
I don’t think we’re going to find a better set of strategies to work with anytime soon, and when we’ve pushed this technology hard enough, the tactics will be fine-tuned and we can start absorbing some of these coordination concepts into languages themselves (without sacrificing the dynamism that a library of composable parts provides). People are going to attempt concurrent programming whether it’s safe or not, and using a library such as the CCR significantly reduces the risk of ad hoc multi-threading code.
When mainstream operating systems like Windows took their first steps to support multi-tasking, cooperative versus preemptive multi-tasking was a common topic. The idea of an operating system depending on applications to surrender control in a well-behaved way was generally and rightfully considered a bad idea. Any kind of error or instability in software could easily bring down the entire operating system, and forcing a quickly growing community of software vendors to share nicely wasn’t a realistic option. Being preemptive meant that the OS could forcefully stop an application from running after giving it a small, measured slice of time, and then switch the thread to a new context where another application could run for another time slice. Regardless of how poorly applications ran, as benevolent dictator, the OS could ensure a fair scheduling of processor time.
The solution encapsulated in the Concurrency & Coordination Runtime is, on the other hand, a cooperative multi-tasking strategy. However, because it operates within the local scope of an individual process, isolated from other processes on the same OS, its risk of destabilizing the system is nonexistent. This deep level of cooperation, in fact, is what gives the CCR its great performance. When used correctly, and George Chrysanthakopoulos (in this video) and his colleagues have brilliantly put correct use within our reach in the CCR library, threads don’t sit around waiting on some resource or for long-running operations to complete; instead, control is freely surrendered back to the thread pool, where it is quickly assigned to a new task.
Finally, by surrendering threads freely instead of holding onto them, a continuous flow of threads through the different activities of the system is maintained, and there is therefore always an abundance of them to handle new messages waiting on ports. Existing threads are not wasted. As the Tao Te Ching says:
If you want to accord with the Tao,
just do your job, then let go.
Control & Data Flow: Sequential vs. Concurrent
In sequential programs, stacks are used to unwind method calls and provide return values (return messages), and threads follow the data flow; whereas in port-based programming, threads are managed by one or more thread dispatchers that maximize the use of each thread by making it available in a pool and sharing it with many other (potentially unrelated) tasks. Data flows orthogonally to, and according to a different coordination strategy than, control flow. This task-thread agnosticism (the opposite of thread affinity) is similar to the statelessness of a web server such as IIS: one or more threads from a large pool are injected into the tasks of processing, rendering, and serving up huge numbers of web pages, after which those threads are recycled back into the thread pool for execution of other tasks, making for a highly concurrent and scalable service platform.
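Task-thread agnosticism can be sketched with Python’s standard `ThreadPoolExecutor` standing in for a CCR dispatcher: a small shared pool services many unrelated tasks, and no task owns a thread.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def handler(port_id, message):
    # Whichever pooled thread is free runs this; the task has no
    # thread affinity, only data to process.
    return (port_id, message, threading.current_thread().name)

# Two pooled threads service six tasks spread across three "ports".
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(handler, p, m) for p in range(3) for m in range(2)]
    results = [f.result() for f in futures]

print(len(results))                                    # 6
print(len({thread for _, _, thread in results}) <= 2)  # True
```

The same two threads flow through all six tasks and are recycled after each one, which is the IIS-like statelessness described above.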
So herein lies the trick: in order to split this coupling between data flow and control flow, a different means of composing the two coordination strategies is needed. In C# and other popular imperative programming languages, methods implicitly pass thread control along with their data arguments (the message), and the use of the stack for method calls imposes constraints on control flow, so making the CCR work involves some interesting patterns.
That’s why port-based programming is hard to get your head around. It’s such a large shift from common sequential logic, and it requires some additional planning (and good visualizations). It’s obviously important to have a good set of tools for expressing that coordination: a simple set of conceptual primitives that allows us to compose arbitrarily complex coordination patterns. These primitives, including Message, Port, PortSet, Dispatcher (thread pool), and others, provide the building blocks that we can use to define these relationships. Once we define what we want to happen with these building blocks, the CCR can make it all happen.
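A toy composition of these primitives, a port, a registered handler, and a dispatcher loop that pairs queued messages with handlers, might look like this in Python. All of the names here are illustrative; the real primitives are C# types in the CCR:

```python
import queue

# Hypothetical port: a queue with a non-blocking post.
class Port:
    def __init__(self):
        self._q = queue.Queue()
    def post(self, msg):
        self._q.put(msg)

# Hypothetical dispatcher: drains any port that has both a queued message
# and a registered handler, until no port makes progress.
class Dispatcher:
    def __init__(self):
        self._handlers = {}   # port -> handler

    def register(self, port, handler):
        self._handlers[port] = handler

    def run_until_idle(self):
        progress = True
        while progress:
            progress = False
            for port, handler in self._handlers.items():
                try:
                    handler(port._q.get_nowait())
                    progress = True
                except queue.Empty:
                    pass

requests, results = Port(), Port()
dispatcher = Dispatcher()
dispatcher.register(requests, lambda n: results.post(n * n))
dispatcher.register(results, print)

requests.post(3)              # non-blocking; data just waits on the port
dispatcher.run_until_idle()   # the dispatcher moves it forward: prints 9
```

The handler on the requests port never calls the results handler; it only posts a message forward, and the dispatcher supplies the control flow, which is the split between data flow and control flow that the preceding sections describe.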
This level of coordination is a step beyond the strategies used by most concurrent applications and frameworks in software today, primarily because there hasn’t been a pressing need for it until recently: processor speed had been growing phenomenally for many years. Now, however, we’re told that processor speed has plateaued and that we have to scale out rather than up, spreading the processing load across multiple cores. We are very fortunate that the work being done by researchers in fields like robotics can be applied to other service-oriented architectures; it is a testament to the widespread use of the .NET Framework and the fantastic efforts of some very bright individuals.
Where to Find Microsoft Robotics Studio
Robotics Studio is a free download and can be found here; while it is full of other good stuff, it’s worth checking out for the Concurrency and Coordination Runtime alone.