Multithreaded Design: Dedicated Task Threads or Bucket Brigade Strategy?
Posted by Dan Vanderboom on December 17, 2007
A few days ago, I took a dive into building a new software product. It's aimed at mobile developers, such as myself, whose applications access databases on those mobile devices. After six weeks of development, v1.0 will be released (by the end of January, 2008). More details on that product to come later.
One of my goals is a highly responsive user interface. When commands take a while to complete, they'll need to execute on some other thread. It's a .NET Framework application, and in that environment only one thread can update the user interface, so that thread is a scarce resource that must be protected from work that doesn't serve its primary purpose.
I've written a fair amount of multithreaded code in the past. Working with RFID hardware before the "Gen 2" standard, on pre-release Symbol (now Motorola) devices, I figured out how to integrate with the RFID radio at a low level, and only by using several threads was I able to obtain acceptable performance. After wrestling with race conditions, deadlocks, and other complexities, I have a healthy respect for the amount of planning required to make it work correctly.
My new product consists of three major layers: a user interface layer, a service layer, and a network communication layer. The Command pattern will be used to publish and service commands. Actions in the user interface will enqueue command objects to the service layer, where they’ll either be processed immediately or passed to another node in the network for processing remotely.
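The enqueue-and-return shape of that Command pattern can be sketched as follows. This is only a rough illustration, in Python rather than .NET for brevity; the class names mirror the objects described in this post, and the "handling" is faked since the real service layer would process the command locally or forward it to another node:

```python
import queue
import threading

class Command:
    """A unit of work: the UI describes *what* to do, not how or where."""
    def __init__(self, name, payload):
        self.name = name
        self.payload = payload
        self.result = None

class DataServiceProxy:
    """Accepts commands from the UI thread and services them on a worker."""
    def __init__(self):
        self._queue = queue.Queue()
        threading.Thread(target=self._process, daemon=True).start()

    def enqueue(self, command):
        # Called on the UI thread: returns immediately after queuing.
        self._queue.put(command)

    def _process(self):
        # Worker thread: drains the command queue one command at a time.
        while True:
            command = self._queue.get()
            # Service locally or pass to another node; faked here.
            command.result = f"handled {command.name}"
            self._queue.task_done()

proxy = DataServiceProxy()
cmd = Command("GetCustomers", payload={"top": 10})
proxy.enqueue(cmd)       # the UI thread returns immediately
proxy._queue.join()      # (demo only) wait for the worker to finish
print(cmd.result)        # handled GetCustomers
```

The point is that `enqueue` does nothing but a queue insertion, so the caller's thread is never held hostage by the work itself.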
A simplification of this scenario is illustrated in the sequence diagram below. Each color represents a separate thread.
The red user interface thread does minimal work. When it calls the DataServiceProxy to request data, a command object is enqueued and the thread returns immediately to the user interface. The blue line represents one thread in a larger pool of worker threads, dedicated to processing these commands, which grows or shrinks with the size of the command queue. In this example, the DataServiceProxy makes a call to the TCPConnection object and returns to the pool. When a message arrives, it does so on a thread owned by the TCPConnection object. Because we don't want this thread to "get lost" and become unavailable to service the connection, we enqueue a response object and return control immediately. Finally, the DataServiceProxy object, using one of its threads, fires a MessageReceived event, which in turn calls Invoke so that the user interface can be updated on the only thread capable of doing so.
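The key move in that diagram is freeing the TCPConnection's receive thread immediately by queuing the response. Here's a minimal Python stand-in for that hand-off; `ui_updates` is a hypothetical placeholder for what Control.Invoke would actually do, namely marshal the update onto the UI thread:

```python
import queue
import threading

responses = queue.Queue()   # owned by the DataServiceProxy
ui_updates = []             # stands in for work done on the UI thread

def on_bytes_received(message):
    """Runs on the TCPConnection's receive thread. Enqueue and return
    at once, so this thread stays free to service the socket."""
    responses.put(message)

def response_worker():
    """A DataServiceProxy worker: raises MessageReceived and marshals
    the update to the UI thread (here, simply appends to a list)."""
    while True:
        message = responses.get()
        ui_updates.append(f"display: {message}")   # ~ Control.Invoke(...)
        responses.task_done()

threading.Thread(target=response_worker, daemon=True).start()
on_bytes_received("42 rows")   # the socket thread enqueues and returns
responses.join()               # (demo only) wait for the dispatch
print(ui_updates)              # ['display: 42 rows']
```

Each thread touches the queue for only an instant, which is what keeps both the socket thread and the UI thread within their boundaries.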
I like using sequence diagrams like this to plan multithreaded work. They clear up issues that are otherwise hard to visualize. The first (and worst) mistake would be to let the UI thread cross all the way through to the TCPConnection object, making some blocking call while the network connection times out, throws an exception, handles the exception, and so on. Meanwhile, the user stares at an hourglass, waiting to interact with the screen again and get on with their work. By assigning specific roles to threads, and creating mechanisms that keep threads within specific boundaries, their behavior becomes easier to predict and manage.
I don't know a lot about how ThreadPools behave, but I'll be looking into that in detail over the next few weeks, and if I come across anything particularly noteworthy, I'll write about it. My instinct is to continue using this pattern of thread role assignment, especially in my new product, but there's another pattern I've been considering that may also work well. A friend of mine at Bucketworks mentioned a while back how efficient bucket brigades are, and how intensely they've been studied and found to create self-organizing, automatically optimizing lines of workers.
Each Thread Is A Worker
The way I've been modeling thread work, I treat each thread like a worker: the UI worker hands off a job description to a DataServiceProxy worker, who does something with it and passes it to a TCPConnection worker, who in turn does some work and hands it off (over a network in this case). So what happens when the scenario becomes more complicated than the simple sequence diagram above? What if I have four or five hand-offs? Is the assignment of specific threads to specific roles really the best use of those threads? The UI thread is stuck where it is, but the rest of them could technically move around. Perhaps each is devoted to one task at a time, but when a task completes and its queue is empty, could the worker thread move on to perform another task? That's how I started thinking about whether a bucket brigade strategy might help.
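The dedicated-role model above can be sketched as a chain of queues, one fixed worker thread per role. Again a Python sketch with hypothetical stage names, not the product's actual code; each string suffix just records which worker touched the item:

```python
import queue
import threading

def stage(name, inbox, outbox):
    """Spawn one dedicated worker thread for a single role; the queues
    between stages are the hand-off points."""
    def run():
        while True:
            item = inbox.get()
            outbox.put(f"{item}->{name}")
            inbox.task_done()
    threading.Thread(target=run, daemon=True).start()

# A hypothetical chain: UI -> proxy worker -> TCP worker
q_ui, q_proxy, q_out = queue.Queue(), queue.Queue(), queue.Queue()
stage("proxy", q_ui, q_proxy)
stage("tcp", q_proxy, q_out)

q_ui.put("cmd")          # the UI thread enqueues and moves on
result = q_out.get()     # each hand-off happened on its own thread
print(result)            # cmd->proxy->tcp
```

With four or five hand-offs, this model means four or five parked threads even when only one stage has work, which is exactly the inefficiency that makes the brigade idea tempting.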
Threads aren't exactly like human workers, so I have to be careful with the analogy. Processing-wise, it's probably no big deal for threads to sleep when they're not in use, but each thread does consume some memory, and there must be some overhead in deallocating and reallocating their thread-local resources. Would it be better to share a single thread pool among a connected chain of worker objects, or for each service object to allocate its own thread pool?
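The shared-pool alternative would look something like this: instead of each service object owning threads, every stage submits its continuation as a task to one pool. A sketch using Python's ThreadPoolExecutor as a rough analogue of the .NET ThreadPool (the `parse`/`handle` stages are made up for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

# One pool shared by every service object, instead of a pool per object.
pool = ThreadPoolExecutor(max_workers=4)

def parse(raw):
    return raw.strip()

def handle(raw):
    # Each hand-off is just another task submitted to the shared pool,
    # so idle stages hold no dedicated threads of their own.
    parsed = pool.submit(parse, raw).result()
    return f"handled:{parsed}"

future = pool.submit(handle, "  GetCustomers  ")
result = future.result()
print(result)   # handled:GetCustomers
```

One caution with this design: a task that blocks on a sub-task's result occupies a pool thread while it waits, so deeply nested hand-offs can starve a small shared pool in a way that per-stage threads never would.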
Will this work? Bucket brigades tend to work best when the density of work is high and the time to complete a task is relatively small and constant. If a task takes too long, the line can more easily become unbalanced. I'm going to think more about this possibility and come up with some objective measurements I can use to compare the approaches and determine when (if at all) this would be appropriate as a thread-management strategy.
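One way to model the brigade for such an experiment: no thread has a fixed station. Every free worker takes whichever task is waiting and advances it one step, so stages with a backlog naturally attract more workers. This is my own toy formulation, not the algorithm from the literature, and the three `STEPS` are invented for illustration:

```python
import queue
import threading

tasks = queue.Queue()   # tasks in progress, tagged with their next step
done = queue.Queue()    # fully processed tasks
STEPS = ["fetch", "transform", "send"]

def brigade_worker():
    """No fixed role: take any waiting task, advance it one step, and
    put it back (or retire it when all steps are complete)."""
    while True:
        item, step = tasks.get()
        item = item + [STEPS[step]]        # record the step performed
        if step + 1 == len(STEPS):
            done.put(item)
        else:
            tasks.put((item, step + 1))    # hand back to the brigade
        tasks.task_done()

for _ in range(3):
    threading.Thread(target=brigade_worker, daemon=True).start()

for i in range(5):
    tasks.put(([f"job{i}"], 0))

results = [done.get() for _ in range(5)]
print(len(results))   # 5
```

A measurement harness could feed both this and the fixed-role pipeline the same workload and compare throughput as task durations are made longer and more variable, which is where the brigade is predicted to lose its balance.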
Bucket Brigades In Business
If you want some background on bucket brigades and their use in manufacturing and assembly lines, check out this paper by John Bartholdi and Donald Eisenstein. They explain how expensive time-motion studies are done to balance assembly lines in large organizations to maximize throughput, and how this needs to be adjusted or redone when the length or configuration of work changes, or when worker productivity varies. With bucket brigades, however, there’s less need for planning because the strategy “spontaneously generates the optimal division of work”. Pretty neat stuff.
The list of companies using bucket brigades is impressive: McGraw-Hill, Subway (yes, the sandwich company), Reader's Digest, Ford Customer Service Division, The Gap, Walgreens, Radio Shack, and more.
If anyone’s using this kind of strategy to manage work in software systems, I’d love to hear about it.