What's so hard about Event-Driven Programming?

I was lucky enough to attend the Software Architecture Workshop in Cortina recently. It was a three-day workshop based around the idea of Open Spaces, which involves handing the asylum keys to the inmates and seeing what happens.

I convened a session called “What’s so hard about Event-Driven Programming?” to explore the experiences of the other delegates in designing, implementing and testing asynchronous, event- or message-driven systems. I took the position that actually it was all very straightforward as long as you followed a few basic principles. Another delegate, Mats Helander, took the opposing view that asynchronous, event-based systems could develop scary emergent behaviour that would lead to a world of hurt.

About eight of us batted this around for a while, meandering into emergent behaviour of dynamic systems, before bringing it back to a realistic enterprise example. Say you are writing a component to process a sales order. You need to calculate the order price, persist the order data and send a notification email. So in Java you might have a method that looks like this:

public void processOrder(Order order) {
    pricer.price(order);              // time-consuming, happens remotely
    persister.persist(order);         // quite quick
    notifier.sendNotification(order); // remote mail gateway
}

You have services to price, persist and notify about the order, and you call them synchronously, one after the other. Let’s say that calculating the price of the order is time-consuming and happens remotely, persisting the order is quite quick, and sending the mail (via a remote mail gateway) takes somewhere in between. That means that for a lot of the time in the processOrder method, the thread is hanging around waiting for stuff to happen.

As the system handles more and more concurrent orders, this thread-per-order model would create a lot of mostly-idle threads which would eventually cause the VM to implode.
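To make that concrete, here’s a rough sketch of the thread-per-order model (my illustration, not code from the session; the onOrderReceived entry point is made up):

// Hypothetical entry point: dedicate a thread to each incoming order.
public void onOrderReceived(Order order) {
    new Thread(() -> processOrder(order)).start();
    // Each of these threads spends most of its life blocked on the remote
    // pricing and mail calls, so thousands of concurrent orders means
    // thousands of mostly-idle threads, each holding onto a stack.
}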

Thinking concurrently

Instead you could have three queues, a PricerQueue, a PersisterQueue and a NotifierQueue. You could represent the order processing as a ProcessSalesOrder message, which would know that it had to start on the pricer queue, then get itself passed on to the persister queue and finally on to the notifier queue.

Each queue (which could just be a linked list in memory; it doesn’t have to be very clever) would have a number of consumers, each in their own thread. The processOrder method puts your order onto the pricer queue. When the order gets to the front of the queue, it gets priced by the next available pricing consumer, and then handed on to the persister queue. Likewise, once it has been persisted it gets passed to the notifier queue.
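Sketched in Java, this might look something like the following. It assumes the Pricer, Persister and Notifier services from the earlier method; the class name, consumer counts and wiring are illustrative rather than a prescription.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

public class OrderPipeline {

    private final BlockingQueue<Order> pricerQueue = new LinkedBlockingQueue<>();
    private final BlockingQueue<Order> persisterQueue = new LinkedBlockingQueue<>();
    private final BlockingQueue<Order> notifierQueue = new LinkedBlockingQueue<>();

    private final Pricer pricer;
    private final Persister persister;
    private final Notifier notifier;

    public OrderPipeline(Pricer pricer, Persister persister, Notifier notifier) {
        this.pricer = pricer;
        this.persister = persister;
        this.notifier = notifier;
    }

    // processOrder now just enqueues the order and returns immediately.
    public void processOrder(Order order) {
        pricerQueue.add(order);
    }

    public void start() {
        // A few consumers per stage, each in its own thread: take an order,
        // do the stage's one job, hand the order on to the next queue.
        startConsumers(3, pricerQueue, order -> { pricer.price(order); persisterQueue.add(order); });
        startConsumers(1, persisterQueue, order -> { persister.persist(order); notifierQueue.add(order); });
        startConsumers(2, notifierQueue, order -> notifier.sendNotification(order));
    }

    private void startConsumers(int count, BlockingQueue<Order> queue, Consumer<Order> work) {
        for (int i = 0; i < count; i++) {
            Thread consumer = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        work.accept(queue.take()); // blocks until an order arrives
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt(); // time to shut down
                    }
                }
            });
            consumer.setDaemon(true);
            consumer.start();
        }
    }
}

The consumer counts here are fixed at construction, which is precisely the limitation addressed below.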

So that’s our basic asynchronous model: a sequence of synchronous calls to services is replaced by passing an event or message through a series of queues or stages. (Of course, multi-stage, event-driven systems are nothing new.)

But what does that give you? You just replaced a nice, simple three-line method with a bunch of queues, events, consumers and goodness knows what else. So what? Well, the thing is, now you can get clever. You can monitor how big each queue gets, and change the resource allocation of consumers on the fly. The pricer queue is getting a bit full? Let’s add some more pricers. We can take a few threads away from the quiet persister queue; no-one will notice. As long as you are reasonably careful in defining what each queue does, you shouldn’t run into any issues with locking or race conditions. Most importantly, the application becomes massively scalable, and its behaviour will degrade gracefully under load. As this wake-up call makes abundantly clear¹, we already need to be thinking about designing parallelism and concurrency into our applications, rather than hoping that the next wave of hardware will make everything fast enough.
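For instance, here’s a hypothetical sketch using one ThreadPoolExecutor per stage (the class name, pool sizes and thresholds are all made up for illustration):

import java.util.concurrent.*;

public class StageBalancer {

    // One executor per stage; stage work is submitted via execute().
    private final ThreadPoolExecutor pricers =
            new ThreadPoolExecutor(4, 16, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    private final ThreadPoolExecutor persisters =
            new ThreadPoolExecutor(2, 16, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

    public void startMonitor() {
        ScheduledExecutorService monitor = Executors.newSingleThreadScheduledExecutor();
        monitor.scheduleAtFixedRate(() -> {
            // The pricer queue is getting a bit full, and the persister
            // queue is quiet? Move a thread from one stage to the other.
            if (pricers.getQueue().size() > 100
                    && persisters.getQueue().isEmpty()
                    && persisters.getCorePoolSize() > 1
                    && pricers.getCorePoolSize() < pricers.getMaximumPoolSize()) {
                persisters.setCorePoolSize(persisters.getCorePoolSize() - 1);
                pricers.setCorePoolSize(pricers.getCorePoolSize() + 1);
            }
        }, 1, 1, TimeUnit.SECONDS);
    }
}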

Pay no attention to the man behind the curtain

The most interesting part of the discussion for me was in trying to address Mats’s concerns about emergent behaviour and all the other weirdness that can happen when you just let a bunch of queues asynchronously get on with business. We were saved by another delegate, Lennart Ohlsson, who pointed out that 10 years ago we were having exactly the same conversations about object-oriented programming.

“This polymorphism is madness!”, “How can I know which version of play() will be invoked on my Musician variable², when it could be pointing to a TrumpetPlayer or a Pianist?”. It turns out that when we ignored it and trusted the late-binding pixies, everything just fell into place. If you stop to think about what’s actually happening when you dispatch a method call in an object-oriented system, you can give yourself a funny turn.
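In Java terms (footnote 2 suggests the original example was C++, but the shape is the same), that example is just this:

interface Musician {
    void play();
}

class TrumpetPlayer implements Musician {
    public void play() { System.out.println("toot"); }
}

class Pianist implements Musician {
    public void play() { System.out.println("plink"); }
}

class Concert {
    // The caller has no idea which play() this will invoke, and it
    // doesn't need to: late binding picks the right one at runtime.
    void perform(Musician musician) {
        musician.play();
    }
}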

Well it’s exactly the same with parallelism. If you choose to ignore how it works and just leave it to the threading pixies, it all just works. This seemed to be exactly what Mats needed to hear, and we all left the session happy and enlightened.

As a postscript to the session, the following day I pointed out to Mats that as a C# programmer, he was already used to ignoring asynchronous, event-driven behaviour in his everyday programming. He looked appropriately sceptical, until I pointed out that he used a garbage collector—or more precisely that he had one lurking around that just got on with business, asynchronously, in an event-driven way, reading discarded objects off a finalize queue and laying them quietly to rest.
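Java programmers get the same free ride, and since Java 9 it’s even a public API: a Cleaner thread asynchronously reads entries for unreachable objects off a reference queue and runs their cleanup actions. A minimal sketch (the Resource class is made up):

import java.lang.ref.Cleaner;

public class Resource implements AutoCloseable {

    private static final Cleaner CLEANER = Cleaner.create();

    private final Cleaner.Cleanable cleanable;

    public Resource() {
        // The cleanup action must not capture 'this', or the object
        // would never become unreachable.
        cleanable = CLEANER.register(this, () -> System.out.println("laid quietly to rest"));
    }

    @Override
    public void close() {
        cleanable.clean(); // or the Cleaner gets it eventually
    }
}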


  1. Thanks to my colleague Ian Cartwright for the link

  2. It’s the example I always remember from the Core C++ book