<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">A little nudge or two from folks more skilled than me at this would be much appreciated.</blockquote></div><br>Ok, here is my take on it:</div>

<div class="gmail_extra"><br></div><div class="gmail_extra">Each workflow instance is a process. This process replays an (immutable) event log from stable storage. Each event in the log carries a timestamp, and the process replays the events as if they happened at that point in time. New events are injected into the process with a timestamp. So a process is always "lagging behind" by some amount, be it weeks or microseconds.</div>
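
<div class="gmail_extra"><br></div><div class="gmail_extra">A minimal sketch of the replay idea, in Python for brevity rather than Erlang. Event, replay and apply_order_event are hypothetical names chosen for illustration, and the "workflow" is just a dict that tracks the latest status:</div>

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class Event:
    timestamp: float  # when the event happened
    payload: Any


def replay(events: Iterable[Event], apply: Callable[[Any, Event], Any], initial: Any) -> Any:
    """Rebuild workflow state by applying events in timestamp order."""
    state = initial
    for event in sorted(events, key=lambda e: e.timestamp):
        state = apply(state, event)
    return state


def apply_order_event(state: dict, event: Event) -> dict:
    # Hypothetical workflow: remember the latest status and when it was set.
    return {"status": event.payload, "as_of": event.timestamp}


log = [Event(1.0, "created"), Event(2.0, "approved")]
state = replay(log, apply_order_event, {"status": None, "as_of": None})

# Injecting a new event means appending it to the log and applying it.
new_event = Event(3.0, "shipped")
log = log + [new_event]
state = apply_order_event(state, new_event)
```

<div class="gmail_extra"><br></div><div class="gmail_extra">Because the state is a pure function of the log, replaying the extended log from scratch gives the same state as applying the new event incrementally.</div>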

<div class="gmail_extra"><br></div><div class="gmail_extra">When processing new events, write them to stable storage. This ensures you can reawaken the process later and retrace its steps if that becomes necessary. Note that BPM system upgrades need to be in the event log too, so you can switch code versions. Write the SHA256 checksum of the event log and the internal process state to mnesia so you can avoid retracing steps if the workflow becomes too long.</div>
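
<div class="gmail_extra"><br></div><div class="gmail_extra">One way to read the checksum idea, sketched in Python under some assumptions: events are JSON-serializable dicts, the checkpoint is a plain dict standing in for a mnesia record, and log_digest, make_checkpoint and restore are hypothetical helper names. If the stored digest still matches the log prefix, only the tail is replayed:</div>

```python
import hashlib
import json


def log_digest(events):
    """SHA256 over a canonical JSON encoding of each event, in order."""
    h = hashlib.sha256()
    for event in events:
        h.update(json.dumps(event, sort_keys=True).encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def make_checkpoint(events, state):
    # Stand-in for the record you would write to mnesia.
    return {"count": len(events), "digest": log_digest(events), "state": state}


def restore(events, checkpoint, apply, initial):
    """Resume from the checkpoint if it matches the log, else replay everything."""
    if checkpoint is not None:
        n = checkpoint["count"]
        if n <= len(events) and log_digest(events[:n]) == checkpoint["digest"]:
            state = checkpoint["state"]
            for event in events[n:]:  # only the tail is replayed
                state = apply(state, event)
            return state
    state = initial
    for event in events:  # checkpoint missing or stale: full replay
        state = apply(state, event)
    return state


def add_amount(total, event):
    return total + event["amount"]


events = [{"amount": 1}, {"amount": 2}, {"amount": 3}]
checkpoint = make_checkpoint(events[:2], 3)
total = restore(events, checkpoint, add_amount, 0)
```

<div class="gmail_extra"><br></div><div class="gmail_extra">Hashing the prefix still reads the log, so this only pays off when applying an event costs more than hashing it; the fallback path keeps a mismatched checkpoint from being trusted.</div>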

<div class="gmail_extra"><br></div><div class="gmail_extra">Processes have a lifetime. When they are active, they are registered in a registry such as gproc. When you know they are going to be inactive for a long period, you can terminate the process, but keep a record of the workflow instance on stable storage, so you can restart it if necessary.</div>

<div class="gmail_extra"><br></div><div class="gmail_extra">A message for an inactive workflow awakens it.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Use timers to terminate inactive workflows. Or don't. You can probably have a million workflows in a couple of gigabytes of memory.</div>
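
<div class="gmail_extra"><br></div><div class="gmail_extra">The lifecycle above (register while active, reap when idle, awaken on a message) can be sketched like this. It is Python rather than Erlang, WorkflowRegistry is a hypothetical class and not gproc's API, and a plain dict stands in for stable storage:</div>

```python
import time


class WorkflowRegistry:
    def __init__(self, store, idle_seconds=60.0, clock=time.monotonic):
        self._store = store        # stand-in for stable storage: id -> list of events
        self._idle = idle_seconds
        self._clock = clock
        self._active = {}          # id -> in-memory state (here: the events seen)
        self._last_used = {}       # id -> time of the last delivered message

    def is_active(self, workflow_id):
        return workflow_id in self._active

    def deliver(self, workflow_id, message):
        # Persist first, so terminating the workflow never loses an event.
        self._store.setdefault(workflow_id, []).append(message)
        if workflow_id in self._active:
            self._active[workflow_id].append(message)
        else:
            # Awaken: rebuild the in-memory state from the stored events.
            self._active[workflow_id] = list(self._store[workflow_id])
        self._last_used[workflow_id] = self._clock()
        return self._active[workflow_id]

    def reap_idle(self):
        """Terminate workflows that have been idle for at least idle_seconds."""
        now = self._clock()
        idle = [w for w, t in self._last_used.items() if now - t >= self._idle]
        for workflow_id in idle:
            del self._active[workflow_id]
            del self._last_used[workflow_id]
        return idle


now = [0.0]  # fake clock so the example does not need to sleep
registry = WorkflowRegistry({}, idle_seconds=10.0, clock=lambda: now[0])
registry.deliver("wf-1", "start")
now[0] = 11.0
reaped = registry.reap_idle()
resumed = registry.deliver("wf-1", "resume")
```

<div class="gmail_extra"><br></div><div class="gmail_extra">Skipping reap_idle entirely is the "or don't" option: everything simply stays registered.</div>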

<div class="gmail_extra"><br></div><div class="gmail_extra">The advantage of this model compared to a database-centric one is that the process can act "on its own". That is, rather than operating on the data, the *process* is the data and it can communicate and spew out events itself.</div>