I'm having great trouble designing an application which uses jBPM5 in a scalable way. It seems to be impossible on a practical level. I hope that this is just due to my misunderstandings of the way the jBPM5 works.

I have listed each of my assumptions, and the ramifications of each below. I hope that someone from the community will be able to point out where these assumptions are incorrect.

I would particularly appreciate hearing from anyone who has managed to scale jBPM5 and could describe how they achieved it.

1a. it's not scalable to use a single session to execute all processes, or you would suffer contention on the session info.

1b. it's not scalable to use a single session to execute all processes in a cluster, or the updated session info would have to be continually synchronised across the cluster.

2. When using BPMN2 events, jBPM5 only allows you to send events to the process instances within a single session at a time. You need to maintain a list of all the sessions which have incomplete process instances(*), and loop through them all to send events. Therefore:

2a. you should execute all processes in as few sessions as possible, to lessen the number of iterations through this loop.

3. jBPM5 persists BPMN2 timer info in the session info, but the session must be active (ie. loaded from persistence) in order for the timers to activate. Therefore:

3a. when your application starts, you must load all sessions that have active process instances that have timers(**).

3b. you must not have the same session active in two different nodes of a cluster, or the same timers will expire around the same time

3c. when a node crashes, your application must detect this and reload the sessions that were active in the crashed node

4. If you start a process instance in a session, that process instance must always be executed in that session.

4a. when a node wishes to resume a process instance that was persisted, it must first (due to 3b) ask all other nodes if they have the session active, and if so instruct them to dispose it. It can then load the session, load and resume the process. All while preventing race conditions.

4b. when a node receives an event it must (due to 2) carry out all the processing in 4a for each session with active process instances.

(*) I don't think it's possible to know if a session has incomplete process instances..?

1a. it's not scalable to use a single session to execute all processes, or you would suffer contention on the session info.

Why do you want to use a single session?

1b. it's not scalable to use a single session to execute all processes in a cluster, or the updated session info would have to be continually synchronised across the cluster.

Yes, it's madness to think that you can do that in a simple way. At this point, having multiple sessions is a must if you want to distribute where execution happens.

2. When using BPMN2 events, jBPM5 only allows you to send events to the process instances within a single session at a time. You need to maintain a list of all the sessions which have incomplete process instances(*), and loop through them all to send events. Therefore:

2a. you should execute all processes in as few sessions as possible, to lessen the number of iterations through this loop.

Or you can use an ESB or a lightweight integration framework, such as Apache Camel or SwitchYard, to do that job.

3. jBPM5 persists BPMN2 timer info in the session info, but the session must be active (ie. loaded from persistence) in order for the timers to activate. Therefore:

3a. when your application starts, you must load all sessions that have active process instances that have timers(**).

You can create one set of sessions for processes that contain timers and a different set of sessions for the rest of the processes. I have an alternative solution for this kind of issue, which is to delegate those timers to an external component.

3b. you must not have the same session active in two different nodes of a cluster, or the same timers will expire around the same time

Why do you want to have the same session in multiple nodes?

3c. when a node crashes, your application must detect this and reload the sessions that were active in the crashed node

This depends heavily on your application, and yes, you need to do your own housekeeping.

4. If you start a process instance in a session, that process instance must always be executed in that session.

4a. when a node wishes to resume a process instance that was persisted, it must first (due to 3b) ask all other nodes if they have the session active, and if so instruct them to dispose it. It can then load the session, load and resume the process. All while preventing race conditions.

Why do you want to have the session in multiple nodes? You should think of a different scheme where a retry mechanism kicks in if something goes wrong. That retry mechanism can be an external component that keeps track of the sessions and knows where and how to load them.

4b. when a node receives an event it must (due to 2) carry out all the processing in 4a for each session with active process instances.

I can't answer your 'why would you want...' questions because I don't want any particular configuration of sessions and processes. I just want advice as to a configuration of sessions and processes that will scale. Sorry if that wasn't clear.

2. When using BPMN2 events, jBPM5 only allows you to send events to the process instances within a single session at a time. You need to maintain a list of all the sessions which have incomplete process instances(*), and loop through them all to send events. Therefore:

2a. you should execute all processes in as few sessions as possible, to lessen the number of iterations through this loop.

Or you can use an ESB or a lightweight integration framework, such as Apache Camel or SwitchYard, to do that job.

We happen to already be using Camel. But I'm not sure I understand how it would help. Can you explain further?

If you mean that I should route the event to exactly the sessions that may be waiting for that event, then that implies I must maintain a reference between all events that process instances are waiting on and the session in which they are executing. This seems like something jBPM should do. How would I get this information using jBPM5's API to allow me to build such a thing?
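To make the bookkeeping in question concrete, here is a minimal sketch of such an event-to-session registry. It is purely illustrative: `EventSessionRegistry` and its methods are hypothetical names, not part of the jBPM5 API, and it assumes the application itself somehow learns when a process instance starts or stops waiting for an event, which is exactly the information being asked about.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry, not part of jBPM5: records which session ids have
// process instances waiting on a given event type.
class EventSessionRegistry {
    private final Map<String, Set<Integer>> waiting = new ConcurrentHashMap<>();

    // Call when a process instance in `sessionId` starts waiting for `eventType`.
    void register(String eventType, int sessionId) {
        waiting.computeIfAbsent(eventType, k -> ConcurrentHashMap.newKeySet()).add(sessionId);
    }

    // Call when no instance in `sessionId` waits for `eventType` any more.
    void unregister(String eventType, int sessionId) {
        Set<Integer> ids = waiting.get(eventType);
        if (ids != null) {
            ids.remove(sessionId);
        }
    }

    // Sessions that should receive the event; empty if nobody is waiting.
    Set<Integer> sessionsFor(String eventType) {
        Set<Integer> ids = waiting.get(eventType);
        return ids == null ? Collections.emptySet() : Collections.unmodifiableSet(ids);
    }
}
```

A router (a Camel route, for instance) could then look up `sessionsFor(eventType)` and signal only those sessions instead of looping over every session. In a cluster the map would have to live in shared storage rather than in memory.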

3. jBPM5 persists BPMN2 timer info in the session info, but the session must be active (ie. loaded from persistence) in order for the timers to activate. Therefore:

3a. when your application starts, you must load all sessions that have active process instances that have timers(**).

You can create one set of sessions for processes that contain timers and a different set of sessions for the rest of the processes. I have an alternative solution for this kind of issue, which is to delegate those timers to an external component.

Are you able to delegate timers to an external component and still use the BPMN2 timer constructs? Can you please explain how? I would like to use the Java EE Timer service. As it's persistent, it avoids the problems inherent in using jBPM5's timers (removes assumption #3).

4. If you start a process instance in a session, that process instance must always be executed in that session.

4a. when a node wishes to resume a process instance that was persisted, it must first (due to 3b) ask all other nodes if they have the session active, and if so instruct them to dispose it. It can then load the session, load and resume the process. All while preventing race conditions.

Why do you want to have the session in multiple nodes? You should think of a different scheme where a retry mechanism kicks in if something goes wrong. That retry mechanism can be an external component that keeps track of the sessions and knows where and how to load them.

I need to have the session active so that its timers expire. So the session must be active in exactly one node.

We have a retry mechanism. It is incumbent on all jBPM5 users to have a retry mechanism to deal with optimistic locking exceptions. I'm afraid I lack the imagination to see how this would avoid the problem I outlined above. Can you please explain further?

It might be easier to present a specific use-case, rather than looking at the problem in general terms. Suppose (use-case 1): application has a single process definition. Process definition contains a BPMN2 receive task and a BPMN2 timer. I want to deploy this on a cluster for high availability and horizontal scalability. How can I do this with jBPM5?

All processes have timers, so I can't use one set of sessions for processes with timers and another for processes without. All sessions with one or more incomplete process instances must be loaded from persistence for their timers to expire - exactly once across the cluster. The incoming message for the BPMN2 receive task could arrive at any node of the cluster. Therefore, when this occurs, the application must: loop through all sessions that have incomplete process instances; instruct the node that has each session loaded to dispose it; load the session; fire the event to the session. When this is complete, all sessions will be loaded on one node, so they will need to be rebalanced. This is obviously unworkable... can you suggest an alternative?

Some specific questions:

Is it possible through the jBPM5 API to know if a session has incomplete process instances?

Is it possible through the jBPM5 API to know if a session has active timers?

You didn't say any of my assumptions 1-4 were wrong. Does that mean they were all correct?

With respect to incomplete process instances, we have one main process per kSession, and it is possible to determine whether the process instance is active by calling ProcessInstance.getState(). Our client applications know which sessions belong to them and interact with the session, process and tasks until the process is completed.

Timers seem to present challenges to scalability though. For us it means we have to keep sessions with active processes in memory and manage them, and simply use the persistence mechanism for recovery from failure. Clustering seems like a lot of headache. So what you're left with for scalability is some active grid solution where client requests are routed to the proper server. I believe this is the solution Mauricio has been working on.

Good show, Arnold. I came here to find answers to basically the same questions. From what I have gathered and experienced, you are correct. There is no practical way to deploy this in a cluster; I have been trying for 4 weeks now. Whenever faced with this question, I have found the developers here give the same vague answers with no real solutions.

Please could someone from the jBPM team reply? I don't want to do the session configuration myself, as this is traditionally considered a BPM engine's responsibility, and it is what gives the engine the ability to scale.

1a. it's not scalable to use a single session to execute all processes, or you would suffer contention on the session info

If you only have one session, there would be no contention as the ksession would solve this

But if you have multiple sessions, each should have a different session id, yes, as otherwise they would be contending

1b. it's not scalable to use a single session to execute all processes in a cluster, or the updated session info would have to be continually synchronised across the cluster

Correct. In a cluster, each node would have a different session. To avoid this contention, you should avoid having multiple sessions with the same id active at the same time (within or across multiple nodes of a cluster).

2. When using BPMN2 events, jBPM5 only allows you to send events to the process instances within a single session at a time.

There are two methods, one to signal one process instance and one to signal the ksession (all process instances), but the ksession can load process instances on the fly. Actually, by default, there are no process instances in a session, they are all loaded on demand. For example, if you complete some task, the session will be notified, it will automatically load the process instance in question and signal it. The same is done for other features like signaling, timers, etc.

You need to maintain a list of all the sessions which have incomplete process instances(*), and loop through them all to send events.

When signaling the session, the engine will figure out which process instances might be waiting for them and load them on the fly.

Therefore:

2a. you should execute all processes in as few sessions as possible, to lessen the number of iterations through this loop.

While I believe this isn't really necessary based on the previous argument (as a session can load process instances on the fly), this isn't per se a bad strategy (it's probably a good idea not to keep more sessions active at the same time than you need).

3. jBPM5 persists BPMN2 timer info in the session info, but the session must be active (ie. loaded from persistence) in order for the timers to activate.

Correct

Therefore:

3a. when your application starts, you must load all sessions that have active process instances that have timers(**).

Correct, we are working on a strategy that allows you to dynamically reload sessions on the fly for timers, which will be available in the next release. If you can limit the amount of active sessions that might contain timers though, this doesn't necessarily have to be a problem.

3b. you must not have the same session active in two different nodes of a cluster, or the same timers will expire around the same time

Yes, see 1

3c. when a node crashes, your application must detect this and reload the sessions that were active in the crashed node

If those sessions were holding timers, yes. This is actually a common approach: have N active sessions distributed over M nodes, if a node goes down, sessions are redistributed amongst the available nodes.
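The "N sessions over M nodes" approach can be sketched roughly as follows. This is only an in-memory illustration under stated assumptions: `SessionAssignments` and its methods are hypothetical names, not jBPM5 API, and how a crashed node is detected and how ownership is shared between nodes are left to the application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of jBPM5: tracks which node owns which session.
class SessionAssignments {
    // node name -> ids of the sessions that node currently keeps loaded
    private final Map<String, List<Integer>> byNode = new TreeMap<>();

    void addNode(String node) {
        byNode.putIfAbsent(node, new ArrayList<>());
    }

    // Give a session to the node that currently owns the fewest sessions.
    String assign(int sessionId) {
        if (byNode.isEmpty()) {
            throw new IllegalStateException("no nodes available");
        }
        String target = null;
        for (Map.Entry<String, List<Integer>> e : byNode.entrySet()) {
            if (target == null || e.getValue().size() < byNode.get(target).size()) {
                target = e.getKey();
            }
        }
        byNode.get(target).add(sessionId);
        return target;
    }

    // Remove a crashed node and hand its sessions to the surviving nodes.
    // Throws IllegalStateException if no node is left to take them.
    void nodeFailed(String node) {
        List<Integer> orphaned = byNode.remove(node);
        if (orphaned == null) {
            return;
        }
        for (int sessionId : orphaned) {
            assign(sessionId);
        }
    }

    // Copy of the session ids currently assigned to `node`.
    List<Integer> sessionsOn(String node) {
        List<Integer> ids = byNode.get(node);
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }
}
```

After `nodeFailed`, each surviving node would still have to load its newly assigned sessions from persistence (typically with `JPAKnowledgeService.loadStatefulKnowledgeSession`) so that their timers become active again.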

4. If you start a process instance in a session, that process instance must always be executed in that session.

No, a process instance can easily continue execution in a different session than the one it was created in

4a. when a node wishes to resume a process instance that was persisted, it must first (due to 3b) ask all other nodes if they have the session active, and if so instruct them to dispose it. It can then load the session, load and resume the process. All while preventing race conditions.

No, you can have the same session active on multiple nodes if you want. This means however that there is a risk of conflicting commits. If the risk of conflicting commits is high, this should be avoided. But there are situations where this would be totally acceptable. For example, consider a situation where you use a specific session per process instance. The chance of having multiple requests coming in for the same process instance is generally low, and therefore instantiating and disposing sessions on request, on whatever node the request arrives at, would be possible.
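When conflicting commits are rare, the usual way to cope with them is a small retry wrapper around each request. The sketch below assumes a hypothetical `ConflictException` standing in for whatever exception your persistence layer actually throws on a conflicting commit (an optimistic locking failure, for example); neither class is part of jBPM5.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the exception your persistence layer throws
// when another transaction committed a conflicting change first.
class ConflictException extends RuntimeException {
    ConflictException(String message) {
        super(message);
    }
}

class ConflictRetry {
    // Runs `work`, retrying up to maxAttempts times when it reports a conflict.
    // Rethrows the last conflict if every attempt fails.
    static <T> T run(Supplier<T> work, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ConflictException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.get();
            } catch (ConflictException e) {
                last = e; // someone else committed first; try again
            }
        }
        throw last;
    }
}
```

Each attempt would need to load the session and process instance afresh inside `work`, so that the retry sees the state committed by the competing request.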

4b. when a node receives an event it must (due to 2) carry out all the processing in 4a for each session with active process instances.

Not sure what you mean by this

(*) I don't think it's possible to know if a session has incomplete process instances..?

Depends on what you mean. As I explained in 2, a session doesn't keep all active process instances in memory; they are loaded on demand, and only the process instances that are currently executing are kept in the session. At any point, you can ask the session which process instances it is executing. If you want to know which process instances are active, we recommend using the history log for this (as depending on your situation this could be a very large number).

(**) I don't think it's possible to know if a session has timers..?

You can ask a session its idle time, which means the time it expects to be idle (typically until the next timer would fire). You can also ask the timer manager which timers it has if you need more detail

Note that for 6.0, we will be providing a lot more advanced session management out-of-the-box. You'll be able to use these session managers (in different configurations, like one singleton session or session per request etc.), and the jbpm-console (which currently uses one singleton session, or when used in a cluster one session per node) will also support these.

Thanks very much for your informative reply. In fact we found a way to make jBPM5 scale - but it was a lot of work. We wrote our own persistent timer implementation and execute each process instance in its own session. I will write up the details in another thread soon in case it helps someone.
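The details of that implementation are not given in this thread, so the following is not the poster's code. It is only a minimal sketch of the general idea of keeping timers outside the session: `ExternalTimerStore` is a hypothetical name, and the in-memory map stands in for what would have to be a database table if timers are to survive a node crash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical external timer store, not jBPM5 API. A real one would be backed
// by a database table so that timers survive a restart or a node crash.
class ExternalTimerStore {
    // due time (epoch millis) -> ids of the process instances to wake at that time
    private final TreeMap<Long, List<Long>> byDueTime = new TreeMap<>();

    synchronized void schedule(long processInstanceId, long dueAtMillis) {
        byDueTime.computeIfAbsent(dueAtMillis, k -> new ArrayList<>()).add(processInstanceId);
    }

    // Removes and returns every instance whose timer is due at or before nowMillis.
    synchronized List<Long> pollDue(long nowMillis) {
        List<Long> due = new ArrayList<>();
        Map<Long, List<Long>> expired = byDueTime.headMap(nowMillis, true);
        for (List<Long> ids : expired.values()) {
            due.addAll(ids);
        }
        expired.clear(); // the view writes through, so this removes them from the store
        return due;
    }
}
```

A poller could call `pollDue(System.currentTimeMillis())` periodically and, for each returned id, load the owning session and signal the instance (for example with `signalEvent(type, event, processInstanceId)`). With this arrangement no session has to stay loaded just to keep a timer alive.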

1a. it's not scalable to use a single session to execute all processes, or you would suffer contention on the session info

If you only have one session, there would be no contention as the ksession would solve this

I did not know this, and this could be the root of my misunderstanding. But I can't understand how the ksession could solve this. If process instance A executes in transaction Ta and process instance B executes in transaction Tb, and both transactions change the session info - which is simply a JPA entity - how do the two process instance executions NOT serialise?

Note that for 6.0, we will be providing a lot more advanced session management out-of-the-box.

This sounds good, because session handling with 5.x is very difficult for jBPM users to get right. It's difficult enough just to detect when a session can be safely disposed. I'm not sure how other users are doing it. Unless you're not using timers, not using persistence, and are happy to restart your application periodically, it's incumbent on jBPM users to write a lot of tricky session management code.

In fact, I can see how these sessions might be useful for Drools users, but from the point-of-view of a jBPM user - who just wants to execute processes - it's hard to see what the concept of sessions gives you? If you just want to execute business processes, why should you care about "sessions"?

If you only have one session, there would be no contention as the ksession would solve this

I did not know this, and this could be the root of my misunderstanding. But I can't understand how the ksession could solve this. If process instance A executes in transaction Ta and process instance B executes in transaction Tb, and both transactions change the session info - which is simply a JPA entity - how do the two process instance executions NOT serialise?

The engine will make sure request a is completed before starting request b.

In fact, I can see how these sessions might be useful for Drools users, but from the point-of-view of a jBPM user - who just wants to execute processes - it's hard to see what the concept of sessions gives you? If you just want to execute business processes, why should you care about "sessions"?

A session is the runtime interface to talk to the engine, so you do need it. But you are right, when you don't need any rule-related persistence etc., you probably don't need some of this "advanced" session management. In situations like this, the two most common architectures are:

* Session per request: if your session doesn't contain any state and you don't use any timers, you can just instantiate a session on request and dispose it afterwards

-> session mgmt is fairly simple, just create / dispose every time

* Singleton session (or as extension N sessions): you have a service that reuses the same session(s), and keeps these sessions alive at all time (to support timer execution), possibly distributed across a cluster of nodes
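The difference between the two architectures can be sketched as follows. `EngineSession` is a hypothetical stand-in interface with only a `dispose()` method, not the jBPM5 session type, and the two wrapper classes are illustrative names; the `synchronized` in the singleton variant reflects the point made earlier in the thread that requests against one session run one at a time.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for an engine session; not the jBPM5 interface.
interface EngineSession {
    void dispose();
}

// Session per request: create a session, use it, always dispose it.
class PerRequestSessions<S extends EngineSession> {
    private final Supplier<S> factory;

    PerRequestSessions(Supplier<S> factory) {
        this.factory = factory;
    }

    <R> R execute(Function<S, R> work) {
        S session = factory.get();
        try {
            return work.apply(session);
        } finally {
            session.dispose(); // nothing stays loaded between requests
        }
    }
}

// Singleton session: one long-lived session shared by all requests, kept
// alive so that its timers can fire between requests.
class SingletonSession<S extends EngineSession> {
    private final S session;

    SingletonSession(Supplier<S> factory) {
        this.session = factory.get();
    }

    // Requests are executed one at a time against the shared session.
    synchronized <R> R execute(Function<S, R> work) {
        return work.apply(session);
    }
}
```

The "N sessions" extension would hold several `SingletonSession` objects and pick one per request, for example by process instance id.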

The engine will make sure request a is completed before starting request b.

Ok, that's what I mean by serialisation - executing the process instances one at a time.

One final question! You said a process instance can easily continue execution in a different session than the one it was created in. Suppose I had facts inserted into my knowledge session. Those facts would then be available to all process instances executed using that session. But I want to have rule tasks in my processes, and for that the facts need to be private to the process instance. Is the only way to achieve that to have each process instance in its own session? Additionally, I suppose the same is true of globals - they are global to the session and available to all process instances executing in it?

If I understand your question about the Singleton Strategy, it depends on the use case. If you create a session per process instance it will work fine, I'm not sure why you mention that is not a practical strategy.

I thought the Singleton Session strategy (which Kris mentioned earlier) means one ksession per VM. Is that correct? If so, I am questioning whether that (and NOT session per process instance) is a practical strategy, because it is a "synchronize and suffer" model of concurrency and does not support parallel processing.

Of course, "session per process instance" and "session per request" might work if 1) an external system coordinates timer tasks, and 2) on every event the session info and process info are read from and written to the DB and unmarshalled into memory. Point 1 requires temporal task/event coordination to happen outside jBPM, in code that application developers write. So does jBPM really support BPMN 2.0 timer tasks? Because of point 2, anyone trying to run jBPM in a cluster for better performance gets another performance issue (network lag on the DB round trip, the Hibernate second-level cache, etc.).