Workflows stuck at Pause

As part of my defensive workflow design (and as advised by Nintex Support), I've added a Pause before each task in a workflow. Now I sometimes see workflow instances held at that Pause for hours, even days, and possibly never continuing. (The Pause is set for 1 minute in business hours.) And users never find out about this unless someone specifically goes to check the workflow history.

In this context I have 2 questions:

1. Is there a way to find out which workflows are held at that Pause, for example a report of all In Progress workflows in which the current action is Pause?

Re: Workflows stuck at Pause

Are you actually putting a "Pause" action before every "Assign Flexi task" action? Was it recommended as part of a defensive design? I have seen "Commit Pending Changes" used more often as part of a defensive design, since it completes anything that's pending at the SP level before giving control back to Nintex.

Re: Workflows stuck at Pause

Just to clarify: you have tried restarting it and it did not work, correct? Even manually restarting the SharePoint timer job? It's one of those glitches, I guess.

I guess the main point on which I wanted more thoughts from people in the forum was actually adding a "Pause" action before every Flexi task. I haven't seen this practice before, as "Wait for Item update" and "Pause" sometimes create issues and can just wait indefinitely. You could search the forum for similar issues that people have faced, and there would be many.

If you are okay to query or code, you could go to the Nintex workflow history log (Site Settings -> Nintex workflow history), which logs all the events. A Pause is logged with Event Type 11 (similar to comments), and if you have not modified the comment in the Pause action, the default message is "Pausing for 5 mins". The Nintex workflow history log also has the columns List ID (GUID of your SP list) and Primary Item ID (ID of the list item on which the workflow is running). When the pause completes, it adds another event, "Pausing complete". So the data is there if you want to build something out of it.
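For anyone who exports that history log, here is a minimal Python sketch of the pairing idea: a "Pausing for ..." event that never gets a later "Pausing complete" for the same List ID and Primary Item ID is a candidate stuck pause. The column names "Date Occurred" and "Description", the ISO timestamp format, and the function names are assumptions for illustration; adjust them to whatever your export actually contains.

```python
import csv
import io
from datetime import datetime, timedelta


def read_history(csv_text):
    """Parse an exported history log (CSV text) into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def find_stuck_pauses(rows, now, max_age_minutes=60):
    """Return (list_id, item_id, pause_start) for pauses that never completed.

    Assumed columns: "List ID", "Primary Item ID", "Date Occurred" (ISO 8601)
    and "Description" (the event message). These names are hypothetical.
    """
    open_pauses = {}
    ordered = sorted(rows, key=lambda r: datetime.fromisoformat(r["Date Occurred"]))
    for row in ordered:
        key = (row["List ID"], row["Primary Item ID"])
        message = row["Description"]
        if message.startswith("Pausing complete"):
            # The pause finished, so it is no longer a candidate.
            open_pauses.pop(key, None)
        elif message.startswith("Pausing for"):
            open_pauses[key] = datetime.fromisoformat(row["Date Occurred"])
    cutoff = now - timedelta(minutes=max_age_minutes)
    # Keep only pauses that started before the cutoff and are still open.
    return sorted(
        (list_id, item_id, started)
        for (list_id, item_id), started in open_pauses.items()
        if started <= cutoff
    )
```

Called as `find_stuck_pauses(read_history(text), now=datetime.now())`, it lists the items whose last pause is older than an hour and still open. This only reads an export, so it does not touch the Nintex or SharePoint databases.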

Btw, you also have the Nintex DBs (which I believe we should not touch, similar to the SP content DBs). But if you really want to get some reports, you could do it in a non-production env, or restore a backup of the production DBs on a separate DB server and extract the history from it.

Re: Workflows stuck at Pause

I used the above PowerShell commands to restart the Timer Service and yes, the workflows are still stuck on the Pause. I am not sure if that's the best way to do it, and it's certainly not the only one. Is this what you mean by "manually restarting"?

I went over all the workflows and removed the pauses. I left a Pause only at the beginning of each workflow (this is recommended for sure).

Thanks for the hint on the History list. I can't code but I exported it to Excel and am trying to narrow down the results.

Re: Workflows stuck at Pause

Thank you for your advice. I ended up removing the Pauses and leaving only the one at the beginning of the workflow. Not the greatest solution, obviously, but the whole process is split into several workflows and is currently working properly. I may try to simulate that behaviour in a test environment and explore the options you suggested.

Re: Workflows stuck at Pause

I think the core issue would be to investigate the RAM on the server, and how many workflows you have running at once. The KnowYourWorkflow tool can help with that. If pauses are helping your workflows to function, such as the one at the beginning, it is because the workflows are then processed by the timer service (owstimer) instead of the app pool where all workflows originate (w3wp). But there are per web app limits, and if multiple web apps are using the same app pool, there are even more restrictions. Any work that hits a limit on the IIS worker process, or app pool, is queued in the database for the timer service to pick up later. And it's a FIFO queue.

IIS - Needs RAM, has a 15 workflow process bucket available at once. Add RAM, or split web apps per app pool. It is not recommended to change these system settings and manipulate sizes and timings via PowerShell. Another option is to have more WFE servers, as user load will be split across the web app app pools on the different servers. So you get a 2x scale when adding another server.
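To picture the 15 workflow bucket and the FIFO queue described above, here is a toy Python model. It is only an illustration of the idea, not how SharePoint or Nintex actually schedules work; the function names and the `batch_size` parameter are made up for the example.

```python
from collections import deque


def dispatch(workflow_ids, capacity=15):
    """Split incoming workflows into those run immediately and those queued.

    `capacity` stands in for the 15 workflow bucket mentioned above.
    """
    running = []
    queued = deque()
    for wf in workflow_ids:
        if len(running) < capacity:
            running.append(wf)   # fits in the bucket, runs right away
        else:
            queued.append(wf)    # over the limit, waits for the timer service
    return running, queued


def drain(queued, batch_size):
    """Simulate the timer service picking up queued work, oldest first (FIFO)."""
    return [queued.popleft() for _ in range(min(batch_size, len(queued)))]
```

With 20 workflows arriving at once, `dispatch(range(1, 21))` runs the first 15 and queues 16 to 20, and `drain` then hands them back in arrival order. Every extra Pause sends a workflow back through that queue, which is why over-pausing backs things up.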

Timer Service - does really well working on workflows, a bit faster too. Needs RAM and good database management. Timer services run on nearly all WFE and App servers.

If pausing at the right time helps, you may need to tune IIS. If pausing goes a bit crazy at times, then server tuning, database tuning can help. But, note, that you can over pause workflows. Unnecessarily pausing causes too many queues. Think of it like driving downtown. You get farther when there are more green lights. The more red lights give you time to drink your coffee without spilling, but you don't get to your destination quickly and you back up everyone behind you.