New Feature: Workflow System Status

We have all been driving when a light comes on the dashboard. Sometimes it is a simple orange light, such as the low windshield washer fluid indicator. You should top that up, but you can likely keep driving without harm (unless you can no longer see the road). The dashboard might similarly show an orange check engine light. This usually means you need to get your car into the shop, but it isn't an immediate concern. Alternatively, the same light might show red, telling you that a serious problem has occurred in your engine and you need to stop driving now. In the recent RSA Identity Governance and Lifecycle 7.1 release, we have introduced a similar concept focusing on workflow system status.

The Admin > Workflow > Monitoring page shows a real-time view of the workflow system status. This includes graphs for how hard the system is working (Number of Items Serviced) and whether anything is backing up (Queue Size), along with system status indicators. The status indicators only appear when there is an issue. They not only surface the problem, they generally also offer a means to resolve it or at least to get more details. A status indicator shows a hand cursor if you can click it for more information on resolving the issue. In addition to the visual indicators, the system sends out admin errors with the appropriate status and information. Administrators can configure Notification rules to email these events to the appropriate administrator.

The system is configured to monitor the following conditions and surface workflow status indicators.

Verification (Count)

This status indicator counts the changes pending verification that are more than one month but less than 12 months old.

Thresholds

Warning - 100 changes

Error - 500 changes

Critical - 1000 changes
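
As a rough illustration of how these thresholds translate into a status, the Python sketch below maps a count of pending verifications to a severity level. The function name `verification_count_status` is hypothetical, and treating each threshold as inclusive (a count equal to the threshold triggers that level) is an assumption; the product's actual comparison is not stated here.

```python
from typing import Optional

# Default thresholds from the list above (number of pending changes).
WARNING_THRESHOLD = 100
ERROR_THRESHOLD = 500
CRITICAL_THRESHOLD = 1000


def verification_count_status(pending_changes: int) -> Optional[str]:
    """Return the severity for a pending-verification count, or None if healthy.

    Assumption: thresholds are inclusive, so exactly 100 changes is a Warning.
    """
    # Check the most severe level first so the highest matching level wins.
    if pending_changes >= CRITICAL_THRESHOLD:
        return "Critical"
    if pending_changes >= ERROR_THRESHOLD:
        return "Error"
    if pending_changes >= WARNING_THRESHOLD:
        return "Warning"
    return None  # no indicator is shown when there is no issue
```

Checking from the most severe level downward means a count of 1,200 reports only Critical rather than all three levels.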

Resolution

This status indicator lets you click through to a screen that lists the changes awaiting verification. The verifications will be resolved by future collections, or an administrator can cancel a change on this screen to remove the verification.

Verification (Age)

This status indicator determines if there are any changes pending verification that are older than n months, where n is the threshold for each level below.

Thresholds

Warning - no warning by default

Error - There are changes older than 6 months that haven't been verified

Critical - There are changes older than 12 months that haven't been verified
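
The age check can be sketched in the same spirit. The Python example below takes the date of the oldest pending change and reports a level based on whole calendar months elapsed. The helper names `months_between` and `verification_age_status` are hypothetical, and counting a change as "older than n months" once n whole calendar months have passed is an assumption, not documented product behavior.

```python
from datetime import date
from typing import Optional

# Default thresholds from the list above; there is no Warning level by default.
ERROR_AGE_MONTHS = 6
CRITICAL_AGE_MONTHS = 12


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not complete yet
    return months


def verification_age_status(oldest_pending: date, today: date) -> Optional[str]:
    """Return the severity for the oldest pending change, or None if healthy."""
    age = months_between(oldest_pending, today)
    if age >= CRITICAL_AGE_MONTHS:
        return "Critical"
    if age >= ERROR_AGE_MONTHS:
        return "Error"
    return None
```

Only the oldest pending change matters for this indicator, since a single sufficiently old change is enough to raise the level.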

Resolution

This status indicator lets you click through to a screen that lists the changes awaiting verification. The verifications will be resolved by future collections, or an administrator can cancel a change on this screen to remove the verification.

Queue Backup

This is a series of status indicators (one for each priority queue type) that will show if work is backing up in that queue.

Thresholds

Warning - 1000 ms (1 second) by default

Error - 2*60*1000 ms (2 minutes) by default

Critical - 4*60*1000 ms (4 minutes) by default
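
Because these defaults are written as millisecond products, it can help to evaluate them and express them as durations. The short Python sketch below does only that arithmetic; the dictionary name `QUEUE_BACKUP_THRESHOLDS_MS` and the helper `describe_ms` are hypothetical and do not correspond to any product setting.

```python
from datetime import timedelta

# Default thresholds from the list above, kept in the same product form.
QUEUE_BACKUP_THRESHOLDS_MS = {
    "Warning": 1000,
    "Error": 2 * 60 * 1000,
    "Critical": 4 * 60 * 1000,
}


def describe_ms(milliseconds: int) -> str:
    """Format a millisecond value as H:MM:SS using timedelta's string form."""
    return str(timedelta(milliseconds=milliseconds))


for level, threshold_ms in QUEUE_BACKUP_THRESHOLDS_MS.items():
    print(f"{level}: {threshold_ms} ms = {describe_ms(threshold_ms)}")
```

The Error and Critical defaults evaluate to 120,000 ms and 240,000 ms respectively.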

Stalled Workflows

This status indicator determines if there are any workflows marked as stalled.

Thresholds

Warning - 0

Error - 50

Critical - 100

Workflows should never be marked as stalled, so even a single stalled workflow is considered a warning.

Resolution

This status indicator allows you to click through to see the stalled workflow jobs. In general, a stalled workflow needs to be examined more closely to see if there is some flaw in the business logic. A stalled workflow indicates something took longer than expected. From this screen you can also evaluate the workflow(s) to see if they can proceed.

Database Connections

Thresholds

Critical - Any exception thrown by the workflow engine indicating that it can no longer communicate with the database

Resolution

Clicking this status indicator icon opens a dialog where an administrator can check whether the workflow engine can communicate with the database. If the connection is successful, the status indicator is cleared and an admin error is logged for the change of status.