ALH Group

started a topic
almost 3 years ago

Better management of Failed Jobs

Would be great to be able to properly manage/handle failed jobs.

Currently, if a job fails while we are setting up a recipe and it can never succeed because it should not have run in the first place (a wrong trigger or a mistake in the recipe), we would like to be able to delete it.

That would let us effectively 'manage' failed jobs, so that the red failed-job count on a recipe represents alerts we still need to action rather than something that can never be resolved.

Currently I have no way of marking a failure as looked at and resolved. Instead it's a red alert number forever glaring at me...

5 people like this idea

Kai

said
almost 3 years ago

Hi there! Thanks for pointing this out, we appreciate your feedback! May I ask whether the red marking is not a good reminder that a job has failed? Did you try rerunning the failed job once you had resolved the issue? The red mark should disappear if the issue in the recipe is indeed resolved :) What do you think of adding more factors to the filter?

Jeremy Farber

said
over 2 years ago

Also, being able to delete failed jobs would be great. I'd like to not have any failed jobs, and the only way to get rid of failed jobs is to copy the recipe and delete the old one.

5 people like this

SPS Admin

said
over 2 years ago

I second Jeremy's suggestion. It would be great to clear out failed jobs and get a better idea of what's going on when a large job is run.

3 people like this

Chris Gooding

said
over 2 years ago

I couldn't agree more. It's hard to keep it organized when you have errors showing that you don't need to rerun but also can't delete.

2 people like this

Cameron Blackmon

said
over 2 years ago

I agree completely with what everyone is saying here!

1 person likes this

Allan Teng from Workato

said
over 2 years ago

Hi all, instead of deleting jobs, would it be a better approach to allow you to mark the jobs as completed manually?

Meanwhile, there is a way to accomplish this. See recipe logic below.

Basically, the first step of the recipe is to get the latest information for the trigger record. From there you can add a condition around this new get-details step so that the recipe stops successfully when the job should not proceed.
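
For readers who think in code, the guard described above might look roughly like this in Python. This is only an analogy for the recipe steps, not Workato's API: `run_job`, `fetch_latest_record`, and the `id` and `status` fields are hypothetical names chosen for illustration.

```python
from typing import Callable, Optional


def run_job(
    trigger_record: dict,
    fetch_latest_record: Callable[[str], Optional[dict]],
) -> str:
    """Analogy for a recipe whose first step re-fetches the trigger record."""
    # Step 1: get the latest information for the record that fired the trigger.
    latest = fetch_latest_record(trigger_record["id"])

    # Step 2: condition around the get-details step. If the record is gone or
    # no longer qualifies (hypothetical rule: status must be "open"), stop the
    # job successfully instead of letting a later step fail.
    if latest is None or latest.get("status") != "open":
        return "stopped_successfully"

    # Step 3: the rest of the recipe runs only for records that still qualify.
    return "processed"
```

The idea is that a job that should never have run ends as a successful no-op, so it does not add to the red failed count.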

Jeremy Farber

said
over 2 years ago

For me, failed jobs are more about keeping up with issues and knowing that they have been resolved. Looking at a recipe and seeing a gray zero for failed jobs rather than a red number tells me that all is good. I like to think of failed jobs as cases. I think marking them completed would work great.

I just want to look at a recipe and see no red. If that means manually changing a job to completed, that would be fine. If a job is failing a lot because of testing, then once it's running correctly I will usually just copy it and delete the original so that I'm starting at zero failed jobs.

Having a dashboard or consolidated lists of all failed jobs would be even better.

1 person likes this

Allan Teng from Workato

said
over 2 years ago

That makes sense Jeremy. Let me see what we can do.

Jeremy Farber

said
over 2 years ago

Something to add on this. I've noticed that if you add error monitoring to a recipe, the job no longer shows up as failed. I would change this, because the job still failed even though it completed the error-monitoring actions. Given that the only way to rerun a job is manually, you would still need to take some action on the job. When you use this feature you can't filter for failed jobs anymore. Maybe an option on the Monitor for errors action could let you choose whether it is used that way or not.

Justin Ng

said
about 2 years ago

Thanks for highlighting, Jeremy.

Here are two recommendations that can help in such a situation.

1) If you would like to fail the job after the job is processed, you can process error handling logic in 'On error' (or not). Finally, add Stop job with error after the monitor.

The error catch within the error monitor can be configured to send an email or log a message including information about the failed job/error for your rectification. It's up to you to map the fields. You are right to say that the job will continue without failing, until it reaches the Stop job with error action.

Use case: You are processing a list of 100 items within the error monitor and an error is encountered on the 1st item. An error message is logged, and the job continues with the other 99 items, before failing at the end.

2) If you would like to fail the job immediately after an error is caught by the monitor, nest Stop job with error inside the 'On error' block.

Use case: You are processing a list of 100 items within the error monitor and an error is encountered on the 1st item. An error message is logged, and the job fails immediately.
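
As a rough analogy only, the two options above map onto a familiar try/except pattern. The Python sketch below is not Workato code. It assumes the monitor wraps each item individually and that the final stop in option 1 fires only when at least one error was logged. `JobFailed`, `run_fail_at_end`, `run_fail_fast`, `handle`, and `log` are hypothetical names.

```python
from typing import Callable, Iterable, List


class JobFailed(Exception):
    """Stands in for the 'Stop job with error' action."""


def run_fail_at_end(
    items: Iterable[int], handle: Callable[[int], None], log: List[str]
) -> None:
    """Option 1: log each error, keep going, fail once the list is done."""
    errors = 0
    for item in items:
        try:                      # 'Monitor' block
            handle(item)
        except Exception as exc:  # 'On error' block: log and carry on
            log.append(f"item {item}: {exc}")
            errors += 1
    if errors:                    # 'Stop job with error' placed after the monitor
        raise JobFailed(f"{errors} item(s) failed")


def run_fail_fast(
    items: Iterable[int], handle: Callable[[int], None], log: List[str]
) -> None:
    """Option 2: log the error, then fail the job straight away."""
    for item in items:
        try:                      # 'Monitor' block
            handle(item)
        except Exception as exc:  # 'On error' block with the stop nested inside
            log.append(f"item {item}: {exc}")
            raise JobFailed(f"stopped at item {item}") from exc
```

With 100 items where only the first one errors, `run_fail_at_end` still hands the other 99 to `handle` before raising `JobFailed`, while `run_fail_fast` raises as soon as the first item errors.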

===

Using Action with error monitors together with Stop job with error actions allows you to pick up errors and rectify your recipes quickly. It also gives you better control over which jobs complete or fail by letting you decide how your recipe handles errors.

We hope this helps with your workflow. Have a go!

1 person likes this

Brad Eisenberg

said
over 1 year ago

I agree with so many of the ideas in this thread. Error monitoring is a real pain with existing functionality, particularly when running lots of recipes and lots of jobs. If I can't stay on top of re-running every single job with an error (which, as many people point out, are often resolved manually instead), then the red error count becomes useless.

I'd very much appreciate a centralized dashboard for Jobs rather than having to open each recipe individually to see if there are any "new" errors.

Alternatively, the count of recipe Jobs and Errors should be more dynamic or time-sensitive so that I can see what's happened over the past day/week/month rather than seeing the all time count, which is pretty useless.