There's a time limit within the IAMgarbageCollectTaskPersistence procedure hard-coded as one hour, which will prevent it from taking up too much sql resources with a this long running task, why not also include record count as a parameter to this procedure rather than just a cutoff time. Cutoff time is ambiguous and it will be easier for DBA's to decipher what are appropriate parameters to pass, especially when running this on a task persistence database in the millions of records. It already accepts cut-off date, but perhaps it could instead to be run in batches, similar to how the bulk loader client works to scale smaller batches of submissions into the IME via TEWS.

Additionally Garbage Collect Statistics table (iam_gcstats) lists all the cleanup data for the previous run, but instead produce these statistics prior to initiating the cleanup and provide feedback when trying to clean up too large of a load. Once a record count could be accounted for then it will be easier for a sql developer to automate the whole thing in a nightly sql job by passing a resonable amount of records to be cleaned per run

run_id

finish

tasksession12_5

archive_event12_5

archive_event_object12_5

object12_5

runtimeStatusDetail12

event12_5

event_object12_5

lock12_5

archive_tasksession12_5

archive_runtimeStatusDetail12

archive_object12_5

runtimeStatusDetailAttribute12

archive_rsdAttribute12

2016-04-29 16:20:38.967

2016-04-29 16:20:40.100

9

41

47

23

11

32

9

41

47

23

11

0

0

Whether or not they’re nested tasks, or just records scattered throughout the various View Submitted tables, the records being archived / purged will be much more than tasksession12_5 root record count used (or calculated by trying various cutoff_times). Suggest using a smaller record counts because their will be diminished performance benefits to using a larger record counts with this new IAMgarbageCollectTaskPersistence Stored procedure The impact on CA Identity Manager will be less, if the duration of each run is shorter, but the frequency of each run is greater. I.E. loop the calls to procedure by sequentially passing new cutoff_times for a set duration. (for a few hours during the middle of the night).

To get comfortable with record count size to use and how long each IAMgarbageCollectTaskPersistence execution takes, it would be much easier if garbage collection statistics were outputted to the statistics table prior to the run and even better if a supplemental parameter of record count could be passed, which would cause the procedure to loop through the clean up in record count chunks.