PurgeAccumulator

What is the PurgeAccumulator?

Hit statistics for Blackboard are recorded in the ACTIVITY_ACCUMULATOR table, which gets very large over time. The PurgeAccumulator job is designed to copy this data over to another database schema intended purely for reporting purposes and to maintain (purge) such hit data from the production schema after it has aged out (30 days). As a side effect, it also calculates some summary data for the day's activity.

How do you run the PurgeAccumulator?

This is one of the scheduled tasks scheduled from config/bb-tasks.unix.xml for Unix operating systems and from config/bb-tasks.windows.xml for Windows operating systems in 9.1 environments and in config/bb-tasks.xml in other Bb versions with the following stanza:

xml.registered.delay - time of day (24-hour clock) when task will run (but not earlier than delay parameter specifies, i.e. if at least 6 hours after JVM startup haven't gone by, it won't run at this hour)

days_to_keep - number of days of stats activity to keep in production (bb_bb60) schema as well as in reporting (bb_bb60_stats) schema (parameter specific to this task only)

target - whether to purge live or stats data - must always be "live" in bb-tasks.xml (parameter specific to this task only)

How does the PurgeAccumulator work?

The PurgeAccumulator task executes a number of stored procedures while logged into the database as the BB_BB60_STATS user.

As first part of this task the daily_system_tracking procedure is executed in the BB_BB60 schema, which calculates today's overall system statistics and inserts them into the system_tracking table. If this procedure is not executing, the General System Statistics overview in the GUI shows zeroes. This is the easiest way to determine from the GUI that there was a problem with PurgeAccumulator.

Afterwards it copies the most recent hit data from the production database into the stats DB using stored procedure bb_bb60_stats.sync_activity_accumulator. On Oracle, this procedure works in batches of 2000 records.

Then it drops all data from all other tables in the BB_BB60_STATS schema (by truncating the tables) and copies the current data from the production DB (BB_BB60 schema) over. In the case of virtual instances (VIs), this works similarly for the BB_VINAME and BB_VINAME_STATS schemata. This generates a lot of redo/transaction logging, since some of these tables are quite large. The tables are:

application

course_main

course_roles

course_users

data_source

institution_roles

navigation_item

system_roles

system_tracking

users

user_roles

Finally, the purge_accumulator procedure is executed in the BB_BB60 (production) schema, with the days_to_keep value as only parameter. This deletes all hit data older than days_to_keep days from the production DB. It works in batches of 2000 records at a time.

Schedule PurgeAccumulator

Unless commented out in this file, each appserver and each tomcat node will try running this task, but only one of them should run per day. Schedule PurgeAccumulator manually.

The first of these is the normal PurgeAccumulator run scheduled nightly, and the second purges historical tracking data from the stats DB if one finds it too old and stale. That second method is only used during manual invocations of the tool.

The second parameter to the tool ("bb_uid") is always bb_bb60, except for institutions running multiple virtual installations.

The third parameter (here "180") is the number of days of stats to keep in the BB_BB60 (production) database schema. The synchronization step run as part of purge-live will always sync the BB_BB60_STATS schema up to the day of running the tool, this is not controllable. You can optionally enter a specific cutoff date in YYYY-MM-DD format instead, meaning "purge everything older than this date."

The tool can also be invoked manually with a "purge-stats" option instead of "purge-live"; this deletes old records from the BB_BB60_STATS reporting schema irrevocably when you do not need them anymore. When invoked with this option, the tool does not do any synchronizing (it does not run the daily_system_tracking or sync_... procedures). For better distinction from the purge-live method this option is designed to take a cutoff date parameter (purge everything older than this date), but a days_to_keep value can also be used instead. Example:

To schedule PurgeAccumulator externally through the Windows Task Scheduler rather than running it as a background task from bb-tasks.xml/bb-tasks.windows.xml, first make sure the task is commented out in bb-tasks.xml/bb-tasks.windows.xml through regular HTML-style comment tags, then schedule the blackboard\tools\admin\PurgeAccumulator.bat script with these same options in Task Scheduler for 1:00 am every day.

On Unix you need to comment out the task in bb-tasks.xml. You can then either schedule the above command in bbuser's crontab (preferred, but requires bbuser to be among the users allowed to use cron), or use an "su" construct to schedule it in the root crontab file (the example here schedules it at 1:00 am, the normal time when PurgeAccumulator would run from bb-tasks.xml as well):

Normally the task is scheduled from the config/bb-tasks.xml task scheduler configuration file as (Unix syntax).

Verify PurgeAccumulator operation

The PurgeAccumulator logs to logs/update-tools/update-tool-log.txt on the appserver that ran it, so you can see each night whether it ran and where. It logs each step of its operation, and if it skips synching with the stats database due to licensing, that is correct on a Basic system.

To verify whether the tool ran correctly and completely, log into the database as BB_BB60_STATS user, and run the following queries:

We recommend to run each query below separately rather than in a single batch.

On Oracle, a faster alternative (assuming the table is analyzed regularly) is:

SELECT NUM_ROWS,LAST_ANALYZED FROM ALL_TABLES WHERE TABLE_NAME='ACTIVITY_ACCUMULATOR' AND OWNER='BB_BB60_STATS';

Under correct operation the first section (production DB) should show the oldest record (min(timestamp)) having a date from 180 days ago (or whatever number of days to keep you have configured), and the newest record in the second section (stats DB) should be from today, meaning the sync has completed. If the sync finished but the purge failed, you would still have records until today in the stats DB, but would have too-old records in the production DB.