This document is taken from a presentation Mark O'Neil delivered at the 2017 Teaching & Learning Conference in Milan, Italy. You can download the presentation and script files at the end of this document.

Complete Refresh vs. Store/Delete

Complete Refresh

Explicit Store and Delete

You pass all your data – every single record – into a black box.

Learn chews on data and takes actions based on data differences.

Complete Refresh takes a file and manages the deletes based on the differences between the data provided and what is stored in Learn. This has substantial overhead, as every record must be compared to determine whether it differs from what is stored (update), is new (store), or is in Learn but not in the data file (delete).
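The comparison described above can be sketched in a few lines of shell. This is only an illustration of the three-way decision, not how Learn implements it: the file names stored.txt and feed.txt, the key|name record layout, and the sample records are all assumptions made for the example.

```shell
#!/bin/bash
# Hypothetical sample data: stored.txt stands in for what Learn already holds,
# feed.txt for the complete file being passed in. Layout is key|name.
workdir=$(mktemp -d)
printf '%s\n' 'u1|Ann' 'u2|Bob' 'u3|Cy' > "$workdir/stored.txt"
printf '%s\n' 'u1|Ann' 'u2|Robert' 'u4|Dee' > "$workdir/feed.txt"

# Pass 1 (NR == FNR) loads the stored records by key.
# Pass 2 classifies each feed record as store or update.
# The END block reports stored keys that never appeared in the feed.
actions=$(awk -F'|' '
  NR == FNR { stored[$1] = $0; next }
  {
    seen[$1] = 1
    if (!($1 in stored))        print "store " $1
    else if (stored[$1] != $0)  print "update " $1
  }
  END { for (k in stored) if (!(k in seen)) print "delete " k }
' "$workdir/stored.txt" "$workdir/feed.txt" | LC_ALL=C sort)

echo "$actions"
rm -rf "$workdir"
```

Every record on both sides is read and compared even though only three need any action, which is the overhead in question.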

Complete Refresh is often easier from an SIS extraction as you pull all the relevant data and create the feed file.

Explicit Store and Delete

You pass in only the data to act on, decreasing data scale.

You pass in explicit data sets – and Learn only adds/updates records

Using Store or Delete gives you explicit control over granular sets of data. This is substantially faster as only the data present in the file is acted upon.

Store and Delete files are more difficult to generate, as the SIS side must compare successive data pulls and produce feed files containing only the differences between them. However, the efficiency gained in processing often makes Store and Delete the more worthwhile method.

Running sedRemoveUpdates.sh removes the updates (which were matched in the diff) from DISABLE.txt, where sedRemoveUpdates.sh is:

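#!/bin/bash
# Remove from DISABLE.txt every line whose key (the first pipe-delimited
# field) still appears in sortedNew.txt: those records are updates, not deletes.
while IFS= read -r line; do
    matchMe=$(cut -d'|' -f1 <<< "$line")
    # Skip blank lines so an empty pattern never reaches grep or sed.
    [ -z "$matchMe" ] && continue
    echo "$matchMe"
    # -q suppresses output; -F treats the key as a literal string.
    if grep -qF -- "$matchMe" sortedNew.txt; then
        # Note: this deletes any line that contains the key as a substring.
        # sed -i writes a new file, so the loop keeps reading the original.
        sed -i.bak "/$matchMe/d" ./DISABLE.txt
    fi
done < DISABLE.txt
exit 0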

DISABLE.txt now contains only those EXTERNAL_PERSON_KEYs that are not in the sortedNew.txt file.
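As a cross-check on that result, the same set of keys can be sketched without a loop by comparing key lists directly. This is an alternative technique, not the script used in the presentation. It assumes the previous pull is available as a file called sortedOld.txt (a hypothetical name) and that the key is the first pipe-delimited field; the sample records are invented.

```shell
#!/bin/bash
# Hypothetical sample pulls; the key is the first pipe-delimited field.
workdir=$(mktemp -d)
printf '%s\n' 'u1|Ann' 'u2|Bob' 'u3|Cy' > "$workdir/sortedOld.txt"
printf '%s\n' 'u1|Ann' 'u2|Robert' 'u4|Dee' > "$workdir/sortedNew.txt"

# Extract and sort the keys from each pull. comm requires sorted input,
# and LC_ALL=C keeps sort and comm in agreement on the ordering.
cut -d'|' -f1 "$workdir/sortedOld.txt" | LC_ALL=C sort -u > "$workdir/oldKeys.txt"
cut -d'|' -f1 "$workdir/sortedNew.txt" | LC_ALL=C sort -u > "$workdir/newKeys.txt"

# comm -23 keeps only lines unique to the first file:
# keys that were in the old pull but are missing from the new one.
disable_keys=$(LC_ALL=C comm -23 "$workdir/oldKeys.txt" "$workdir/newKeys.txt")

echo "$disable_keys"
rm -rf "$workdir"
```

Because whole keys are compared rather than substrings, u1 would not be confused with u10, which is a risk with pattern-based deletion.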

We can then use a similar process, reversing the diff to find updated users, to create a smaller Store file. Copying only the new and updated entries from a slightly altered sortedNew file to a STORE.txt file gives a file that reflects only the additions and updated lines in sortedNew.txt.
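One way to sketch the Store side is with comm, which can report the lines present only in the newer pull. This is an assumed approach, not necessarily the one used in the presentation's script files. sortedOld.txt is a hypothetical name for the previous pull, the sample records are invented, and any header row a feed file needs would still have to be added to STORE.txt separately.

```shell
#!/bin/bash
# Hypothetical sample pulls, sorted with the same locale that comm will use.
workdir=$(mktemp -d)
printf '%s\n' 'u1|Ann' 'u2|Bob' 'u3|Cy' | LC_ALL=C sort > "$workdir/sortedOld.txt"
printf '%s\n' 'u1|Ann' 'u2|Robert' 'u4|Dee' | LC_ALL=C sort > "$workdir/sortedNew.txt"

# comm -13 suppresses lines unique to the old file (column 1) and lines
# common to both files (column 3), leaving the lines found only in
# sortedNew.txt: brand-new records and records whose content changed.
LC_ALL=C comm -13 "$workdir/sortedOld.txt" "$workdir/sortedNew.txt" > "$workdir/STORE.txt"

store=$(cat "$workdir/STORE.txt")
echo "$store"
rm -rf "$workdir"
```

With this sample data, the unchanged u1 record is left out, so the Store file carries only the records Learn needs to add or update.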