The triangles algorithm uses an unneeded extra array

Description

The triangles algorithm currently creates two arrays to calculate the accumulation array of the work to be done: work and accum_work. This was presumably done because there were problems getting parallelization to happen if the work array was not precalculated. However, the accumulation can be done in a single loop without the work array. This will reduce memory usage for this step from 2n + 1 elements (n for work plus n + 1 for accum_work) to n + 1.
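A minimal sketch of the two approaches, for illustration only. It is written in Python rather than the language of the actual code, and the names work_for, degrees, accum_work_two_arrays and accum_work_single_loop are hypothetical, as is the per-vertex cost model. Only the array names work and accum_work come from this ticket. The loops here are serial; the sketch does not attempt to reproduce the parallelization.

```python
def work_for(degree):
    # Hypothetical cost model (an assumption, not the real one):
    # number of neighbor pairs to examine for a vertex of this degree.
    return degree * (degree - 1) // 2


def accum_work_two_arrays(degrees):
    # Current approach: precalculate work (n entries), then
    # accumulate into accum_work (n + 1 entries): 2n + 1 in total.
    n = len(degrees)
    work = [work_for(d) for d in degrees]
    accum_work = [0] * (n + 1)
    for i in range(n):
        accum_work[i + 1] = accum_work[i] + work[i]
    return accum_work


def accum_work_single_loop(degrees):
    # Proposed approach: compute each work value on the fly, so only
    # accum_work (n + 1 entries) is allocated.
    n = len(degrees)
    accum_work = [0] * (n + 1)
    for i in range(n):
        accum_work[i + 1] = accum_work[i] + work_for(degrees[i])
    return accum_work
```

Both functions should return the same accumulation array for the same input; the second simply never allocates the intermediate work array.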

Backed out the change for this in r3309. It turns out that without the work array, a compiler bug skews the results. Jon is writing up a bug report to send to Cray about this. For now, I will reopen this bug.