Re: [aa-ftt] testing excel workbooks [1 Attachment]


Message 1 of 9
Mar 15, 2012

0 Attachment

Hi Ward,

For changes to the scoring mechanism, it might be worthwhile to create test material data to more easily demonstrate intentional or unintentional changes to the score. Something like:

Create data for 2 fictitious materials, where the measurements are identical except for one measurement. Add the materials to the old and the new spreadsheet, and check that the relative score is the same (the material with the higher score in the old spreadsheet also has the higher score in the new one).

Create pairs or sets of entries for materials that differ in other measurements, or in sets of measurements. Check the ordering, percent difference, etc. of the scores in the old and new spreadsheets.

If there are thresholds in the formulas for certain values, create materials with values on either side of the thresholds.

And so on.
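The first of these checks could be sketched in a few lines of Python. The scoring functions and measurement names below are hypothetical stand-ins (the real formulas live in the Excel workbooks); the point is the shape of the check, not the formulas themselves:

```python
# Sketch: relative-order check for two fictitious test materials.
# score_old / score_new are hypothetical stand-ins for the old and
# new workbook scoring formulas.

def score_old(m):
    return 2.0 * m["toxicity"] + 1.0 * m["energy"]

def score_new(m):
    # e.g. suppose the reviewers increased the weight on energy
    return 2.0 * m["toxicity"] + 1.5 * m["energy"]

# Two fictitious materials, identical except for one measurement.
mat_a = {"toxicity": 3.0, "energy": 5.0}
mat_b = {"toxicity": 3.0, "energy": 7.0}   # differs only in energy

def higher(score, a, b):
    return "a" if score(a) > score(b) else "b"

# The material that scores higher under the old formula should
# still score higher under the new one.
assert higher(score_old, mat_a, mat_b) == higher(score_new, mat_a, mat_b)
```

The same pattern extends to the threshold check: generate materials just below and just above each threshold and confirm the scores jump only where they should.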

-Dave

On Thursday, March 15, 2012 at 1:58 PM, Ward Cunningham wrote:

Friends -- Thank you for your suggestions. I've condensed them into this suggestion:

1. Write a program that compares the old and new workbooks

2. Classify differences into four categories:

a. small numeric changes

b. large numeric changes

c. string changes

d. formula changes

3. Tally the number of each kind of change according to workbook, sheet, and column

4. Inspect areas of dense change -- confirm that change is reasonable based on new functionality

5. Inspect areas of unexpected change (or lack of change) -- these could be typos
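Steps 1 through 3 could be sketched roughly as follows. This is only an illustration of the classification and tallying logic: the cell dictionaries, the column/row keys, and the 1% "small change" threshold are all assumptions, and in practice the cells would be read from the .xlsx files with a library such as openpyxl rather than written out by hand:

```python
from collections import Counter

SMALL = 0.01  # assumed threshold for a "small" relative numeric change

def classify(old_val, new_val):
    """Classify a cell difference into one of the four categories."""
    if old_val == new_val:
        return None
    if isinstance(old_val, str) and old_val.startswith("="):
        return "formula change"
    if isinstance(old_val, (int, float)) and isinstance(new_val, (int, float)):
        base = max(abs(old_val), abs(new_val), 1e-12)
        rel = abs(new_val - old_val) / base
        return "small numeric change" if rel <= SMALL else "large numeric change"
    return "string change"

def tally(old_cells, new_cells):
    """Count changes per (sheet, column, category)."""
    counts = Counter()
    for key in old_cells.keys() | new_cells.keys():
        kind = classify(old_cells.get(key), new_cells.get(key))
        if kind:
            sheet, col, _row = key
            counts[(sheet, col, kind)] += 1
    return counts

# Hand-written stand-in for cells extracted from the two workbooks,
# keyed by (sheet, column, row).
old = {("Scores", "B", 2): 10.0, ("Scores", "B", 3): 10.0,
       ("Scores", "C", 2): "=B2*2", ("Notes", "A", 1): "cotton"}
new = {("Scores", "B", 2): 10.05, ("Scores", "B", 3): 14.0,
       ("Scores", "C", 2): "=B2*3", ("Notes", "A", 1): "organic cotton"}

counts = tally(old, new)
```

Sorting the tally by count then points directly at the areas of dense change to inspect in steps 4 and 5.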

I've explained this with a few more sentences and included your responses in the attached report.

The workbooks are part of the Nike Material Sustainability Index (NMSI), to be released as open data soon. They consist of collected scientific measurements for many hundreds of materials. This is supplemented by a complex scoring mechanism which allows materials to be presented on a single-dimensional scale. The scientific reviewers have asked for modifications to small parts of the scoring mechanism.

My own Smallest Federated Wiki offers an alternative way to distribute and revise both the measurements and the scoring mechanism. I will use the NMSI workbooks as the known-good reference when completing my own work. I will also consider how confirmation, validation and evaluation mechanisms can be built into my new wiki to make this sort of data publishing reliable.

Best regards. -- Ward

On Mar 13, 2012, at 5:40 PM, Ward Cunningham wrote:

Friends -- How would you test a complex dataset? The data is organized into three separate Excel workbooks, one of which does lots of detailed computation with fundamental data. The other two provide various levels of rollup and summarization of the detailed computations.

The computational workbook has dozens of sheets with many dozens of columns and many hundreds of rows.

Probably half the cells have formulas. Of these, most are simple calculations. However, there is no shortage of pivot tables and VLOOKUPs (treating part of one sheet as a little database to be queried by a formula elsewhere).

There is no independent known-correct version of the data/computations.

One suggestion has been to use Excel debugging tools to get a feel for the structure of the workbooks and then write VBScript to verify that all cells follow that structure. This reminds me of using static analysis tools to look for suspicious code in conventional programs.
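That structure-verification idea can be sketched without VBScript. If every formula in a column is supposed to be a row-shifted copy of one template, normalizing away the row numbers should leave exactly one pattern per column; a column with more than one pattern is suspicious. This is a toy sketch with hand-written formula strings; real input would come from the workbook via openpyxl or a VBScript macro:

```python
import re

def normalize(formula):
    # Replace the row part of references like B12 with a placeholder,
    # so row-shifted copies of one formula normalize identically.
    return re.sub(r"([A-Z]+)\d+", r"\1<row>", formula)

def odd_patterns(column_formulas):
    """Group rows by normalized formula; report only mixed columns."""
    patterns = {}
    for row, f in column_formulas.items():
        patterns.setdefault(normalize(f), []).append(row)
    # A column is "suspicious" if more than one pattern appears.
    return patterns if len(patterns) > 1 else {}

# Hypothetical column D: row 4 deviates from the shared template.
col_d = {2: "=B2*C2", 3: "=B3*C3", 4: "=B4*C4+1", 5: "=B5*C5"}
suspicious = odd_patterns(col_d)
```

Like the static-analysis tools it resembles, this flags suspects rather than proving correctness: a deviating cell might be a deliberate exception, but each one is worth a look.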

There has already been an independent review of the data that included a sensitivity analysis of some of the calculations. This review has asked for several changes, which is how we find ourselves making sweeping modifications at the last minute, a risky necessity.

Thanks and best regards. -- Ward

Attachments:

- checking-workbooks.pdf

Markus Gaertner


Message 2 of 9
Mar 15, 2012

0 Attachment

Hi Ward,

what I am missing in your question is the domain. How does the
business use this kind of data? What's the use case behind it?

Best
Markus
