
The Stingray Reader tackles four fundamental issues in
processing a file:

How are the bytes organized? What is the Physical Format?

How are the data objects organized? What is the Logical Layout?

What do the bytes mean? What is the Conceptual Content?

How can we assure ourselves that our applications will work with this file?

The underlying problem is that the schema is not always bound
to a given file, nor is it clearly bound to an application program.
Two examples illustrate this separation between schema and content:

We might have a spreadsheet with no column titles at all.

We might have a pure data file (for example from a legacy COBOL program)
which is described by a separate schema.
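
Both cases come down to binding an externally supplied schema to data that
carries no names of its own. The following is a minimal sketch of that idea
using only the Python standard library; it is not the Stingray Reader API.
The names SCHEMA, RAW, and rows_with_schema are hypothetical, as is the
sample data.

```python
import csv
import io

# Hypothetical external schema: the column names live apart from the data.
SCHEMA = ["name", "quantity", "price"]

# A sheet with no column titles, represented here as in-memory CSV text.
RAW = "widget,3,4.50\ngadget,10,1.25\n"

def rows_with_schema(source, schema):
    """Yield each row as a dict keyed by the externally supplied names."""
    for row in csv.reader(source):
        yield dict(zip(schema, row))

rows = list(rows_with_schema(io.StringIO(RAW), SCHEMA))
```

Every value is still a string at this point. Deciding what the strings mean
is a separate question of conceptual content.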

One goal of good software is to cope reasonably well with the variability
of user-supplied inputs. Providing data by spreadsheet is
often the most desirable choice for users. In some cases, it’s the
only acceptable choice. Because spreadsheets are tweaked manually, they
may not have a simple, fixed schema or logical layout.

A workbook (the container of individual sheets)
can be encoded in any of a number of physical
formats: XLS, CSV, XLSX, and ODS, to name a few. We would like our applications
to be independent of these physical formats. We’d like to focus
on the logical layout.
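
One way to picture this independence is a small registry that chooses a
reader from the file name, so the application only ever sees rows. This is
a sketch under stated assumptions, not how the Stingray Reader is built:
READERS, read_csv, read_tab, and load_rows are hypothetical names, and only
two text formats from the standard library are covered. XLS, XLSX, and ODS
would need additional libraries.

```python
import csv
import io
from pathlib import PurePath

def read_csv(stream):
    """Read comma-delimited text into a list of rows."""
    return [row for row in csv.reader(stream)]

def read_tab(stream):
    """Read tab-delimited text into a list of rows."""
    return [row for row in csv.reader(stream, delimiter="\t")]

# Hypothetical registry mapping a file suffix to a reader function.
READERS = {".csv": read_csv, ".tab": read_tab}

def load_rows(filename, stream):
    """Pick a reader by suffix; the caller never sees the physical format."""
    suffix = PurePath(filename).suffix.lower()
    try:
        reader = READERS[suffix]
    except KeyError:
        raise ValueError(f"no reader registered for {suffix!r}")
    return reader(stream)
```

The application calls load_rows() and works with lists of cell values,
whichever physical format the file happens to use.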

Data supplied in the form of a workbook can suffer from numerous data quality issues.
We need to be assured that a file actually conforms to a required
schema.
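
A conformance check of this kind can be sketched as follows. This is an
illustration only: the REQUIRED mapping and the conformance_errors function
are hypothetical, and a real schema would carry more than a conversion
function per column.

```python
# Hypothetical schema: column name mapped to a conversion function
# that raises ValueError when the value is not acceptable.
REQUIRED = {"name": str, "quantity": int, "price": float}

def conformance_errors(rows, schema):
    """Return (row_number, column, problem) tuples; empty means the rows conform."""
    errors = []
    for number, row in enumerate(rows, start=1):
        for column, convert in schema.items():
            if column not in row:
                errors.append((number, column, "missing"))
                continue
            try:
                convert(row[column])
            except (TypeError, ValueError):
                errors.append((number, column, "bad value"))
    return errors

good = [{"name": "widget", "quantity": "3", "price": "4.50"}]
bad = [{"name": "gadget", "quantity": "ten"}]
```

Running the check before the application touches the data turns "will this
file work?" into a question with a concrete answer.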

A COBOL file parallels a workbook sheet in several ways. It also introduces
some unique complications. We’d like to provide a suite of tools that work
well with common spreadsheets as well as COBOL files, allowing some
uniformity in processing various kinds of data.