Example: Reference data

Handbooks are data for people to use. Databases, however, are built for computers, not people, so we need special applications to stand between us and the data. That's why we like spreadsheets so much: they let us work with data directly in a form that's more like a handbook. But spreadsheets lack the scalability, data management and governance properties of databases.

Truenumbers is a game-changer because it stores data in a way that both people and computers can work with directly. If a spreadsheet is like a live table, you can think of Truenumbers as a live reference library, with better governance, traceability and data management than any database, at any scale.

What's in the data dictionary

Figure: excerpt from the data dictionary table of contents

The FCC model's data dictionary (download here) is an alphabetical list of 82 columns, a few of which are shown at right. The meaning of these columns is implied by their names and their machine type, and augmented by narrative descriptions in the data dictionary document itself.

Descriptive column names are for the benefit of application developers and play no role in system operation. We'll see that there's a lot of implicit knowledge in descriptive names, and that truenumbers make it explicit and useful for indexing and queries.

Sample license data

A sample set of 46 rows (licenses) of data in CSV format (download here) can be opened in a spreadsheet. Zeroing in on two rows, we see the data we will use in this example, a total of 15 columns each for two licenses.

What's the subject?

A truenumber states the value of a property of some subject, and can carry tags as well. So the task is to construct the subject, property, value and tags from the given data. In this case, the subject is a license identified by its call sign. In the truenumber language, we might write the subject as FCC/license:FX:K252DN, which is an example of a Structured Resource Descriptor (SRD). Think of an SRD as a structured name. The slash (/) represents a component relationship, which tells us that a license is part of, or associated with, the FCC. The colon (:) is a refinement, or instance relationship, which tells us that FX is a kind of license, and K252DN is one.

Note that the subject is composed of two data items from the row, FX and K252DN, as well as the terms FCC and license. This makes it clear that they are part of the identity of the license.
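As a rough illustration of how such a subject could be assembled from one row, here is a minimal Python sketch. It is plain string handling, not the Truenumbers API, and the column names SERVICE_CODE and CALL_SIGN are hypothetical stand-ins for whatever the data dictionary actually calls those two fields.

```python
# Hypothetical column names; the real data dictionary may use different ones.
SERVICE_COL = "SERVICE_CODE"
CALL_SIGN_COL = "CALL_SIGN"


def build_subject(row: dict) -> str:
    """Compose the subject SRD for one license row.

    "/" marks a component (license is part of the FCC);
    ":" marks a refinement (FX is a kind of license, K252DN is one).
    """
    return f"FCC/license:{row[SERVICE_COL]}:{row[CALL_SIGN_COL]}"


def parse_srd(srd: str) -> list:
    """Split an SRD into components, each a list of [base, refinement, ...]."""
    return [component.split(":") for component in srd.split("/")]


row = {SERVICE_COL: "FX", CALL_SIGN_COL: "K252DN"}
subject = build_subject(row)
print(subject)             # FCC/license:FX:K252DN
print(parse_srd(subject))  # [['FCC'], ['license', 'FX', 'K252DN']]
```

Parsing the SRD back into components shows that the structured name keeps the two row values separate from the fixed terms FCC and license.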

Separating a thing and its properties

The fourth column is called LIC_NAME, which the dictionary tells us is the name of the license holder. So LIC_NAME refers both to a thing (the licensee) and to one of its properties (its name). We can describe the data field as a truenumber like this:

FCC/license:FX:K252DN/licensee has name = "NEW RUSHMORE RADIO, INC."

We added the component licensee to the subject using the slash operator, essentially taking the LIC part of LIC_NAME and making it an explicit subject SRD in the truenumber data. The NAME part of LIC_NAME is used appropriately as the property the truenumber reports.
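The split of a column name into a subject component and a property can be sketched in a few lines of Python. This is only an illustration of the idea: the prefix table and the function name are assumptions made for this example, and the output is a plain string in the form shown above rather than anything produced by Truenumbers itself.

```python
# Assumed mapping from column-name prefixes to subject components.
# Only LIC -> licensee comes from the example in the text.
PREFIX_TO_COMPONENT = {"LIC": "licensee"}


def column_to_truenumber(subject: str, column: str, value: str) -> str:
    """Turn one cell into a truenumber-style statement.

    The prefix of the column name (LIC) becomes a component appended
    to the subject; the remainder (NAME) becomes the property.
    """
    prefix, prop = column.split("_", 1)
    component = PREFIX_TO_COMPONENT[prefix]
    return f'{subject}/{component} has {prop.lower()} = "{value}"'


statement = column_to_truenumber(
    "FCC/license:FX:K252DN", "LIC_NAME", "NEW RUSHMORE RADIO, INC."
)
print(statement)
# FCC/license:FX:K252DN/licensee has name = "NEW RUSHMORE RADIO, INC."
```

A real conversion would need one prefix entry for each kind of thing named in the column headers, which is exactly the implicit knowledge the text describes making explicit.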

Telling it like it is

A truenumber can always be transformed into a sentence. The above truenumber is equivalent to

name of the licensee of the FCC's K252DN FX license is "NEW RUSHMORE RADIO, INC."

which could be given as input to the Truenumbers interpreter and would produce the same data as the SRD form. It tells a much better story about the value than the spreadsheet does.
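To make the correspondence between the two forms concrete, the following Python sketch renders the SRD form as the sentence above. It is a simple string template written for this one example, not the Truenumbers interpreter, and the phrasing rules it uses (possessive root, refinements read in reverse order) are assumptions inferred from the single sentence shown here.

```python
def to_sentence(subject: str, prop: str, value: str) -> str:
    """Render a subject SRD, property and value as an English sentence."""
    # Each component is [base, refinement, ...], e.g. ['license', 'FX', 'K252DN'].
    parts = [component.split(":") for component in subject.split("/")]
    root = parts[0][0]

    # Read refinements most specific first: "K252DN FX license".
    phrases = [" ".join(list(reversed(p[1:])) + [p[0]]) for p in parts[1:]]

    if not phrases:
        chain = root
    else:
        # The root owns the first component: "FCC's K252DN FX license".
        chain = f"{root}'s {phrases[0]}"
        # Each deeper component wraps the chain: "licensee of the ...".
        for phrase in phrases[1:]:
            chain = f"{phrase} of the {chain}"

    return f'{prop} of the {chain} is "{value}"'


sentence = to_sentence(
    "FCC/license:FX:K252DN/licensee", "name", "NEW RUSHMORE RADIO, INC."
)
print(sentence)
# name of the licensee of the FCC's K252DN FX license is "NEW RUSHMORE RADIO, INC."
```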

Truenumber data is no less structured than relational data, and meaning implicit in the column names becomes an explicit part of the data.