Answer:
The organisation of the source provides a selection of
obvious types of information such as name, address, marital
status, occupation, wages and a categorisation of whether or
not the worker was ‘dependent’ or ‘independent’. There is
also a column for ‘No’, a unique identifier for each
individual, which is a concept that is crucially important to
databases, and one we will refer to again.
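
As a minimal sketch of how these columns might translate into
a database table, the following Python example uses the
standard sqlite3 module. The table and column names are
assumptions made for illustration rather than the register's
own headings, and the inserted rows are invented placeholders,
not transcriptions. The point is only to show the ‘No’ column
acting as a unique identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: one column per type of information in the register.
# entry_no stands in for the source's 'No' column and is the primary key,
# so the database refuses two records with the same identifier.
conn.execute("""
    CREATE TABLE strike_register (
        entry_no       INTEGER PRIMARY KEY,
        name           TEXT,
        address        TEXT,
        marital_status TEXT,
        occupation     TEXT,
        wages          TEXT,  -- kept as text: the written value may not be a clean number
        dependency     TEXT   -- 'dependent' or 'independent'
    )
""")

# Invented placeholder record, not taken from the manuscript.
conn.execute(
    "INSERT INTO strike_register (entry_no, name) VALUES (?, ?)",
    (1, "PLACEHOLDER NAME"),
)

# A second record reusing identifier 1 is rejected by the database.
try:
    conn.execute(
        "INSERT INTO strike_register (entry_no, name) VALUES (?, ?)",
        (1, "ANOTHER PLACEHOLDER"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM strike_register").fetchone()[0]
```

Storing wages as text is a deliberately cautious choice in
this sketch; whether to convert such values to numbers is a
design decision that depends on how consistently the source
records them.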

Further Information about this Manuscript

Page from the strike register of the striking matchworkers at
Bryant & May, London, July 1888. Available at Wikimedia
(accessed 25/03/2011).

Of course, the columns within the source give a clear
indication of the kinds of information that can be obtained,
which makes the historian’s life a little simpler (although
this will not often be the case). With the help of a
database, even a source as simple as this could support some
reasonably sophisticated forms of social analysis with
relative ease, both through record linkage and through
drawing out various statistical patterns.
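
To give a rough idea of what drawing out a statistical pattern
could look like in practice, the sketch below counts records
by occupation and dependency category. It assumes a
hypothetical table named strike_register with invented column
names, and the four rows are made-up placeholders rather than
data from the manuscript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, cut-down table holding only the columns used in the query.
conn.execute("""
    CREATE TABLE strike_register (
        entry_no   INTEGER PRIMARY KEY,
        occupation TEXT,
        dependency TEXT
    )
""")

# Invented placeholder rows, not transcribed from the source.
rows = [
    (1, "occupation A", "dependent"),
    (2, "occupation A", "dependent"),
    (3, "occupation A", "independent"),
    (4, "occupation B", "dependent"),
]
conn.executemany("INSERT INTO strike_register VALUES (?, ?, ?)", rows)

# Count how many workers fall into each occupation/dependency combination.
counts = conn.execute("""
    SELECT occupation, dependency, COUNT(*)
    FROM strike_register
    GROUP BY occupation, dependency
    ORDER BY occupation, dependency
""").fetchall()

for occupation, dependency, total in counts:
    print(occupation, dependency, total)
```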

But even this relatively simple source contains a number of
issues that will complicate the transition from information
to data – and keep in mind the fact that the sample is only
two pages of a larger manuscript.

As with the Census enumerator’s listing above, the strike
register, whilst more or less in a tabular arrangement, also
contains information at the top of the page which is outside
the rectangular structure of the source, and which would need
accommodating somehow in the database. Additionally there are
a number of classic source-based problems which will hinder
the design of the database, and which we will return to in
Section F. Alterations, marginalia, notations,
abbreviations, the sudden inclusion of a value that does not
seem to fit the stated classification scheme of the source,
illegible text, double values entered for some columns and so
on, all serve to complicate matters.
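
One possible way of accommodating these features, offered
only as a sketch under stated assumptions, is to hold the
page-level information in its own table and to give each
entry a column for the value as written alongside a column
for notes. All table names, column names and values below are
hypothetical and chosen for illustration; nothing here is
transcribed from the register.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the page reference

conn.executescript("""
    -- Information written at the top of a page, outside the tabular rows.
    CREATE TABLE register_page (
        page_id      INTEGER PRIMARY KEY,
        heading_text TEXT
    );

    -- One row per entry in the table, linked back to the page it came from.
    CREATE TABLE register_entry (
        entry_no            INTEGER PRIMARY KEY,
        page_id             INTEGER NOT NULL REFERENCES register_page(page_id),
        wages_as_written    TEXT,  -- the value exactly as it appears
        wages_standardised  REAL,  -- left empty when no single value can be inferred
        transcription_note  TEXT   -- alterations, marginalia, illegibility and so on
    );
""")

# Invented placeholder records.
conn.execute(
    "INSERT INTO register_page VALUES (?, ?)",
    (1, "PLACEHOLDER HEADING TEXT"),
)
conn.execute(
    "INSERT INTO register_entry VALUES (?, ?, ?, ?, ?)",
    (1, 1, "[illegible]", None, "value illegible in the manuscript"),
)

# Each entry can be read together with the heading of its page.
row = conn.execute("""
    SELECT p.heading_text, e.wages_as_written, e.wages_standardised
    FROM register_entry AS e
    JOIN register_page AS p ON p.page_id = e.page_id
""").fetchone()
```

Keeping the written form separate from any standardised value
means that a problematic entry can still be recorded without
forcing it into a category it does not fit.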