Instantiating interpretational schemes requires collections of
spontaneous, non-proofread natural language texts. Note that this
instantiation process is another way of looking at how our writer
model is to be constructed.

To give a rough idea, the following collections are required:

1. A collection of texts from native speakers, possibly subdivided
   by:

   (a) Text type;

   (b) Educational level.

2. A collection of texts from non-native speakers, possibly
   subdivided by:

   (a) Text type;

   (b) Native language;

   (c) Educational level.

For a broad, general-purpose instantiation based on collection 1
(native-speaker texts), to be applied to a grammar checker for native
speakers, we would need a balanced corpus of texts from which to
derive the frequency rates of phenomena. These rates are used to set
the weights and to construct the rating scales of the interpretational
scheme for the measures. Along the same lines, collection 2
(non-native-speaker texts) would yield a similar instantiation for
grammar checkers aimed at non-native speakers of a given language.
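
As a rough illustration of how frequency rates might feed into weights, the following minimal sketch assumes a hypothetical corpus that has already been annotated with phenomenon labels. The label names, the normalization per 1,000 words, and the proportional weighting are all assumptions made for the example; the text above does not prescribe a particular formula.

```python
from collections import Counter


def phenomenon_rates(annotated_texts):
    """Return occurrences of each phenomenon per 1,000 words.

    annotated_texts: iterable of (word_count, [phenomenon labels]) pairs.
    This input layout is a hypothetical one chosen for the sketch.
    """
    counts = Counter()
    total_words = 0
    for word_count, labels in annotated_texts:
        total_words += word_count
        counts.update(labels)
    if total_words == 0:
        return {}
    return {label: 1000 * n / total_words for label, n in counts.items()}


def weights_from_rates(rates):
    """Scale the rates so the weights sum to 1 (one possible choice)."""
    total = sum(rates.values())
    if total == 0:
        return {}
    return {label: rate / total for label, rate in rates.items()}


# Invented toy corpus: two texts with made-up annotations.
corpus = [
    (500, ["agreement", "spelling", "spelling"]),
    (1500, ["agreement", "punctuation"]),
]
rates = phenomenon_rates(corpus)
weights = weights_from_rates(rates)
```

Running the same functions separately on the native and non-native collections would give two sets of rates, one per instantiation. Whether weights should be proportional to frequency, or also reflect the severity of a phenomenon, is a design decision left open here.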

Such text collections serve another function: in the process of
constructing the test suites, they supply examples and tokens of
erroneous input for the different categories. In fact, they may lead
to the definition of categories of error types.
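
The role of the collections in feeding test suites can be sketched as follows. This is only an illustration under stated assumptions: the function name `seed_test_suite`, the (category, example) input layout, and the sample error labels are hypothetical and not taken from any existing tool.

```python
from collections import defaultdict


def seed_test_suite(observed_errors, known_categories):
    """Group corpus error examples by category.

    observed_errors: iterable of (category, example) pairs (assumed layout).
    known_categories: categories already defined for the test suite.
    Returns the grouped examples and the categories seen in the corpus
    that the test suite does not define yet.
    """
    suite = defaultdict(list)
    for category, example in observed_errors:
        suite[category].append(example)
    new_categories = sorted(set(suite) - set(known_categories))
    return dict(suite), new_categories


# Invented examples; the category labels are placeholders.
observed = [
    ("agreement", "he go home"),
    ("word order", "I yesterday came"),
    ("agreement", "they was late"),
]
suite, new_categories = seed_test_suite(observed, {"agreement"})
```

The second return value corresponds to the case where corpus material suggests an error category that had not been defined beforehand.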