4.4.2.1.

Incorporating Scientific Knowledge into Function Selection

Choose Functions Whose Properties Match the Process

Incorporating scientific knowledge into the selection of the function used
in a process model is critical to the success of the model.
When a scientific theory describing the mechanics of a physical system
can provide a complete functional form for the process, then that type
of function makes an ideal starting point for model development. There
are many cases, however, for which there is incomplete scientific
information available. In these cases it is considerably less clear
how to specify a functional form to initiate the modeling process. A
practical approach is to choose the simplest possible functions
that have the properties ascribed to the process.

Example: Concrete Strength Versus Curing Time

For example, if you are modeling concrete strength as a function of curing
time, scientific knowledge of the process indicates that the strength will
increase rapidly at first, but then level off as the hydration reaction
progresses and the reactants are converted to their new physical form.
The leveling off of the strength occurs because the speed of the reaction
slows down as the reactants are converted and the remaining unreacted materials
become less likely to be close enough to one another to react. In theory, the
reaction will stop altogether when the reactants are fully hydrated and
completely consumed. In practice, however, a full stop of the
reaction is unlikely because there is always some unreacted
material remaining that reacts increasingly slowly. As a result, the
process will approach an asymptote at its final strength.

Polynomial Models for Concrete Strength Deficient

Considering this general scientific information, modeling this process using
a straight line would not reflect the physical aspects of this process very well.
For example, using the straight-line model, the concrete strength would be
predicted to continue increasing at the same rate over its entire lifetime,
though we know that is not how it behaves. The fact that the response variable
in a straight-line model is unbounded as the predictor
variable becomes extreme is another indication that the straight-line model
is not realistic for concrete strength. In fact, this relationship between
the response and predictor as the predictor becomes extreme is common to all
polynomial models, so even a higher-degree polynomial would probably not make a
good model for describing concrete strength. A higher-degree polynomial might
be able to curve toward the data as the strength leveled off, but it would
eventually have to diverge from the data because of its mathematical properties.
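The divergence described above can be illustrated with a small sketch. It assumes a hypothetical saturating curve, 40(1 - exp(-t/7)), standing in for concrete strength; the final strength of 40, the 7-day time constant, and the 28-day sampling window are illustrative choices, not measured data. A straight line and a quadratic are fit by ordinary least squares and then evaluated well beyond the fitted range.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations update both sides together.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients ordered from the constant term upward.
    """
    k = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    aty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve_linear_system(ata, aty)

def eval_polynomial(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

# Hypothetical process: strength levels off at FINAL_STRENGTH (arbitrary units).
FINAL_STRENGTH = 40.0

def true_strength(t):
    return FINAL_STRENGTH * (1.0 - math.exp(-t / 7.0))

days = list(range(1, 29))                 # fitted range: days 1 to 28
strength = [true_strength(t) for t in days]

line = fit_polynomial(days, strength, 1)
quad = fit_polynomial(days, strength, 2)

# Compare the assumed process with both fits inside and outside the data.
for t in (28, 90, 365):
    print(t,
          round(true_strength(t), 1),
          round(eval_polynomial(line, t), 1),
          round(eval_polynomial(quad, t), 1))
```

Under these assumptions the straight line keeps climbing past the final strength, while the quadratic bends with the data inside the fitted range and then turns downward without bound, so neither respects the asymptote far from the data.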

A more reasonable function for modeling this process might be a rational
function. A rational function, which is a ratio of two polynomials of the
same predictor variable, approaches a horizontal asymptote as the predictor
becomes large if the degrees of the polynomials in the numerator and
denominator are the same. It is still a very simple model, although it is
nonlinear in the unknown parameters. Even if a rational function does not
ultimately prove to fit the data well, it makes a good starting point for the
modeling process because it incorporates the general scientific knowledge
we have of the process, without being overly complicated. Within the family of
rational functions, the simplest model is the
"linear over linear" rational function
$$ y = \frac{\beta_0 + \beta_1 x}{1 + \beta_2 x} + \varepsilon $$
so this would probably be the best model with which to start. If the
linear-over-linear model is not adequate, then the initial fit can be followed up using
a higher-degree rational function, or some other type of model that also has
a horizontal asymptote.
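As a rough illustration of working with the linear-over-linear model, the sketch below estimates its three parameters from synthetic data. It relies on an assumption worth stating plainly: multiplying the model through by its denominator gives y = b0 + b1 x - b2 (x y), which is linear in the parameters, so ordinary least squares can be used. This shortcut is only one simple way to obtain rough estimates (for example, as starting values); because y appears on both sides, it is not equivalent to a proper nonlinear least squares fit when the data are noisy. The names b0, b1, b2 stand for the betas in the equation above, and the parameter values and sampling days are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def linear_over_linear(x, b0, b1, b2):
    """The model function without the error term."""
    return (b0 + b1 * x) / (1.0 + b2 * x)

def fit_linear_over_linear(xs, ys):
    """Estimate (b0, b1, b2) from the linearized form y = b0 + b1*x - b2*(x*y)."""
    rows = [[1.0, x, -x * y] for x, y in zip(xs, ys)]
    k = 3
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(ata, aty)

# Hypothetical, noise-free data generated from known parameters.
TRUE_PARAMS = (2.0, 8.0, 0.2)
days = [1, 2, 3, 5, 7, 10, 14, 21, 28]
strength = [linear_over_linear(t, *TRUE_PARAMS) for t in days]

b0, b1, b2 = fit_linear_over_linear(days, strength)

# For large x the model approaches the ratio of the leading coefficients.
asymptote = b1 / b2

print("estimates:", b0, b1, b2)
print("implied final strength:", asymptote)
```

With noise-free data the linearized fit recovers the generating parameters up to rounding error, and the ratio b1/b2 gives the horizontal asymptote, which here plays the role of the final strength.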

Focus on the Region of Interest

Although the concrete strength example makes a good case for incorporating
scientific knowledge into the model, it is not necessarily a good idea to
force a process model to follow all of the physical properties that the
process must follow. At first glance it seems like incorporating physical
properties into a process model could only improve it; however, incorporating
properties that occur outside the region of interest for a particular
application can actually sacrifice the accuracy of the model "where it counts"
for increased accuracy where it isn't important. As a result, physical properties
should only be incorporated into process models when they directly affect the
process in the range of the data used to fit the model or in the region in
which the model will be used.

Information on Function Shapes

In order to translate general process properties into mathematical functions
whose forms may be useful for model development, it is necessary to know the
different shapes that various mathematical functions can assume. Unfortunately,
there is no easy, systematic way to obtain this information. Families of
mathematical functions, like polynomials or rational functions, can
assume quite different shapes that depend on the parameter values that
distinguish one member of the family from another. Because of the wide range of
potential shapes these functions may have, even determining and listing the
general properties of relatively simple families of functions can be
complicated. Section 8 of this chapter
gives some of the properties of a short list of simple functions that are often
useful for process modeling. Another reference that may be useful is the
Handbook of Mathematical Functions by
Abramowitz and Stegun [1964]. The
Digital Library of Mathematical Functions,
an electronic successor to the Handbook of Mathematical Functions maintained
by NIST, may also be helpful.
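To make the point about parameter-dependent shapes concrete, the following sketch evaluates the linear-over-linear rational function for three hypothetical parameter sets. It uses the fact that the derivative of (b0 + b1 x)/(1 + b2 x) with respect to x is (b1 - b0 b2)/(1 + b2 x)^2, so the sign of b1 - b0 b2 decides whether the curve rises, falls, or stays flat wherever the denominator is nonzero. The parameter values are chosen only for illustration.

```python
def linear_over_linear(x, b0, b1, b2):
    return (b0 + b1 * x) / (1.0 + b2 * x)

def slope_sign(b0, b1, b2):
    """Sign of the derivative: +1 rising, -1 falling, 0 constant.

    The derivative is (b1 - b0*b2) / (1 + b2*x)**2, and the squared
    denominator is positive wherever the function is defined.
    """
    d = b1 - b0 * b2
    return (d > 0) - (d < 0)

# Three members of the same family (hypothetical parameter values).
cases = {
    "rising": (0.0, 10.0, 0.25),    # starts at 0, levels off near 40
    "falling": (40.0, 2.5, 0.25),   # starts at 40, levels off near 10
    "flat": (10.0, 2.5, 0.25),      # numerator is 10 times the denominator
}

xs = [0, 1, 5, 20, 100]
curves = {}
for name, params in cases.items():
    curves[name] = [linear_over_linear(x, *params) for x in xs]
    print(name, slope_sign(*params), [round(v, 2) for v in curves[name]])
```

Even this three-parameter family produces increasing, decreasing, and constant curves, which is why tabulating the possible shapes of a family of functions takes some care.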