Re: statistical comparison of parameters from two applications of NonlinearModelFit

To: mathgroup at smc.vnet.net

Subject: [mg116104] Re: statistical comparison of parameters from two applications of NonlinearModelFit

From: Ray Koopman <koopman at sfu.ca>

Date: Wed, 2 Feb 2011 06:08:31 -0500 (EST)

References: <ii8s8s$2o1$1@smc.vnet.net>

On Feb 1, 3:54 am, dantimatter <goo... at dantimatter.com> wrote:
> Hello Everyone,
>
> First, I must apologize if the following questions are silly;
> I am neither a statistician nor a particularly good Mathematica
> user (although I do occasionally pretend to be one). And now,
> on the question:
>
> I've got some data that I try to fit with NonlinearModelFit:
>
> time1 = {0, 100, 200, 300, 400, 500, 600, 700, 800};
> data1 = {1, 0.92, 0.79, 0.81, 0.81, 0.78, 0.72, 0.75, 0.68};
>
> nlm1= NonlinearModelFit[Transpose[{time1, data1}],
> a1*Exp[-b1*x] + c1, {{a1, .5}, {b1, 0.001}, {c1, 0.5}}, x,
> ConfidenceLevel -> .95]
>
> with results:
>
> In[11] := nlm1["BestFitParameters"]
> nlm1["ParameterConfidenceIntervals"]
> nlm1["ParameterErrors"]
>
> Out[11] = {a1 -> 0.295575, b1 -> 0.00334141, c1 -> 0.696761}
> Out[12] = {{0.179807, 0.411344}, {-0.000284765, 0.00696758},
> {0.582507, 0.811014}}
> Out[13] = {0.0473121, 0.00148194, 0.0466929}
>
> I then do the same thing on a second data set:
>
> time2 = {0, 100, 200, 300, 400, 500};
> data2 = {0.8, 0.73, 0.65, 0.61, 0.58, 0.57};
>
> nlm2= NonlinearModelFit[Transpose[{time2, data2}],
> a2*Exp[-b2*x] + c2, {{a2, .5}, {b2, 0.001}, {c2, 0.5}}, x,
> ConfidenceLevel -> .95]
>
> In[15]:= nlm2["BestFitParameters"]
> nlm2["ParameterConfidenceIntervals"]
> nlm2["ParameterErrors"]
>
> Out[15] = {a2 -> 0.287834, b2 -> 0.00362383, c2 -> 0.516796}
> Out[16] = {{0.204254, 0.371413}, {0.00121124, 0.00603641},
> {0.427572, 0.60602}}
> Out[17] = {0.0262627, 0.000758091, 0.0280362}
>
> Based on our understanding of the real system, we actually expect
> b2 <= b1, but this is not what the fitting tells us (of course,
> there's a bit of error in the extracted parameters).
>
> So what I'd really like to know is:
>
> (1) how can I make sense of these "ParameterConfidenceIntervals"
> and "ParameterErrors" (how are they determined, and what is meant
> by them, beyond their brief description in
> http://reference.wolfram.com/mathematica/tutorial/StatisticalModelAnalysis.html )?
>
> and
>
> (2) is there a way to establish some kind of upper limit on the
> detectability of the difference between b2 and b1?
>
> Any thoughts you might have would be most appreciated.
>
> Many thanks!

The parameter estimates are the values of the parameters (a, b, c)
that minimize the sum of squared differences between the observed
data and the values predicted from the model.
Each ParameterError is an estimate of the standard error of
the corresponding parameter estimate.
Each ParameterConfidenceInterval is an interval that you can be
(100*ConfidenceLevel)% confident contains the true value of the
corresponding parameter *if the model is complete (all the variables
that matter are included), the form of the model function is correct,
and the distributional assumptions about the errors (normality,
homoscedasticity, independence) are correct*.
Each confidence interval is constructed as

  estimated_parameter +/- ( estimated_standard_error *
    Quantile[StudentTDistribution[degrees_of_freedom],
      (1 + ConfidenceLevel)/2] ),

where degrees_of_freedom =
#_of_data_points - #_of_estimated_parameters.
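As a concrete check of that formula, here is the 95% interval for b1
recomputed in plain Python, using the estimate and standard error from
the output above. (The quantile value 2.446912 is the standard t-table
value for Quantile[StudentTDistribution[6], 0.975]; nothing else here
is assumed.)

```python
# 95% CI for b1: estimate +/- SE * t-quantile, df = n - p
b1_hat = 0.00334141      # from BestFitParameters
se_b1  = 0.00148194      # from ParameterErrors
df     = 9 - 3           # 9 data points, 3 estimated parameters
t975   = 2.446912        # Quantile[StudentTDistribution[6], 0.975]

lo = b1_hat - se_b1 * t975
hi = b1_hat + se_b1 * t975
print(lo, hi)  # matches Out[12]: {-0.000284765, 0.00696758}
```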
Each estimated standard error is the square root of the product of two
factors -- one that is common to all the estimates, and one that is
specific to each estimate.
The common factor is the mean square error of approximation:
the sum of squares that the parameter estimates minimize,
divided by the degrees of freedom.
The specific factors are more complicated, so I'll say only that:
1. They do not directly reflect the size of the data-approximation
errors;
2. They are strongly tied to the values of the independent variable
(your 'time');
3. Ceteris paribus, they are inversely proportional to the number of
data points.
Now we're ready to talk about detecting a difference between b1 and
b2. The formal test of the hypothesis that there is no difference
is a Student t,

  (b1 - b2)/Sqrt[ASE[b1]^2 + ASE[b2]^2] = -.170,

where ASE[.] denotes the asymptotic standard error -- the
ParameterError -- of each estimate.
The degrees of freedom are data dependent, but in your case the t
is so small that it will never be significant, no matter what the
degrees of freedom are.
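That t value is easy to verify from the numbers already quoted; here
is the arithmetic in plain Python, with the estimates and ASEs taken
straight from the two fits above:

```python
# t statistic for H0: b1 == b2, using independent-fit estimates and ASEs
b1, se1 = 0.00334141, 0.00148194    # from the first fit
b2, se2 = 0.00362383, 0.000758091   # from the second fit
t = (b1 - b2) / (se1**2 + se2**2) ** 0.5
print(t)  # about -0.170
```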
To achieve significance, the t would have to be at least 12 times
its current value, which means you would need 144 times as many
sample points as you currently have. (Note: I've cut a lot of
corners here. A proper analysis would be much messier but would
lead to the same conclusion -- you need a whole lot more data,
on the order of 200 times as much as you have shown us.)
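The sample-size arithmetic behind "12 times" and "144 times" goes like
this: standard errors shrink roughly as 1/Sqrt[n], so the t grows as
Sqrt[n]; inflating |t| from about 0.17 to a conventional critical value
of about 2 takes a factor of roughly 12 in t, hence roughly 12^2 = 144
in n. A sketch (the 2.0 critical value is a round-number stand-in, not
an exact quantile):

```python
# How much the t would have to grow, and what that implies for n,
# assuming SE ~ 1/sqrt(n) so that t ~ sqrt(n).
t_now    = 0.170                 # |t| from the b1-vs-b2 comparison
t_needed = 2.0                   # rough two-sided 5% critical value
growth_in_t = t_needed / t_now   # about 12, as stated above
growth_in_n = growth_in_t ** 2   # t scales like sqrt(n)
print(growth_in_t, growth_in_n)
```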
Finally, here's some code that should reproduce some of the
NonlinearModelFit output:
r1 = FindFit[Transpose@{time1, data1}, a1*Exp[-b1*x] + c1,
     {{a1, .5}, {b1, .001}, {c1, .5}}, x]
f1 = a1*Exp[-b1*time1] + c1;
e1 = f1 - data1;
df1 = Length@data1 - 3;
var1 = (e1.e1 /. r1)/df1                       (* error variance *)
cov1 = var1*Inverse[Transpose@#.#]&@
       Outer[D[##]&, f1, {a1, b1, c1}] /. r1;
ase1 = Sqrt@Diagonal@cov1                      (* ASEs *)
cov1/Outer[Times, ase1, ase1]                  (* correlations *)
({a1, b1, c1} /. r1) + Outer[Times, ase1, {-1, 1}]*
  Quantile[StudentTDistribution[df1], .975]    (* CIs *)
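For anyone without Mathematica handy, the same arithmetic for the
first fit can be cross-checked in plain Python (no libraries). This is
a sketch that plugs in the BestFitParameters quoted above rather than
refitting, and builds the Jacobian by hand from the partial
derivatives of a*Exp[-b*t] + c; the resulting ASEs should agree with
Out[13] to within rounding of the printed parameters.

```python
import math

time1 = [0, 100, 200, 300, 400, 500, 600, 700, 800]
data1 = [1, 0.92, 0.79, 0.81, 0.81, 0.78, 0.72, 0.75, 0.68]
a1, b1, c1 = 0.295575, 0.00334141, 0.696761   # BestFitParameters

# residuals and error variance (the "common factor")
resid = [a1*math.exp(-b1*t) + c1 - y for t, y in zip(time1, data1)]
df = len(data1) - 3
var = sum(e*e for e in resid) / df

# Jacobian of the model wrt (a1, b1, c1) at the estimates:
# d/da = Exp[-b t],  d/db = -a t Exp[-b t],  d/dc = 1
J = [[math.exp(-b1*t), -a1*t*math.exp(-b1*t), 1.0] for t in time1]

# J^T J and its inverse (3x3, via the adjugate formula)
JtJ = [[sum(row[i]*row[j] for row in J) for j in range(3)]
       for i in range(3)]

def inv3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)
    adj = [[e*i - f*h, c*h - b*i, b*f - c*e],
           [f*g - d*i, a*i - c*g, c*d - a*f],
           [d*h - e*g, b*g - a*h, a*e - b*d]]
    return [[x/det for x in row] for row in adj]

cov = [[var*x for x in row] for row in inv3(JtJ)]
ases = [math.sqrt(cov[i][i]) for i in range(3)]
print(ases)  # should be close to {0.0473121, 0.00148194, 0.0466929}
```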