Hi!
I am having difficulty estimating the correct slope from noisy data.
This is the code that generates the noisy data:
Needs["LinearRegression`"];
slope = 1.0;
sigma = 0.5;
xrange = 1.0;
SeedRandom[123]; (* initialize random generator *)
rnd = {#, #*slope + RandomReal[NormalDistribution[0, sigma]]} &;
(* generate 2000 data points *)
data = Table[
   rnd[RandomReal[NormalDistribution[0, xrange/3.0]]], {2000}];
(* use only the first 8 points for the fit *)
subset = Take[data, 8];
ListPlot[subset, PlotRange -> {{-3, 3}, {-3, 3}},
  PlotStyle -> PointSize[.025]]
(* fit y = b*x without a constant term and report the CI of b *)
fit = Regress[subset, x, x, IncludeConstant -> False,
  RegressionReport -> {SummaryReport, ParameterCITable}]
The true slope is exactly 1. Because the data are quite noisy, the
confidence interval (CI) of the slope is very wide, and the estimated
slope is far too large (1.947). If I use more data points, the estimate
gets better. A wider x-range would also give a better estimate of the
slope, but I am quite limited in the x-range, so that is not an option
for me.
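If I have this right, the trade-off between noise, x-range and number of points can be written down for a line through the origin. The sketch below assumes the model y_i = b x_i + e_i with independent noise of known standard deviation sigma. The symbol sigma_x is my own notation (it is not in the code) for the standard deviation of the x values, here xrange/3. It also replaces the sum of x_i^2 by its expected value n sigma_x^2, so for n = 8 it is only a rough guide.

```latex
% Assumed model: y_i = b x_i + e_i,  e_i ~ N(0, \sigma^2),  no constant term
\begin{align*}
\hat{b} &= \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2},
&
\operatorname{Var}(\hat{b}) &= \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}
  \approx \frac{\sigma^2}{n\,\sigma_x^2} \\[4pt]
% Plugging in \sigma = 0.5 and \sigma_x = 1/3 from the code above
\operatorname{SE}(\hat{b}) &\approx \frac{\sigma}{\sigma_x \sqrt{n}}
  = \frac{1.5}{\sqrt{n}},
&
n = 8 &: \ \approx 0.53, \qquad n = 2000 : \ \approx 0.034
\end{align*}
```

Under these assumptions, a standard error of about 0.5 is expected with 8 points, and bringing it down to 0.1 would take roughly 225 points.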
I could check the RSquared for significance, where r is the correlation
coefficient and n the number of data points:
If[Abs[r*Sqrt[n - 2]/Sqrt[1 - r^2]] >=
  Quantile[StudentTDistribution[n - 2], 1 - 0.05], r, 0] (* 95% confidence level *)
In this case, it is significant.
Is there any other way to get a good estimate for the slope, without
using too many data points?
(Keywords: fit, regression, slope, noisy, rsquared, limited data)