Sunday, May 17, 2015

The paper helicopter experiment

The paper helicopter is one of the devices to explain about design of experiments. The aim is to create the longest flying paper helicopter by means of experimental design.
Paper helicopters are a nice example, because they are cheap to make, easy to test landing time and sufficient variables to make it non obvious.
Rather than make and measure my own helicopters, I decided to use data from the internet. In this post I use data from williamghunter.net and http://www.rose-hulman.edu. There is more data on the internet, but these two are fairly similar. Both use a fractional factorial design of 16 runs and they have the same variables. However, a quick check showed that these were different results and, very important, the aliasing structure was different.

Data

Data were taken from the above given locations. Rather than using the coded units, the data was converted to sizes in cm. Time to land was converted to seconds.
Since these were separate experiments, it has to be assumed that they used different paper, different heights to drop helicopters from. It even seems, that different ways were found to attach a paperclip to the helicopters.

Simple analysis

To confirm the data an analysis on coded units was performed. These results were same as given by the websites, results not shown here. My own analysis starts with real world units and is by regression. Disadvantage or real world units is that one cannot compare the size of the effects, however, given the designs used, the t-statistic can be used for this purpose.
The first data set shows WingLength and BodyLength to have the largest effects. Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 1.92798 0.54903 3.512 0.009839 ** PaperTyperegular1 -0.12500 0.13726 -0.911 0.392730 WingLength 0.17435 0.03088 5.646 0.000777 ***BodyLength -0.08999 0.03088 -2.914 0.022524 * BodyWidth 0.01312 0.07205 0.182 0.860634 PaperClipYes 0.05000 0.13726 0.364 0.726403 FoldYes -0.10000 0.13726 -0.729 0.489918 TapedBodyYes -0.15000 0.13726 -1.093 0.310638 TapedWingYes 0.17500 0.13726 1.275 0.242999

The second data set shows WingLength, PaperClip and PaperType to have the largest effects.

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 0.73200 0.21737 3.368 0.01196 *

PaperTyperegular2 0.28200 0.06211 4.541 0.00267 **

WingLength 0.16654 0.01223 13.622 2.7e-06 ***

BodyLength -0.02126 0.01630 -1.304 0.23340

BodyWidth -0.03307 0.04890 -0.676 0.52058

PaperClipYes -0.35700 0.06211 -5.748 0.00070 ***

FoldYes 0.04500 0.06211 0.725 0.49222

TapedBodyYes -0.14700 0.06211 -2.367 0.04983 *

TapedWingYes 0.06600 0.06211 1.063 0.32320

It seems then, that the two experiments show somewhat different effects. WingLength is certainly important. BodyLength maybe. Regarding paper, both have regular paper, but one has bond paper and the other construction paper. It is not difficult to imagine these are quite different.

Combined analysis

The combination analysis is programmed in Jags. To capture a different falling distance, a Mul parameter is used, which defines a multiplicative effect between the two experiments. In addition, both sets have their own measurement error. There are four types of paper, two from each data set, and three levels of paperclip, no paperclip assumed same for both experiments. In addition to the parameters given earlier, residuals are estimated, in order to have some idea about the quality of fit.

Model with interactions

Adding the most obvious interactions, such as WingLength*TapedBody did not really provide a suitable answer. Indeed, large residuals at observations 4 and 13, which are at opposite sides in the fractional factorial design, can not be resolved with one interaction.

Hence I proceeded with adding all two way interactions. Since this was expected to result in a model without clear estimates, all interactions had a strong prior; mean was 0 and precision (tau) was 1000. This model was subsequently reduced by giving the interactions which clearly differed from 0 a lesser precision while interactions which where clearly zero were removed. During this process the parameter Fold was removed from the parameter set. Finally, quadratic effects were added. There is one additional parameter, other, it has no function in the model, but tells what the properties of the prior for the interactions are. Parameters with a standard deviation less than other have information added from the data.

jmodel <- function() {

for (i in 1:n) {

premul[i] <- (test[i]==1)+Mul*(test[i]==2)

mu[i] <- premul[i] * (

WL*WingLength[i]+

BL*BodyLength[i] +

PT[PaperType[i]] +

BW*BodyWidth[i] +

PC[PaperClip[i]] +

TB*TapedBody[i]+

TW*TapedWing[i]+

WLBW*WingLength[i]*BodyWidth[i]+

WLPC[1]*WingLength[i]*(PaperClip[i]==2)+

WLPC[2]*WingLength[i]*(PaperClip[i]==3)+

BLPT[1]*BodyLength[i]*(PaperType[i]==2)+

BLPT[2]*BodyLength[i]*(PaperType[i]==3)+

BLPC[1]*BodyLength[i]*(PaperClip[i]==2)+

BLPC[2]*BodyLength[i]*(PaperClip[i]==3)+

BWPC[1]*BodyWidth[i]*(PaperClip[i]==2)+

BWPC[2]*BodyWidth[i]*(PaperClip[i]==3) +

WLWL*WingLength[i]*WingLength[i]+

BLBL*BodyLength[i]*BodyLength[i]+

BWBW*BodyWidth[i]*BodyWidth[i]

)

Time[i] ~ dnorm(mu[i],tau[test[i]])

residual[i] <- Time[i]-mu[i]

}

for (i in 1:2) {

tau[i] <- pow(StDev[i],-2)

StDev[i] ~dunif(0,3)

WLPC[i] ~dnorm(0,1)

BLPT[i] ~dnorm(0,1)

BLPC[i] ~dnorm(0,1)

BWPC[i] ~dnorm(0,1)

}

for (i in 1:3) {

PT[i] ~ dnorm(PTM,tauPT)

}

tauPT <- pow(sdPT,-2)

sdPT ~dunif(0,3)

PTM ~dnorm(0,0.01)

WL ~dnorm(0,0.01)

BL ~dnorm(0,0.01)

BW ~dnorm(0,0.01)

PC[1] <- 0

PC[2]~dnorm(0,0.01)

PC[3]~dnorm(0,0.01)

TB ~dnorm(0,0.01)

TW ~dnorm(0,0.01)

WLBW~dnorm(0,1)

WLTW~dnorm(0,1)

WLWL~dnorm(0,1)

BLBL~dnorm(0,1)

BWBW~dnorm(0,1)

other~dnorm(0,1)

Mul ~ dnorm(1,1) %_% I(0,2)

}

Inference for Bugs model at "C:/Users/Kees/AppData/Local/Temp/Rtmp4o0rhh/model16f472b05364.txt", fit using jags,

DIC is an estimate of expected predictive error (lower deviance is better).

Model discussion

This model does not have the big residuals. In addition it seems that some parameters, e.g. WLWL and WLBW have small mean values and small standard deviations. To me this suggests that they are indeed estimated and found to be close to 0. After all, if the data contained no information, their standard deviation would be similar to the prior, which is much larger, as seen from the other parameter.
The quadratic effects were added to allow detection of a maximum. There is not much presence of these effects, except perhaps in WingLength (parameter WLWL).
For descriptive purposes, I will leave these parameters in. However, for predictive purposes, it may be better to remove them or shrink them closer to zero.
Given the complex way in which the parameters are chosen, it is very well possible that a different model would be better. In hindsight, I might have used the BMA function to do a more thorough selection. Thus the model needs to be validated some more. Since I found two additional data sets on line, these might be used for this purpose.

No comments:

Post a Comment

Wiekvoet

Wiekvoet is about R, JAGS, STAN, and any data I have interest in. Topics range from sensometrics, statistics, chemometrics and biostatistics. For comments or suggestions please email me at wiekvoet at xs4all dot nl.