The merge() function gives an unexpected result with the all.x=TRUE option when one of the data frames contains a column that is itself a matrix (here, a matrix with 2 columns).
If I have one data.frame with all possible combinations of 2 factors and merge it with another data.frame in which 1 combination is missing, the values that should be missing (NA) in the merged data are filled in with erroneous values.
See the code below:
set.seed(2012)
a.factor <- as.factor(rep(letters[1:2],2))
b.factor <- as.factor(rep(c(1:2),each=2))
y <- as.matrix(cbind(as.character(a.factor),b.factor))
data1 <- data.frame(a.factor,b.factor,y=NA)
data1$y <- y
data1 <- subset(data1,!((a.factor=="b")&(b.factor==2))) # Delete row
factorial.data <- data.frame(a.factor,b.factor,row=1:length(b.factor))
print("Data frames to be merged")
print(data1)
print(factorial.data)
merged.data <- merge(factorial.data,data1,by=c("a.factor","b.factor"),all.x=TRUE)
print("Strange Result with incorrectly filled in data in row 4")
print(merged.data)
data2 <- data.frame(a.factor,b.factor,y=y)
data2 <- subset(data2,!((a.factor=="b")&(b.factor==2))) # Delete row
merged.data2 <- merge(factorial.data,data2,by=c("a.factor","b.factor"),all.x=TRUE)
print("Expected Result with properly missing data in row 4")
print(merged.data2)
# Maybe it's just me, but this surprising result led to some errors later on.
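
A possible workaround, sketched here rather than offered as a fix for merge() itself, is to detect matrix columns and split them into ordinary columns before merging, which is what data.frame(a.factor,b.factor,y=y) does in the second case above. The helper name flatten.matrix.cols and the generated column names y.1 and y.2 are my own assumptions for illustration, not part of base R.

```r
# Rebuild the example: a 2-column character matrix stored as ONE data frame column.
a.factor <- as.factor(rep(letters[1:2], 2))
b.factor <- as.factor(rep(1:2, each = 2))
y <- cbind(as.character(a.factor), as.character(b.factor))
data1 <- data.frame(a.factor, b.factor)
data1$y <- y
data1 <- subset(data1, !(a.factor == "b" & b.factor == 2))  # drop one combination
factorial.data <- data.frame(a.factor, b.factor, row = seq_along(b.factor))

# Hypothetical helper (not a base R function): replace every matrix column
# with one ordinary column per matrix column, named <column>.<index>.
flatten.matrix.cols <- function(df) {
  is.mat <- vapply(df, is.matrix, logical(1))
  if (!any(is.mat)) return(df)
  flat <- df[!is.mat]
  for (nm in names(df)[is.mat]) {
    m <- as.data.frame(df[[nm]], stringsAsFactors = FALSE)
    names(m) <- paste(nm, seq_len(ncol(m)), sep = ".")
    flat <- cbind(flat, m)
  }
  flat
}

# Report which columns of data1 are matrices, then merge the flattened copy.
print(vapply(data1, is.matrix, logical(1)))
merged.flat <- merge(factorial.data, flatten.matrix.cols(data1),
                     by = c("a.factor", "b.factor"), all.x = TRUE)
print(merged.flat)
```

With the matrix column flattened, the unmatched combination (a.factor "b", b.factor 2) should come back with NA in both y.1 and y.2, as in merged.data2 above.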