Vincent Warmerdam
added a comment - 09/Sep/15 22:57 - edited
It just occurred to me that there is a very similar issue with machine learning. In R you can pass a date/timestamp into a model and it will be treated as if it were numeric.
> df <- data.frame(d = as.Date('2014-01-01') + 1:100, r = runif(100) + 0.5 * 1:100)
> lm(r ~ d, data = df)
Call:
lm(formula = r ~ d, data = df)
Coefficients:
(Intercept)            d
 -7994.9971       0.4975
I'm not sure if Spark wants to have similar support but it may be something to keep in mind; the problem seems similar.
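
To make the "treated as numeric" behaviour concrete, here is a minimal sketch in plain Python (standard library only). It relies on the fact that R stores a Date as the number of days since 1970-01-01, and it assumes that `lm` simply regresses on that count. The helper names `days_since_epoch` and `fit_line` are hypothetical, and the `runif` noise from the R example is dropped so the fit is deterministic.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def days_since_epoch(d):
    """Numeric value a date takes when coerced: whole days since 1970-01-01."""
    return (d - EPOCH).days


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Same shape as the R data frame: 100 consecutive days starting 2014-01-02.
start = date(2014, 1, 1)
dates = [start + timedelta(days=i) for i in range(1, 101)]

# Response grows by 0.5 per day; the random noise is omitted on purpose.
y = [0.5 * i for i in range(1, 101)]

# The "silent" coercion: each date becomes a day count before fitting.
x = [days_since_epoch(d) for d in dates]

intercept, slope = fit_line(x, y)
print(days_since_epoch(start))  # 16071
print(intercept, slope)
```

The large negative intercept in the R output comes from that day count: the predictor values sit around 16,000, so the fitted line has to be extrapolated all the way back to day zero.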

Shivaram Venkataraman
added a comment - 09/Sep/15 20:05
Thanks for the report. I think this is a problem in the Spark SQL layer (so it should also happen in Scala and Python), as we don't support summarizing DateType fields.
cc Reynold Xin, Davies Liu