warnings about factor levels dropped from predict.glm

I am helping a student with some logistic regression analyses and we are
getting some strange inconsistencies regarding a warning about factor
levels being dropped when running predict.glm(, newdata = ournewdata) on
the logistic regression model object. We have checked multiple times that
the factor levels have been defined similarly on both data sets (the one used
to estimate the model and the newdata) and that values occur for all factor
levels in both data sets. When I run these commands on my version of R
(3.2.5) on a Windows 7 OS I do not get the warnings. When the student runs
them on her version of R (not sure what number hers is) on her Mac, she
gets these warnings constantly. I've checked some records manually by
doing the algebra and the predict.glm() function is working correctly
incorporating the factor levels on my machine. Any thoughts???

> Brian
>
> Brian S. Cade, PhD
>
> U. S. Geological Survey
> Fort Collins Science Center
> 2150 Centre Ave., Bldg. C
> Fort Collins, CO 80526-8818
>
> email: [hidden email] <[hidden email]>
> tel: 970 226-9326

Re: warnings about factor levels dropped from predict.glm

I suspect that the warning may be coming from stats::model.frame.default(), with text along the lines of:

"contrasts dropped from factor YOUR.FACTOR.NAME due to missing levels"

You might want to check whether the student has a ~/.Rprofile file that changes default options, such as those controlling contrasts.
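A minimal sketch of that check, to be run on both machines and compared. The variable names are only illustrative; the default values shown are the ones R ships with.

```r
# Contrast settings in effect for the current session.
current <- getOption("contrasts")

# Defaults that R ships with (see ?options).
default <- c(unordered = "contr.treatment", ordered = "contr.poly")

# TRUE if something (for example an .Rprofile) has changed the contrast defaults.
differs <- !identical(unname(current), unname(default))

print(current)
print(differs)
```

If differs is TRUE on the Mac but FALSE on the Windows machine, a startup file is a likely suspect.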

Check to see if there is some change/difference in the structure of the data frames in use, specifically any contrast related attributes on the relevant data frame columns that are different on the two systems. See ?str.
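One way to make that comparison concrete is sketched below. The data frames train and newdat, the column site, and the helper describe_factors() are hypothetical stand-ins for the student's objects; the contrasts attribute is attached to newdat on purpose, simply to show what the check would flag.

```r
# Hypothetical stand-ins for the estimation data and the prediction data.
train <- data.frame(y = c(0, 1, 0, 1, 1, 0),
                    site = factor(c("a", "b", "c", "a", "b", "c")))
newdat <- data.frame(site = factor(c("a", "b", "c")))

# Simulate a stray contrasts attribute on the newdata factor.
contrasts(newdat$site) <- contr.sum(3)

# For every factor column, report its levels and whether a
# "contrasts" attribute is attached to the column itself.
describe_factors <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  lapply(df[is_fac], function(x) {
    list(levels = levels(x),
         has_contrasts = !is.null(attr(x, "contrasts")))
  })
}

train_info <- describe_factors(train)
new_info <- describe_factors(newdat)

# Levels should match exactly, including their order.
same_levels <- identical(train_info$site$levels, new_info$site$levels)

str(train_info)
str(new_info)
```

Running describe_factors() on the real data frames on both systems and comparing the output should show whether a column carries a contrasts attribute on one machine but not the other.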

Have them open an R session from the macOS terminal and run R using:

R --vanilla

to see whether the same warnings appear on her system. If they do not, it suggests that her .Rprofile file has something non-default in it, and/or that there is a .RData file in her working directory containing saved workspace objects that cause a conflict, since that file is loaded by default when a new R session starts.
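As a complement to the --vanilla test, the following sketch lists the startup files that may be present. It assumes the session is started in the student's usual working directory; the names in startup_files are only labels chosen for this example.

```r
# Files that R may read at startup unless it is started with --vanilla.
startup_files <- c(user_profile    = path.expand("~/.Rprofile"),
                   project_profile = file.path(getwd(), ".Rprofile"),
                   saved_workspace = file.path(getwd(), ".RData"))

# Which of them exist on this machine?
found <- file.exists(startup_files)
names(found) <- names(startup_files)
print(found)

# Objects in the global workspace; anything restored from .RData shows up here.
print(ls(envir = globalenv()))
```

Any file reported as TRUE is worth opening (for .Rprofile) or temporarily renaming (for .RData) before rerunning the predict.glm() call.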

Regards,

Marc Schwartz

> On Dec 5, 2017, at 3:37 PM, Bert Gunter <[hidden email]> wrote:
>
> A guess (treat accordingly):
>
> Different BLAS versions are in use on the two different machines/versions.
> In one, near singularities are handled, and in the other they are not,
> percolating up to warnings at the R level.
>
> You can check this by seeing whether the estimated fit is the same on the 2
> machines. If so, ignore the above.
>
> -- Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Tue, Dec 5, 2017 at 12:17 PM, Cade, Brian <[hidden email]> wrote:
>
>> [...]