I have passenger count data as Y. The data look like this; many of the values are 1 (about 18%).

Does it make sense to take the log of Y and use it as the dependent variable in a generalized linear model with a Poisson distribution?

I know the link function for the Poisson distribution is log. Is it a problem that I would effectively be taking a double log of Y? My concern is that the log(Y) model has a much better goodness-of-fit statistic than the Y model. I tried several Poisson and negative binomial models, and they do not fit very well.

You will have a problem when the counts are 0, which should happen with a Poisson distribution, because log(0) is undefined. You probably mean to use something like log(Y+1).
– Michael Chernick, Aug 10 '12 at 19:02

@MichaelChernick My question would be: if I take log(Y+1) as the dependent variable, how can I interpret the model?
– Seen, Aug 10 '12 at 19:47


You think that log(Y) has a Poisson distribution?
– jkd, Aug 10 '12 at 20:09


What is Y? And what is "double log"? If Y is a count (non-negative integer) then log Y is not a count. Count models often need a negative binomial regression, or some other variation on count regression.
– Peter Flom♦, Aug 10 '12 at 20:25

@PeterFlom So do you think I can apply a Poisson model to logP on the graph?
– Seen, Aug 10 '12 at 21:15

3 Answers

You can't apply a Poisson model to the variable called logP on your graph because it includes non-integers. A Poisson model can only be used for non-negative integers. You can probably still fit it in your software and get interpretable results, but you are not really using a Poisson model.

As @PeterFlom says, if your original variable is a count then log Y is not. If the original variable is a count and a Poisson model does not fit, then try a negative binomial model before you give up and start transforming the variable.

But you can apply a Poisson-like model to non-integers. All you need to do is modify the likelihood function to use the log-gamma function in place of the factorial, i.e. $\log\Gamma(Y+1)$ instead of $\log Y!$. Whether this is fully sensible and whether it is fully interpretable is another matter, but it quite possibly could be. I agree with trying a negative binomial model, where an additional scale parameter is estimated to relax the $E(Y) = Var(Y)$ constraint of the Poisson.
– Joshua, Aug 11 '12 at 1:33


Yes, this is what I meant by "not really using a Poisson model" if you do that.
– Peter Ellis, Aug 11 '12 at 6:22
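The log-gamma modification discussed in these comments can be sketched as follows, using only the Python standard library. The function name `poisson_loglik` and the example values are assumptions made for illustration. For integer responses the expression is the ordinary Poisson log-likelihood; for non-integer responses it still evaluates, but it is no longer the log of a proper probability mass function, which is the sense in which it is "not really" a Poisson model.

```python
import math


def poisson_loglik(y, mu):
    """Poisson-type log-likelihood with log(y!) written as lgamma(y + 1).

    y  : observed responses (integers, or non-integers such as log counts)
    mu : fitted means, all strictly positive
    """
    return sum(
        yi * math.log(mi) - mi - math.lgamma(yi + 1.0)
        for yi, mi in zip(y, mu)
    )


# Integer responses: identical to the usual Poisson log-likelihood.
int_ll = poisson_loglik([0, 1, 3], [1.0, 1.0, 2.0])

# Non-integer responses (illustrative values): still computable.
frac_ll = poisson_loglik([0.0, 0.69, 1.10], [1.0, 1.0, 2.0])

print(int_ll, frac_ll)
```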

Your data may be zero-inflated (maybe more than 70% of the responses were zeros?). If both Poisson regression and negative binomial regression fit badly, you should try a zero-inflated Poisson or even a zero-inflated negative binomial model. These mixture models often perform better on such data than transforming the response.
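A quick way to judge whether zero inflation is plausible is to compare the observed share of zeros with the share a Poisson distribution with the same mean would produce, $P(Y=0)=e^{-\bar{y}}$. The sketch below uses only the Python standard library; the `counts` list is invented for illustration, and the comparison ignores regressors, so treat it as a screening step only. Overdispersion without a separate zero process can also raise the zero share.

```python
import math


def zero_excess(counts):
    """Return (observed zero fraction, zero fraction expected under Poisson)."""
    n = len(counts)
    observed = sum(1 for c in counts if c == 0) / n
    # P(Y = 0) for a Poisson distribution whose mean equals the sample mean.
    expected = math.exp(-sum(counts) / n)
    return observed, expected


# Hypothetical counts with many zeros, invented for illustration only.
counts = [0, 0, 0, 0, 0, 0, 0, 3, 5, 2]

observed, expected = zero_excess(counts)
print(f"observed zeros: {observed:.2f}, Poisson-expected zeros: {expected:.2f}")
```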

You have given too little information to say much! Assuming you also have some (unstated) regressors $x$, you can use the old trick from before the time of GLMs: fit an ordinary linear regression after applying the variance-stabilizing transformation, which for the Poisson distribution is $\sqrt{Y}$. That is often a useful approach!

Note, however, that for count variables a multiplicative model is often natural, and the usual Poisson (or negative binomial) regression has a multiplicative expectation structure. If an additive model is adequate in your case, you can use the "trick" mentioned above.
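The variance-stabilizing property of $\sqrt{Y}$ can be checked with a small simulation. The sketch below uses only the Python standard library; because it has no built-in Poisson sampler, the helper `poisson_draw` (a name chosen here) implements Knuth's multiplication method. The means 5, 20 and 50 and the sample size are arbitrary choices for illustration.

```python
import math
import random
from statistics import variance


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


rng = random.Random(0)  # fixed seed so the run is reproducible
results = {}
for lam in (5, 20, 50):
    draws = [poisson_draw(lam, rng) for _ in range(10000)]
    raw_var = variance(draws)                            # grows with the mean
    stab_var = variance([math.sqrt(d) for d in draws])   # stays near 1/4
    results[lam] = (raw_var, stab_var)

for lam, (raw_var, stab_var) in results.items():
    print(f"mean {lam}: var(Y) = {raw_var:.2f}, var(sqrt(Y)) = {stab_var:.3f}")
```

The variance of the raw counts tracks the mean, while the variance of the square roots stays close to 1/4 across means, which is what makes ordinary least squares on $\sqrt{Y}$ a reasonable approximation when an additive model is adequate.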