ABSTRACT

R. E. Tarone of the National Cancer Institute, Bethesda, Maryland, derives tests for the goodness of fit of the binomial distribution using the C(α) procedure of Neyman (1959); these tests are asymptotically optimal against the generalized binomial alternatives proposed by Altham (1978) and Kupper & Haseman (1978). Before turning to the article, I explain the binomial and related distributions. I have reproduced key parts of the article; readers interested in further details are advised to consult the references at the end of the report.

BINOMIAL AND RELATED DISTRIBUTIONS

Bernoulli trial

A Bernoulli trial (named after James Bernoulli, one of the founding fathers of probability theory) is an experiment with two, and only two, possible outcomes [2]. For example: female or male, life or death, head or tail, success or failure. A sequence of Bernoulli trials occurs when a Bernoulli experiment is performed several independent times so that the probability of success, say p, remains the same from trial to trial.

Bernoulli distribution

A random variable X is defined to have a Bernoulli distribution if the discrete density function of X is given by

f(x) = p^x (1 - p)^{1-x},  x = 0, 1,

where 0 ≤ p ≤ 1; 1 - p is often denoted by q [1]. The mean and variance are

E[X] = 0·q + 1·p = p,
Var[X] = E[X^2] - {E[X]}^2 = 0^2·q + 1^2·p - p^2 = pq.

The moment generating function is

M_X(t) = E[e^{tX}] = \sum_{x=0}^{1} e^{tx} p^x (1 - p)^{1-x} = q + pe^t.

Example 1: Out of millions of instant lottery tickets, suppose that 20% are winners. If five such tickets are purchased, (0, 0, 0, 1, 0) is a possible observed sequence in which the fourth ticket is a winner and the other four are losers. Assuming independence among winning and losing tickets, the probability of this outcome is (0.8)(0.8)(0.8)(0.2)(0.8) = (0.2)(0.8)^4 [5].

In a sequence of Bernoulli trials, we are often interested in the total number of successes and not in the order of their occurrence. If we let the random variable X equal the number of observed successes in n Bernoulli trials, the possible values of X are 0, 1, 2, …, n. If x successes occur, where x = 0, 1, 2, …, n, then n - x failures occur. The number of ways of selecting the x positions for the x successes in the n trials is

\binom{n}{x} = \frac{n!}{x!(n-x)!}.

Since the trials are independent and since the probabilities of success and failure on each trial are, respectively, p and q = 1 - p, the probability of each of these ways is p^x (1 - p)^{n-x}. Thus f(x), the p.m.f. of X, is the sum of the probabilities of these mutually exclusive events, that is

f(x) = \binom{n}{x} p^x (1 - p)^{n-x}  for x = 0, 1, 2, …, n.

These probabilities are called binomial probabilities, and the random variable X is said to have a binomial distribution. A binomial experiment satisfies the following properties:

1. A Bernoulli experiment is performed n times.
2. The trials are independent.
3. The probability of success on each trial is a constant p; the probability of failure is q = 1 - p.
4. The random variable X equals the number of successes in the n trials.

A binomial distribution is denoted by the symbol b(n, p), and we say that the distribution of X is b(n, p). The constants n and p are called the parameters of the binomial distribution. Thus if we say that the distribution of X is b(10, 1/5), we mean that X is the number of successes in a random sample of size n = 10 from a binomial distribution with p = 1/5.

The binomial distribution derives its name from the fact that the (n + 1) terms in the binomial expansion of (q + p)^n correspond to the various values of b(x; n, p) for x = 0, 1, 2, …, n. That is,

(q + p)^n = \binom{n}{0} q^n + \binom{n}{1} p q^{n-1} + \binom{n}{2} p^2 q^{n-2} + … + \binom{n}{n} p^n.

Since q + p = 1, we see that

\sum_{x=0}^{n} b(x; n, p) = 1,

a condition that must hold for any probability distribution.
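As a quick numerical check, the binomial probabilities can be computed with Python's standard library and verified to sum to one; the values n = 10, p = 0.2 below are illustrative:

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial probability b(x; n, p) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.2  # illustrative parameter values
total = sum(binom_pmf(x, n, p) for x in range(n + 1))
print(total)  # sums to 1, since (q + p)^n = 1
```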

Example 2: If we want to find the probability of obtaining exactly three 2s when an ordinary die is tossed 4 times, the probability is

b(3; 4, 1/6) = \binom{4}{3} (1/6)^3 (5/6)^1 = 5/324 ≈ 0.0154.

The mean, variance and moment generating function of the binomial distribution

The moment generating function of X ~ b(n, p) is

M_X(t) = E[e^{tX}] = \sum_{x=0}^{n} e^{tx} \binom{n}{x} p^x q^{n-x} = (pe^t + q)^n.

Now taking the first derivative of M_X(t):

M'_X(t) = npe^t (pe^t + q)^{n-1},

and the second derivative:

M''_X(t) = n(n-1)(pe^t)^2 (pe^t + q)^{n-2} + npe^t (pe^t + q)^{n-1}.

Hence E[X] = M'_X(0) = np and

Var[X] = M''_X(0) - {M'_X(0)}^2 = n(n-1)p^2 + np - n^2p^2 = np(1 - p) = npq.
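The moments obtained from the mgf can be cross-checked directly against the pmf; a minimal sketch with illustrative values n = 10, p = 0.2:

```python
from math import comb

n, p = 10, 0.2  # illustrative parameter values
q = 1 - p
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]
mean = sum(x * f for x, f in enumerate(pmf))
var = sum(x * x * f for x, f in enumerate(pmf)) - mean**2
print(mean, var)  # E[X] = np = 2.0, Var[X] = npq = 1.6
```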

Example 3: If the mgf of a random variable X is

M(t) = (2/3 + (1/3)e^t)^5,

then X has a binomial distribution with n = 5 and p = 1/3; that is, the pmf of X is

f(x) = \binom{5}{x} (1/3)^x (2/3)^{5-x},  x = 0, 1, 2, …, 5.

Here μ = np = 5/3 and σ^2 = np(1 - p) = 10/9.

Note: The binomial distribution reduces to the Bernoulli distribution when n = 1. Sometimes the Bernoulli distribution is called the point binomial.

Example 4: Let the random variable Y be equal to the number of successes throughout n independent repetitions of a random experiment with probability p of success. That is, Y is b(n, p). The ratio Y/n is called the relative frequency of success. Now recall Chebyshev's inequality, i.e.

P(|X - μ| ≥ ε) ≤ σ^2/ε^2  for all ε > 0.

Applying this result, we have for all ε > 0 that

P(|Y/n - p| ≥ ε) ≤ Var(Y/n)/ε^2 = p(1 - p)/(nε^2).

Now, for every fixed ε > 0, the right-hand member of the preceding inequality is close to zero for sufficiently large n. That is,

lim_{n→∞} P(|Y/n - p| ≥ ε) = 0.

Since this is true for every fixed ε > 0, we see, in a certain sense, that the relative frequency of success is, for large values of n, close to the probability p of success [3].
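This convergence of the relative frequency to p can be illustrated by simulation; the success probability p = 0.3, the trial count and the seed below are arbitrary choices:

```python
import random

random.seed(0)  # fixed seed for reproducibility
p, n = 0.3, 100_000  # illustrative success probability and number of trials
y = sum(1 for _ in range(n) if random.random() < p)

# Chebyshev bound on P(|Y/n - p| >= eps) for eps = 0.01:
eps = 0.01
bound = p * (1 - p) / (n * eps**2)
print(abs(y / n - p), bound)  # observed deviation is well below eps
```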

Example 5: Let the independent random variables X1, X2, X3 have the same cdf F(x). Let Y be the middle value of X1, X2, X3. To determine the cdf of Y, say F_Y(y) = P(Y ≤ y), we note that Y ≤ y if and only if at least two of the random variables X1, X2, X3 are less than or equal to y. Let us say that the ith trial is a success if X_i ≤ y, i = 1, 2, 3; here each trial has the probability of success F(y). In this terminology, F_Y(y) = P(Y ≤ y) is then the probability of at least two successes in three independent trials. Thus

F_Y(y) = \binom{3}{2} [F(y)]^2 [1 - F(y)] + [F(y)]^3.

If F(x) is a continuous cdf so that the pdf of X is F'(x) = f(x), then the pdf of Y is

f_Y(y) = F'_Y(y) = 6 F(y)[1 - F(y)] f(y). [4]

MULTINOMIAL DISTRIBUTION

Recall that in order for an experiment to be binomial, two outcomes are required for each trial. But if each trial in an experiment has more than two outcomes, a distribution called the multinomial distribution must be used. For example, a survey might require the responses of "approve," "disapprove," or "no opinion." In another situation, a person may have a choice of one of five activities for Friday night, such as a movie, dinner, baseball game, play, or party. Since these situations have more than two possible outcomes for each trial, the binomial distribution cannot be used to compute probabilities. If X consists of events E1, E2, E3, …, Ek, which have corresponding probabilities p1, p2, p3, …, pk of occurring, and X1 is the number of times E1

will occur, X2 is the number of times E2 will occur, X3 is the number of times E3 will occur, and so on, then the probability that X will occur is

P(X) = \frac{n!}{X_1! X_2! X_3! \cdots X_k!} p_1^{X_1} p_2^{X_2} \cdots p_k^{X_k},

where X1 + X2 + X3 + … + Xk = n and p1 + p2 + p3 + … + pk = 1.

For illustration, suppose a box contains four white balls, three red balls, and three blue balls. A ball is selected at random, its color is written down, and it is replaced each time. We want to find the probability that if five balls are selected, two are white, two are red, and one is blue.
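A sketch of this computation with the multinomial formula above (the helper function name is my own):

```python
from math import factorial

def multinomial_pmf(counts, probs):
    """n! / (x1! ... xk!) * p1^x1 * ... * pk^xk."""
    coef = factorial(sum(counts))
    for x in counts:
        coef //= factorial(x)
    prob = float(coef)
    for x, pr in zip(counts, probs):
        prob *= pr**x
    return prob

# Two white, two red, one blue in five draws, with replacement,
# from a box of 4 white, 3 red and 3 blue balls:
answer = multinomial_pmf([2, 2, 1], [0.4, 0.3, 0.3])
print(answer)  # 5!/(2! 2! 1!) * 0.4^2 * 0.3^2 * 0.3 = 0.1296
```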

Correlated binomial (CB) distribution [4]

This distribution is derived on the assumption that the binary responses of the fetuses in a litter are not mutually independent. This idea is due to Bahadur (1961). Retaining only the first-order correlation between the responses and denoting by θ the covariance between the binary responses of any two fetuses, the random variable X is such that

P(X = x) = \binom{n}{x} p^x q^{n-x} [ 1 + \frac{θ}{2p^2q^2} {(x - np)^2 + x(2p - 1) - np^2} ],  x = 0, 1, …, n,

where p is the probability that the fetus is abnormal and q = 1 - p. Note that for the above equation to be a valid probability distribution, a data-dependent bound for the parameters has to be imposed; see Kupper and Haseman (1978). It can be shown that the expectation and variance of the correlated binomial distribution are np and np(1 - p) + n(n - 1)θ, respectively. Thus the correlated binomial distribution is a generalization of the binomial distribution; the CB distribution becomes the binomial distribution when θ = 0. Altham (1978) derived a further two-parameter generalized binomial distribution, namely the multiplicative generalized binomial (MB) distribution.
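The stated moments can be checked numerically from Bahadur's first-order representation of the CB probability function; the parameter values below are illustrative, and θ must be small enough that all probabilities stay non-negative:

```python
from math import comb

def cb_pmf(x, n, p, theta):
    """Correlated binomial pmf: binomial term times Bahadur's
    first-order correction (theta is the pairwise covariance)."""
    q = 1 - p
    base = comb(n, x) * p**x * q**(n - x)
    corr = 1 + theta / (2 * p**2 * q**2) * (
        (x - n * p)**2 + x * (2 * p - 1) - n * p**2
    )
    return base * corr

n, p, theta = 8, 0.3, 0.01  # illustrative values
pmf = [cb_pmf(x, n, p, theta) for x in range(n + 1)]
mean = sum(x * f for x, f in enumerate(pmf))
var = sum(x * x * f for x, f in enumerate(pmf)) - mean**2
# pmf sums to 1; mean is np; variance is np(1-p) + n(n-1)*theta
print(sum(pmf), mean, var)  # approx. 1, 2.4 and 2.24
```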

Altham multiplicative binomial distribution

Under Altham's (1978) multiplicative generalization, the probability mass function of X is proportional to \binom{n}{x} p^x q^{n-x} θ^{x(n-x)}; the MB distribution reduces to the binomial distribution when θ = 1.

The C(α) test procedure

Neyman (1959) introduced the C(α) test with the consideration that hypothesis testing problems in applied research often involve several nuisance parameters. In these composite testing problems, most powerful tests do not exist, motivating the search for an optimal test procedure that yields the highest power among the class of tests attaining the same size. Neyman's local asymptotic optimality result for the C(α) test employs regularity conditions inherited from the conditions used by Cramér (1946) for showing consistency of the MLE, together with some further restrictions on the test function that allow the unknown nuisance parameters to be replaced by root-n consistent estimators. It is the confluence of these Cramér conditions and the maintained significance level α that gives the C(α) test its name.

TESTING THE GOODNESS OF FIT OF THE BINOMIAL DISTRIBUTION*

R. E. Tarone of the National Cancer Institute, Bethesda, Maryland, derives tests for the goodness of fit of the binomial distribution using the C(α) procedure of Neyman (1959); these tests are asymptotically optimal against the generalized binomial alternatives proposed by Altham (1978) and Kupper & Haseman (1978) [5].

The C(α) test for correlated binomial alternatives

Consider an experiment in which the responses take the form of proportions, and let the ith response be given by p_i = x_i/n_i for i = 1, …, M. Under the correlated binomial model the log likelihood function is

L = K + \sum_{i=1}^{M} { x_i log p + (n_i - x_i) log q } + \sum_{i=1}^{M} log [ 1 + \frac{θ}{2p^2q^2} {(x_i - n_i p)^2 + x_i(2p - 1) - n_i p^2} ],

where q = 1 - p and K is a constant involving only the observations. A test of the goodness of fit of the binomial distribution is obtained by testing the null hypothesis H0: θ = 0 in the presence of the nuisance parameter p. Moran (1970) demonstrated that for such problems the C(α) tests proposed by Neyman (1959) are asymptotically equivalent to tests using maximum likelihood


estimators. In order to derive the C(α) test statistic for H0: θ = 0, we need the following partial derivatives of L evaluated at θ = 0:

Under the null hypothesis, the x_i are independent binomial random variables, and hence it follows from (2) that E{S_2(p)} = 0. Neyman (1959) has shown that when E{S_2(p)} = 0 the null hypothesis H0: θ = 0 can be tested using the statistic S_1(p̂), where p̂ is a root-n consistent estimator of p (Moran, 1970). Substituting the consistent estimator

p̂ = \sum x_i / \sum n_i

into (1) and defining

S = S_1(p̂) = \frac{1}{2p̂^2q̂^2} [ \sum_{i=1}^{M} (x_i - n_i p̂)^2 - p̂q̂ \sum_{i=1}^{M} n_i ],

we find that the C(α) test statistic is given by S.

Since E{S_2(p)} = 0, the variance of S(p̂) is given by E{S_3(p)}, where the expectation is taken under H0: θ = 0. From (3) it follows that

E{S_3(p)} = \sum n_i(n_i - 1) / (2p^2q^2).

Substituting p̂ for p in the variance expression, we find that under the null hypothesis the statistic

X_c^2 = S^2 / [ \sum n_i(n_i - 1) / (2p̂^2q̂^2) ] = [ \sum (x_i - n_i p̂)^2/(p̂q̂) - \sum n_i ]^2 / [ 2 \sum n_i(n_i - 1) ]

will have an asymptotic chi-squared distribution with one degree of freedom.

The statistic X_c^2 is the C(α) test statistic for homogeneity of proportions which is asymptotically optimal against correlated binomial alternatives. The binomial variance test for homogeneity is based on the statistic

X_v^2 = \sum_{i=1}^{M} (x_i - n_i p̂)^2 / (n_i p̂q̂),


which has an asymptotic chi-squared distribution with M - 1 degrees of freedom when θ = 0. It is clear from the above expressions that for the case in which n_i = n for all i, the C(α) test statistic S is equivalent to the variance test statistic X_v^2.

The C(α) test for beta-binomial alternatives

The beta-binomial distribution is a mixture of binomial distributions which has often been utilized as an alternative to the binomial distribution. Under the beta-binomial model the log likelihood function is given by

where K is a constant involving only the observations. A test of the goodness of fit of the binomial distribution is obtained by testing the null hypothesis H0: θ = 0. The derivation of the C(α) test statistic using the beta-binomial model is similar to the derivation for the correlated binomial model, and the optimal statistic again is found to be the statistic S derived in the last section. Note, however, that in the beta-binomial model the parameter θ cannot take negative values. The alternative hypothesis is necessarily one-sided, and hence the C(α) test is the one-sided test based on the statistic

Z = S / [ \sum n_i(n_i - 1) / (2p̂^2q̂^2) ]^{1/2}.

Under the null hypothesis H0: θ = 0, the statistic Z will have an asymptotic standard normal distribution.

The C(α) test for Altham's multiplicative alternatives

The multiplicative generalization of the binomial distribution provides an alternative for which the correlated binomial C(α) test is not asymptotically optimal. The log likelihood function for the multiplicative generalization of the binomial model is

L = K + \sum_{i=1}^{M} { x_i log p + (n_i - x_i) log q + x_i(n_i - x_i) log θ - log c(n_i; p, θ) },

where c(n_i; p, θ) is a normalizing constant and K involves only the observations. The C(α) test for H0: θ = 1 is based on the statistic R = \sum x_i(n_i - x_i). Note that, unlike the correlated binomial C(α) statistic, R is not equivalent to the variance test statistic in the case n_i = n for all i. The standardized form of R, denoted X_m^2, will have an asymptotic chi-squared distribution with one degree of freedom. The test based on X_m^2 is asymptotically optimal against alternatives given by the multiplicative generalization of the binomial model.

Monte Carlo Study & Asymptotic Relative Efficiencies

In order to compare the different tests of the goodness of fit of the binomial distribution, we consider the treatment group data of Kupper & Haseman (1978, p. 75). The observed proportions were 0/5, 2/5, 1/7, 0/8, 2/8, 3/8, 0/9, 4/9, 1/10 and 6/10. The variance test gives X_v^2 = 19.03 and P = 0.025; the correlated binomial C(α) test gives X_c^2 = 6.63 and P = 0.01. Thus for this example, the correlated binomial C(α) test is more sensitive to the departure of the observed proportions from a binomial distribution than the other tests considered.
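As a check, both reported values can be reproduced in a few lines of Python; the explicit formulas used here are my reconstruction of the variance test and correlated binomial C(α) statistics discussed above:

```python
x = [0, 2, 1, 0, 2, 3, 0, 4, 1, 6]    # observed responders (Kupper & Haseman, 1978)
n = [5, 5, 7, 8, 8, 8, 9, 9, 10, 10]  # corresponding sample sizes
N = sum(n)
p = sum(x) / N  # pooled estimate of p
q = 1 - p

# Variance (dispersion) test statistic, M - 1 degrees of freedom:
X2v = sum((xi - ni * p)**2 / (ni * p * q) for xi, ni in zip(x, n))

# Correlated binomial C(alpha) statistic, 1 degree of freedom:
S = sum((xi - ni * p)**2 for xi, ni in zip(x, n)) / (p * q) - N
X2c = S**2 / (2 * sum(ni * (ni - 1) for ni in n))

print(round(X2v, 2), round(X2c, 2))  # 19.03 and 6.63, as reported
```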


In order to investigate the small-sample distribution of the C(α) tests under the null hypothesis, a Monte Carlo experiment was performed. Ten binomial proportions were randomly generated using the unequal sample sizes from the above example. For each pseudorandom sample of 10 proportions the C(α) statistic X_c^2 and the variance test statistic X_v^2 were calculated and compared to the 10%, 5% and 1% points of their asymptotic null distributions. The empirical significance levels based on 1500 replications are shown in Table 1 for underlying binomial probabilities of 0.10, 0.25 and 0.50. For the cases considered, the empirical significance levels for the correlated binomial C(α) statistic are significantly lower than the nominal level for the 5% and 10% critical values. The empirical significance levels for the 1% critical value show no consistent pattern.