Wednesday, 31 August 2011

Wilks' lambda & Discriminant analysis and its Weaknesses

Today’s class was about discriminant analysis in SPSS. Based on my understanding from class and some research on the net, here are a few things about discriminant analysis and Wilks' lambda. I have also mentioned some of the problems with discriminant analysis.

To start with, discriminant analysis is done by creating a new variable that is a combination of the original predictors. This is done in such a way that the differences between the predefined groups, with respect to the new variable, are maximised. When there are two classes, the maximum number of discriminant functions is one, and each discriminant function has an associated eigenvalue. Larger eigenvalues are associated with greater class separation, and they can also be used to measure the relative importance of discriminant functions in multi-class analyses.

Eigenvalues can be converted into Wilks' lambda, a multivariate test statistic whose value ranges between zero and one. Values close to zero indicate that the class means are different, while values close to one indicate that the class means are not different. Wilks' lambda will be one when the means are identical. For a two-class problem, Wilks' lambda is equal to 1/(1+λ), where λ is the eigenvalue, and it can be converted into a chi-square statistic so that a significance test can be applied. Wilks' lambda can also be converted into a canonical correlation coefficient by taking the square root of (1 - Wilks' lambda). The canonical correlation is the square root of the ratio of the between-groups sum of squares to the total sum of squares. Squared, it is the proportion of the total variability explained by differences between classes. Thus, if all of the variability in the predictors was a consequence of the group differences, the canonical correlation would be one, while if none of the variability was due to group differences, the canonical correlation would be zero.

Discriminant analysis assumes that the covariance matrices of the classes do not differ. There are formal significance tests for this assumption (e.g. Box’s M), but they are not very robust. In particular, they are generally thought to be too powerful, i.e. the null hypothesis is rejected even when there are only minor differences, and Box’s M is also susceptible to deviations from multivariate normality (another assumption).
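To check the relationships between the eigenvalue, Wilks' lambda and the canonical correlation described above, here is a small Python sketch. It is only an illustration under my own assumptions: the data are made up, there is just one predictor and two groups (so the eigenvalue reduces to the between-groups sum of squares divided by the within-groups sum of squares), and the chi-square conversion uses Bartlett's approximation, which may not match the SPSS output exactly. All function names are my own.

```python
import math


def sums_of_squares(groups):
    """Return (between, within, total) sums of squares for one predictor."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    total = sum((x - grand_mean) ** 2 for x in all_values)
    return between, within, total


def wilks_from_eigenvalue(eigenvalue):
    """Two-class case: Wilks' lambda = 1 / (1 + eigenvalue)."""
    return 1.0 / (1.0 + eigenvalue)


def canonical_correlation(wilks_lambda):
    """Two-class case: canonical correlation = sqrt(1 - Wilks' lambda)."""
    return math.sqrt(1.0 - wilks_lambda)


def bartlett_chi_square(wilks_lambda, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation and its degrees of freedom."""
    factor = n_cases - 1 - (n_predictors + n_groups) / 2.0
    chi2 = -factor * math.log(wilks_lambda)
    df = n_predictors * (n_groups - 1)
    return chi2, df


# Made-up scores on a single predictor for two groups (illustration only).
group_a = [4.0, 5.0, 6.0, 5.0]
group_b = [7.0, 8.0, 9.0, 8.0]

between, within, total = sums_of_squares([group_a, group_b])
eigenvalue = between / within  # single-predictor, two-group case
wilks_lambda = wilks_from_eigenvalue(eigenvalue)
canon_r = canonical_correlation(wilks_lambda)
chi2, df = bartlett_chi_square(
    wilks_lambda, n_cases=len(group_a) + len(group_b), n_predictors=1, n_groups=2
)

print("eigenvalue:", eigenvalue)
print("Wilks' lambda:", wilks_lambda)
print("canonical correlation:", canon_r)
print("chi-square:", chi2, "df:", df)
```

In this sketch Wilks' lambda also equals the within-groups sum of squares divided by the total, and the squared canonical correlation equals the between-groups sum of squares divided by the total, which agrees with the description above.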

Each discriminant function can be summarised by three sets of coefﬁcients:

Coming to the problems with discriminant analysis:

Inconsistent - Because Fisher's and Mahalanobis's approaches to discriminant analysis involve dissimilar processes, the resulting solutions are not alike.

Unintuitive - Because its mechanics are complex, discriminant analysis is an unwieldy tool for all but the mathematically savvy. Similar tools, such as multiple regression, are just as flexible without carrying along the intricacy and specificity associated with discriminant analysis.

Prediction - The prediction available through discriminant analysis is not true prediction. Discriminant analysis can only tell a researcher the likely grouping of a certain data point; it cannot tell the researcher other properties of the data point or how likely it is that the data point is a member of the classified group.

In spite of all these weaknesses, discriminant analysis has a wide range of applications across many fields.