~ Knowing is not enough; we must apply. Willing is not enough; we must do.

Category Archives: Statistics

My friends, we are not going to discuss ancient Chinese philosophy, but mathematics. In this era of social networking and Big Data, every data scientist wants more connections in the social network to crunch, because more connections (i.e., more edges in the graph) mean more information, right? So today’s quiz is: which of the graphs above contains more information?

I have been developing a comprehensive machine learning library of advanced algorithms, called SMILE (Statistical Machine Intelligence and Learning Engine), for several years in my spare time. Today I am very pleased to announce that SMILE is now available on GitHub under the Apache 2.0 license. SMILE is self-contained and requires only the standard Java library. With advanced data structures and learning algorithms, SMILE achieves state-of-the-art performance.

In statistics, the method of maximum likelihood is widely used to estimate an unobservable population parameter $\theta$ by choosing the value that maximizes the log-likelihood function

$$L(\theta; \mathcal{X}) = \sum_{i=1}^{n} \log p(x_i \mid \theta)$$

where the observations $\mathcal{X} = \{x_1, \ldots, x_n\}$ are independently drawn from the distribution $p(x \mid \theta)$ parameterized by $\theta$. The Expectation-Maximization (EM) algorithm is a general approach to iteratively compute the maximum-likelihood estimates when the observations can be viewed as incomplete data and one assumes the existence of additional but missing data corresponding to $\mathcal{X}$. The observations together with the missing data are called complete data.
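To make the incomplete-data idea concrete, here is a minimal sketch of EM on a toy problem that is not taken from the post: a one-dimensional mixture of two Gaussians, where the missing data are the unobserved component labels. The function names (`normal_pdf`, `log_likelihood`, `em_step`, `fit`) and the initialization are illustrative assumptions, and this is not SMILE's API. Each iteration computes the posterior probability of every label (E-step) and then re-estimates the weights, means, and variances from those probabilities (M-step).

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(xs, weights, means, variances):
    """Log-likelihood of the observed (incomplete) data under the mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in xs
    )


def em_step(xs, weights, means, variances):
    """One EM iteration for a Gaussian mixture (assumed toy model)."""
    k = len(weights)
    # E-step: posterior probability of each hidden label given x.
    resp = []
    for x in xs:
        joint = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: weighted maximum-likelihood estimates of the parameters.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, xs)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, xs)) / nj
        new_w.append(nj / len(xs))
        new_m.append(mean)
        new_v.append(max(var, 1e-6))  # guard against a collapsed component
    return new_w, new_m, new_v


def fit(xs, iterations=50):
    """Run EM from a simple, assumed initialization; return the parameters
    and the log-likelihood after each iteration."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    weights, means, variances = [0.5, 0.5], [min(xs), max(xs)], [var, var]
    history = []
    for _ in range(iterations):
        weights, means, variances = em_step(xs, weights, means, variances)
        history.append(log_likelihood(xs, weights, means, variances))
    return weights, means, variances, history


if __name__ == "__main__":
    random.seed(0)
    data = ([random.gauss(0.0, 1.0) for _ in range(200)]
            + [random.gauss(5.0, 1.0) for _ in range(200)])
    w, m, v, hist = fit(data)
    print("weights:", w)
    print("means:", m)
    print("variances:", v)
```

The log-likelihood recorded in `history` should never decrease from one iteration to the next, which is the property that makes EM attractive. It only guarantees convergence to a local maximum, so the result depends on the starting point.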