Marker Lecture - 2017

Jianqing Fan, Frederick L. Moore '18 Professor of Finance, Professor of Statistics, and Professor of Operations Research and Financial Engineering at Princeton University, will present the 2017 Russell Marker Lectures in Statistical Sciences on October 5 and 6 at Penn State University. The free public lectures are sponsored by the Penn State Eberly College of Science.

The series includes a lecture intended for a general audience, titled "Challenges on Analysis of Big Data," which will be held at 4:30 p.m. on Thursday, October 5, in 110 Business Building on the Penn State University Park campus. Fan will also give a specialized lecture titled "Distributed Estimation of Principal Eigenspaces" at 10:00 a.m. on Friday, October 6, in 201 Thomas Building on the Penn State University Park campus.

Challenges on Analysis of Big Data

October 5th, 2017 - 4:30pm - 110 Business Building

Big Data arise from almost all aspects of human endeavor, from the frontiers of scientific research to societal developments. They hold great promise for the discovery of heterogeneity and the search for personalized treatments. They also allow us to find weak patterns in the presence of large individual variations. Salient features of Big Data include heterogeneity, noise accumulation, spurious correlations, incidental endogeneity, measurement errors, computational cost, data storage, retrieval, and communications. These features have a huge impact on systems and analysis and should be seriously considered in the development of statistical procedures. We will address several of these issues in this talk from the perspectives of distributed inference and robust analysis, and illustrate the importance of robustness using financial and economic data.

To view the lecture, see: Challenges on Analysis of Big Data

Distributed Estimation of Principal Eigenspaces

October 6th, 2017 - 10:00am - 201 Thomas Building

Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts the latent principal factors that account for the most variation in the data. When data are stored across multiple machines, however, communication cost can prohibit the computation of PCA in a central location, and distributed algorithms for PCA are thus needed. This paper proposes and studies a distributed PCA algorithm: each node machine computes the top eigenvectors and transmits them to the central server; the central server then aggregates the information from all the node machines and conducts a PCA based on the aggregated information. We investigate the bias and variance of the resulting distributed estimator of the top $K$ eigenvectors. In particular, we show that for distributions with symmetric innovation, the distributed PCA is "unbiased". We derive the rate of convergence for distributed PCA estimators, which depends explicitly on the effective rank of the covariance, the eigengap, and the number of machines. We show that when the number of machines is not unreasonably large, the distributed PCA performs as well as the whole-sample PCA, even without full access to the whole data. The theoretical results are verified by an extensive simulation study. We also extend our analysis to the heterogeneous case where the population covariance matrices differ across local machines but share similar top eigenstructures.

This talk is based on joint work with Dong Wang, Kaizheng Wang, and Ziwei Zhu at Princeton University.
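
The node-then-server procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract says only that the central server "aggregates the information" from the nodes; the sketch assumes the aggregate is the average of the local projection matrices $V V^T$, where $V$ holds a node's top $K$ eigenvectors. The function names `local_top_eigvecs` and `distributed_pca` are hypothetical, and each node's data are centered locally as a further assumption.

```python
import numpy as np


def local_top_eigvecs(X, K):
    """Top-K eigenvectors of the sample covariance of X (rows are samples).

    Runs on a single node machine. Centering the data locally is an
    assumption of this sketch.
    """
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / X.shape[0]
    # np.linalg.eigh returns eigenvalues in ascending order, so reverse
    # the columns to put the leading eigenvectors first.
    _, vecs = np.linalg.eigh(cov)
    return vecs[:, ::-1][:, :K]


def distributed_pca(local_datasets, K):
    """Estimate the top-K eigenspace from data split across machines.

    Assumed aggregation rule: average the local projection matrices
    V V^T, then take the top-K eigenvectors of that average.
    """
    d = local_datasets[0].shape[1]
    avg_proj = np.zeros((d, d))
    for X in local_datasets:
        V = local_top_eigvecs(X, K)  # computed on the node
        avg_proj += V @ V.T          # only the d x K matrix V needs to be sent
    avg_proj /= len(local_datasets)
    _, vecs = np.linalg.eigh(avg_proj)
    return vecs[:, ::-1][:, :K]
```

Averaging projection matrices rather than the eigenvectors themselves sidesteps the sign and rotation ambiguity of eigenvectors: two nodes may return different bases for nearly the same subspace, but their projection matrices will still be close.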

Jianqing Fan

Jianqing Fan is the Frederick L. Moore '18 Professor of Finance, Professor of Statistics, and Professor of Operations Research and Financial Engineering at Princeton University. Fan has directed the Committee of Statistical Studies at Princeton University since 2006. He is a co-editor of the Journal of Econometrics, and he was a co-editor of the Annals of Statistics (2004-2006) and an editor of Probability Theory and Related Fields (2003-2005) and the Econometrics Journal (2007-2012). He is a past president of the Institute of Mathematical Statistics (2006-2009) and of the International Chinese Statistical Association (2008-2010).

Fan has coauthored three highly regarded books: on local polynomial modeling (1996), Nonlinear Time Series: Nonparametric and Parametric Methods (2003), and The Elements of Financial Econometrics (2015). He has authored or coauthored over 200 articles on statistics, financial econometrics, computational biology, and statistical machine learning. He has been consistently ranked as a top 10 highly cited mathematical scientist since such rankings began. His honors include the COPSS Presidents' Award (2000), an invited lecture at the International Congress of Mathematicians (2006), the Humboldt Research Award for Lifetime Achievement (2006), the Morningside Gold Medal of Applied Mathematics (2007), a Guggenheim Fellowship (2009), the Pao-Lu Hsu Prize (2013), and the Guy Medal in Silver (2014). He was elected an Academician of Academia Sinica (2012) and is a fellow of the American Association for the Advancement of Science, the Institute of Mathematical Statistics, and the American Statistical Association.