Conference on High-Dimensional Statistics

High-Dimensional Statistics has grown out of modern research activities in diverse fields such as science, technology, and business, aided by powerful computing. It encompasses several emerging fields in statistics such as high-dimensional inference, dimension reduction, data mining, machine learning, and bioinformatics.

Zoran Obradovic

Title of Talk: Learning from High Dimensional Partially Observed Temporal Data

Abstract: We will first show how to efficiently approximate a Markov blanket consisting of multiple dependent variables, so as to find a minimum subset of the most informative variables for predictive modeling in high-dimensional data. Our method, based on the Hilbert-Schmidt criterion in a kernel-induced space, allows removal of both irrelevant and redundant variables in high-dimensional classification and regression problems. We will then describe how to avoid the data imputation step when learning from partial observations. For this purpose, we formulate a convex optimization problem whose objective is to maximize each instance’s uncertainty margin in its own relevant subspace. Our method was shown to outperform the alternatives when a large fraction of values is missing in high-dimensional data. Finally, we will present an extension of our margin-based feature selection method to high-dimensional temporal data, in which a fixed-point gradient descent method is used to optimize the formulated objective function and learn the optimal feature weights. Experimental results on temporal microarray data provide evidence that the proposed method can identify more informative features than alternatives that flatten the temporal data.

The presented results were obtained in collaboration with Q. Lou while he was a Ph.D. student in my lab.

Brief Bio:

Zoran Obradovic is a professor of Computer and Information Sciences and the director of the Center for Data Analytics and Biomedical Informatics at Temple University in Philadelphia. His data analytics work has been published in more than 260 articles and cited more than 10,000 times (H-index 41 and I10-index 86). Obradovic is the executive editor of the journal Statistical Analysis and Data Mining, which is the official publication of the American Statistical Association (ASA), and he is currently an editorial board member at eleven journals. He is the general co-chair of the 2013 and 2014 SIAM International Conference on Data Mining and has been the program and/or track chair at many data mining and biomedical informatics conferences.