Awards

Winner of Subchallenge in Prostate Cancer DREAM Challenge

The goal of the Prostate Cancer DREAM Challenge is to improve the prediction of survival and of the toxicity of docetaxel treatment in patients with metastatic castrate-resistant prostate cancer (mCRPC).
The primary benefit of this challenge will be to establish new quantitative benchmarks for prognostic modeling in mCRPC, with a potential impact on clinical decision making and, ultimately, on understanding the mechanism of disease progression.

Our implementation, which uses a large heterogeneous ensemble of different models, won subchallenge 1b, where the task was to predict the exact time to event (days until death) for all patients in the evaluation set. A detailed description of our solution is available at the challenge's website.

Current Projects

The widespread use of electronic health records (EHR) has led to vast amounts of medical data being collected. The data is characterized by large sample size, heterogeneity of variables, unstructured information, missing information, and time-dependent variables, to name a few properties. These characteristics render commonly used statistical tools inadequate for analysis and require the development of sophisticated algorithms to overcome these challenges. For instance, the set of variables recorded for each patient can naturally be decomposed into groups, known as views. Most machine learning algorithms ignore this multi-view relationship. Instead, they can either be trained on each view separately or on a concatenation of all views that forms a single view. Exploiting these relationships has recently attracted the attention of the research community, which has proposed co-training, multiple kernel learning, and subspace learning to address this problem. Although existing methods show promising results in their respective tasks, it is difficult to apply them to real-world clinical data, where highly heterogeneous features (continuous, categorical, ordinal) and missing values are common, because not all tests can be performed on all patients. As a result, multi-view learning algorithms have rarely been applied to medical problems.
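
To make the notion of views concrete, the following is a minimal sketch of the two baseline strategies mentioned above: training one model per view, and training a single model on the concatenation of all views. It is an illustration under stated assumptions, not one of the multi-view methods named above and not our actual approach. The view names (lab_tests, vital_signs), the synthetic data, the mean imputation of missing values, and the least-squares model are all hypothetical choices made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Two hypothetical views of the same patients (names are illustrative only).
views = {
    "lab_tests": rng.normal(size=(n, 3)),
    "vital_signs": rng.normal(size=(n, 2)),
}
# Synthetic outcome that depends on both views.
y = (views["lab_tests"][:, 0]
     + 0.5 * views["vital_signs"][:, 1]
     + rng.normal(scale=0.1, size=n))

# Simulate tests that were not performed: about 20% of lab values are missing.
missing = rng.random(views["lab_tests"].shape) < 0.2
views["lab_tests"][missing] = np.nan


def impute_mean(X):
    """Replace NaNs with the column mean of the observed entries."""
    col_mean = np.nanmean(X, axis=0)
    return np.where(np.isnan(X), col_mean, X)


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns a predict function."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ coef


imputed = {name: impute_mean(X) for name, X in views.items()}

# Strategy 1: one model per view, predictions averaged afterwards.
per_view_models = {name: fit_linear(X, y) for name, X in imputed.items()}
pred_separate = np.mean(
    [per_view_models[name](X) for name, X in imputed.items()], axis=0
)

# Strategy 2: concatenate all views into a single feature matrix.
X_concat = np.hstack(list(imputed.values()))
pred_concat = fit_linear(X_concat, y)(X_concat)

mse_separate = np.mean((y - pred_separate) ** 2)
mse_concat = np.mean((y - pred_concat) ** 2)
```

Neither strategy models how the views relate to each other: the first never lets information flow between views, and the second discards the grouping entirely. Mean imputation likewise ignores why a value is missing, which is one reason such simple baselines are a poor fit for real clinical data.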