The blog from the Department of Computer Science at the University of Surrey

Dr Norman Poh gave a talk at the Workshop on Big Data: Modelling, Estimation and Selection

This two-day workshop (https://indico.math.cnrs.fr/event/830/timetable/#20160609), organised by CNRS, took place on 9-10 June 2016 in Lille, France, and gathered researchers from industry and academia working in the area of big data. While the talks on the first day were tutorials aimed at a general audience, the talks on the second day focused on technical and mathematical details, such as alternative methods for improving gradient-descent-type optimization. Dr Norman Poh’s talk was scheduled at the beginning of the second day in order to link the high-level tutorials of the first day with the technical talks that followed.

Dr Poh’s talk was entitled ‘What could we learn from millions of patient records? A machine-learning perspective’. The talk provided the healthcare context, explaining why healthcare records are a big data problem, and motivated the need to develop novel machine-learning algorithms better suited to modelling temporal dynamics, potentially over the life course of a patient, in a large concept space spanned by hundreds of thousands of clinical concepts. In addition, the population denominator is of the order of millions of patients, which further qualifies the modelling of healthcare records as a ‘big data’ problem.

Talk abstract

Increasing healthcare cost coupled with an ageing population in both developing and developed worlds means that it is important to understand disease demographic profiles in order to better optimize resources for quality health and care. By using Chronic Kidney Disease (CKD) as a case study, I will present challenges that are related to understanding, modelling and predicting the progression of CKD, and how machine learning techniques can be used to solve them. Examples include calibration of estimated Glomerular Filtration Rate (eGFR), modelling of eGFR, automatic selection of clinically relevant variables, and non-linear dimensionality reduction for data discovery.