Hey! Been doing a lot of lurking and not much posting recently so thought I'd pitch in with a question. Big Data and machine learning seem to be pretty sexy nowadays, and despite efforts to the contrary my research seems to be heading that way too. So I need to actually learn this stuff. The theory seems pretty comprehensible, but I can't find any decent courses online on actually doing any big data work. If you know of anything - especially using R - let me know please!
Posted by TheMoMan (Member # 1659) on January 16, 2016, 10:21:

Stibbons, I have some questions about big data and tracking outbreaks. With the advent of faster data, could smaller pockets of infection be tracked with better precision?
Posted by Stibbons (Member # 2515) on January 27, 2016, 12:11:

My work is all in the intensive care unit - we have more data coming off our patients than we know what to do with, but it's only available to us because we're in an ICU, and need it to guide clinical management. Yes, there are refinements we can make to track the spread of e.g. drug-resistant microbes in this environment more quickly, but the core problem of getting the data isn't there.

Contrast this with trying to track an outbreak in the community. How do we get information on wellness of patients? It relies firstly on them presenting to a healthcare provider, which they may not do until it's too late for their info to be useful. And even if they do, how do we get that data from the provider to those doing the data analysis? Given that the public's fear of Big Data extends to their health records, it's notoriously difficult to access information which may be useful (see the Care.Data issues we had in the UK). This is one of the reasons some illnesses carry a legal responsibility for the diagnosing clinician to report them, because otherwise we have no right to that patient's data and can't track them.
Posted by TheMoMan (Member # 1659) on January 27, 2016, 13:48:

Stibbons, I remember looking at Health Pro. magazines in the late sixties showing clusters of cancers located around heavy manufacturing areas. I knew of these areas and still went to work for those companies because of the wages that they paid. I got my money and pension and left, only to be stricken with prostate cancer. Causation or effect dubious.
Posted by dragonman97 (Member # 780) on January 27, 2016, 20:32:

Stibbons,

Do you have an academic affiliation, or is this strictly a clinical facility?

The rules on patient privacy can make analysis beyond individual care tricky unless you have really good protocols for anonymization. Even so, it's probably best to have an expert opinion on how to pursue that. I certainly think it's cool to look for these patterns and make more meaningful use of the data, but I'm acutely aware of the privacy issues. If a hospital's data gets out...it's bad. As a friend has said - if your credit card info. is leaked, you can change it; if your medical history is leaked, you can't change it. (And screwy as it may seem, it could be valuable to some particularly nefarious characters.)
Posted by Stibbons (Member # 2515) on January 29, 2016, 03:41:

Dman, bit of both. I'm using freely available anonymised US data currently to build models, then applying them to real-time data coming from my own patients in critical care. I'm luckily not struggling to acquire the data, I'm just lost in a wasteland of partial understanding of methodology!
Posted by quantumfluff (Member # 450) on January 31, 2016, 15:57:

I would also start by looking at any available history about Google Flu Trends and how they ended up being able to see outbreaks before the CDC.
Posted by Rednivek (Member # 1148) on March 16, 2016, 00:11:

For a really good intro to Machine Learning, check out the free course from Caltech (156). Sometimes it's interactive, but you can also check it out yourself on YouTube: