Improving public understanding of probability and risk with special emphasis on its application to the law. Why Bayes theorem and Bayesian networks are needed

Tuesday, 24 March 2015

The problem with big data and machine learning

The advent of ‘big data’, coupled with fancy statistical machine learning techniques, is increasingly seducing people into believing that new insights and better predictions can be achieved in a wide range of important applications without relying on the input of domain experts. The applications range from learning how to retain customers through to learning what makes people susceptible to particular diseases. I have written before about the dangers of this kind of 'learning' from data alone (no matter how 'big' the data is).

Contrary to the narrative being sold by the big data community, if you want accurate predictions and improved decision-making then, invariably, you need to incorporate human knowledge and judgment. This enables you to build rational causal models based on 'smart' data. The main objections to using human knowledge - that it is subjective and difficult to acquire - are, of course, key drivers of the big data movement. But this movement underestimates the typically very high costs of collecting, managing and analysing big data. So the sub-optimal outputs you get from pure machine learning do not even come cheap.
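To make the idea of a causal model built from human knowledge concrete, here is a minimal sketch of the smallest possible Bayesian network: one cause (a disease) and one effect (a positive test result). All three probabilities are hypothetical values of the kind a domain expert might supply; they are assumptions for illustration and do not come from any dataset or from the examples in this post. The function name `posterior` is likewise just an illustrative choice.

```python
from fractions import Fraction


def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H | E) by Bayes' theorem for a two-node model H -> E."""
    numerator = p_e_given_h * prior
    # Total probability of seeing the evidence, whether or not H is true.
    marginal = numerator + p_e_given_not_h * (1 - prior)
    return numerator / marginal


# Hypothetical expert judgments (assumptions, not data from the post).
prior_disease = Fraction(1, 1000)        # P(disease)
p_pos_given_disease = Fraction(99, 100)  # P(positive test | disease)
p_pos_given_healthy = Fraction(5, 100)   # P(positive test | no disease)

p_disease_given_pos = posterior(
    prior_disease, p_pos_given_disease, p_pos_given_healthy
)
print(f"P(disease | positive test) = {float(p_disease_given_pos):.4f}")
```

Under these assumed numbers the probability of disease after a positive test is still under 2%, because the expert's prior says the disease is rare. The point of the sketch is only that the prior and the conditional probabilities are explicit, inspectable judgments rather than something inferred from data alone.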

To clarify the dangers of relying on big data and machine learning, and to show how smart data and causal modelling (using Bayesian networks) give you better results, I have collected together the following short stories and examples:

Martin Neil

About Me

Norman's experience in risk assessment covers application domains such as legal reasoning (he has been an expert witness in major criminal and civil cases), software project risk, medical decision-making, vehicle reliability, football prediction, transport systems, and financial services. Norman has published over 130 articles and 5 books on these subjects.