Monday, 11 September 2017

One of the greatest impediments to the use of probabilistic reasoning in legal arguments is the difficulty of agreeing on an appropriate prior probability that the defendant is guilty. The 'innocent until proven guilty' assumption technically means a prior probability of 0 - a figure that (by Bayesian reasoning) can never be overturned no matter how much evidence follows. Some have suggested the logical equivalent of 1/N, where N is the number of people in the world. But this probability is clearly too low, as N includes many people who could not physically have committed the crime. On the other hand, the often-suggested prior of 0.5 is too high, as it stacks the odds too heavily against the defendant.
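
To see why a prior of exactly 0 can never be overturned, here is the one-line Bayesian check of the claim above:

```latex
P(H \mid E) \;=\; \frac{P(E \mid H)\,P(H)}{P(E)} \;=\; \frac{P(E \mid H) \times 0}{P(E)} \;=\; 0
```

No evidence E, however strong, can move the posterior away from 0.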

New work - presented at the 2017 International Conference on Artificial Intelligence and the Law (ICAIL 2017) - shows that, in a large class of cases, it is possible to arrive at a realistic prior that is also as consistent as possible with the legal notion of 'innocent until proven guilty'. The approach is based on first identifying the smallest time window and area around the actual crime scene within which the defendant was definitely present, and then estimating the number of people - other than the suspect - who were also within that time/area. If there were n people in total, then before any other evidence is considered each person, including the suspect, has an equal prior probability 1/n of having carried out the crime.

The
method applies to cases where we assume a crime has definitely taken
place and that it was committed by one person against one other person
(e.g. murder, assault, robbery). The work considers both the practical
and legal implications of the approach and demonstrates how the prior
probability is naturally incorporated into a generic Bayesian network
model that allows us to integrate other evidence about the case.
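
As a rough illustration of how such a prior combines with later evidence (this is not the Bayesian network model from the paper, and the numbers are invented), the sketch below updates an opportunity prior of 1/n with a single piece of evidence of given likelihood ratio:

```python
def posterior_from_opportunity_prior(n, likelihood_ratio):
    """Update the opportunity prior 1/n with one piece of evidence.

    n: number of people (including the suspect) who were within the
       relevant time/area, so the prior P(guilty) = 1/n.
    likelihood_ratio: P(E | guilty) / P(E | not guilty) for evidence E.
    """
    prior = 1.0 / n
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio  # Bayes in odds form
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical example: 50 people had the opportunity, and forensic
# evidence with likelihood ratio 1000 points to the suspect.
print(posterior_from_opportunity_prior(50, 1000))  # ~0.953
```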

Full details:

Fenton, N. E., Lagnado, D. A., Dahlman, C., & Neil, M. (2017). "The
Opportunity Prior: A Simple and Practical Solution to the Prior
Probability Problem for Legal Cases". In International Conference on
Artificial Intelligence and the Law (ICAIL 2017). Published by ACM.
Pre-publication draft.

Thursday, 7 September 2017

From July to December 2016 the Isaac Newton Institute Programme on Probability and Statistics in Forensic Science in Cambridge hosted many of the world's leading figures from law, statistics and forensics: a mixture of academics (including mathematicians and legal scholars), forensic practitioners, and practising lawyers (including judges and eminent QCs). Videos of many of the seminars and presentations from the Programme can be seen here.

A key output of the Programme has now been published. It is a very simple set of twelve guiding principles and recommendations for dealing with quantitative evidence in criminal law, for the use of statisticians, forensic scientists and legal professionals. The layout consists of one principle per page.

Monday, 14 August 2017

This blog has reported many times previously (see links below) about problems with using the likelihood ratio. Recall that the likelihood ratio is commonly used as a measure of the probative value of some evidence E for a hypothesis H; it is defined as the probability of E given H divided by the probability of E given not H.
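
In symbols (restating the definition just given, together with the odds form of Bayes' theorem, which is what gives the likelihood ratio its probative meaning):

```latex
\mathrm{LR} = \frac{P(E \mid H)}{P(E \mid \neg H)},
\qquad
\frac{P(H \mid E)}{P(\neg H \mid E)}
= \mathrm{LR} \times \frac{P(H)}{P(\neg H)}
\quad \text{(posterior odds = LR × prior odds)}
```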

There is especially great confusion in its use where we have data for the probability of H given E rather than for the probability of E given H.
Consider the somewhat confusing argument, in relation to the offence of 'child grooming', that is taken directly from the book McLoughlin, P., Easy Meat: Inside Britain's Grooming Gang Scandal (2016).

Given the sensitive nature of the grooming gangs story in the UK and the increasing number of convictions, it is important to get the maths right. The McLoughlin book is the most thoroughly researched work on the subject. What the author of the book is attempting to determine is the likelihood ratio of the evidence E with respect to the hypothesis H, where:

H: “Offence is committed by a Muslim” (so not H means “Offence is committed by a non-Muslim”)

E: “Offence is child grooming”

In this case, the population data cited by McLoughlin provides our priors P(H)=0.05 and, hence, P(not H)=0.95. But we also have the data on child grooming convictions that gives us P(H | E)=0.9 and, hence, P(not H | E)=0.1.

What we do NOT have here is direct data on either P(E|H) or P(E|not H). However, we can still use Bayes' theorem to calculate the likelihood ratio, since:

```latex
\frac{P(E \mid H)}{P(E \mid \neg H)}
= \frac{P(H \mid E)\,P(E)\,/\,P(H)}{P(\neg H \mid E)\,P(E)\,/\,P(\neg H)}
= \frac{P(H \mid E)}{P(\neg H \mid E)} \times \frac{P(\neg H)}{P(H)}
```

So, in the example we get:

```latex
\frac{P(E \mid H)}{P(E \mid \neg H)}
= \frac{0.9}{0.1} \times \frac{0.95}{0.05}
= 9 \times 19 = 171
```

Hence, while the method described in the book is flawed, the conclusion arrived at is (almost) correct.
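
A two-line check of the arithmetic in plain Python, simply reproducing the numbers above:

```python
# Likelihood ratio recovered from the posterior and prior (odds form of Bayes):
p_h, p_h_given_e = 0.05, 0.9
lr = (p_h_given_e / (1 - p_h_given_e)) * ((1 - p_h) / p_h)
print(lr)  # ~171 (up to floating-point rounding)
```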

Friday, 11 August 2017

Constructing an effective and complete Bayesian network (BN) for individual cases that involve multiple related pieces of evidence and hypotheses requires a major investment of effort. Hence, generic BNs have been developed for common situations that only require adapting the underlying probabilities. These so-called 'idioms' make it practically possible to build and use BNs in casework without spending unacceptable amounts of time constructing the network. However, in some situations both the probability tables and the structure of the network depend on case-specific details.

Examples of such situations are cases involving multiple linked crimes. In (deZoete2015) a BN structure was produced for evaluating evidence in cases where a person is suspected of being the offender in multiple possibly linked crimes. In (deZoete2017) this work was expanded to cover situations with multiple offenders for possibly linked crimes. Although the papers present a methodology for constructing such BNs, the workload associated with constructing them, together with the possibility of making mistakes in the conditional probability tables, still presents unnecessary difficulties for potential users.

As part of the BAYES KNOWLEDGE project, we have developed online accessible GUIs that allow the user to select the parameters that reflect their crime linkage situation (both for single- and double-offender crime linkage cases). The associated BN is then automatically generated according to the structures described in (deZoete2015) and (deZoete2017). It is presented visually in the GUI and is available for download as a .net file, which can be opened in AgenaRisk or another BN software package. These applications serve both as a tool for those interested in or working on crime linkage problems and as a proof of principle of the added value of such GUIs in making BNs accessible by removing the effort of constructing every network from scratch.
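
To give a flavour of what 'automatically generated according to the parameters' means, here is a minimal sketch only - the real GUIs are written in R, and the node names and structure below are invented for illustration rather than taken from the deZoete papers. A generator might assemble the nodes and edges of a single-offender linkage network from the number of crimes:

```python
def crime_linkage_structure(n_crimes):
    """Sketch: build the node and edge lists of a crime-linkage BN.

    For each crime i there is a hypothesis node Hi ('suspect committed
    crime i') with an evidence node Ei as its child; a link node Lij for
    each pair of crimes captures whether they share the same offender.
    Names and structure are illustrative only.
    """
    crimes = range(1, n_crimes + 1)
    nodes = [f"H{i}" for i in crimes] + [f"E{i}" for i in crimes]
    edges = [(f"H{i}", f"E{i}") for i in crimes]
    for i in crimes:
        for j in crimes:
            if i < j:
                link = f"L{i}{j}"
                nodes.append(link)
                edges += [(link, f"H{i}"), (link, f"H{j}")]
    return nodes, edges

nodes, edges = crime_linkage_structure(3)
print(len(nodes), "nodes;", edges)
```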

The GUIs are available from the 'DEMO' tab on the BAYES KNOWLEDGE website and are based on R code, a statistical programming language. This automated workflow can reduce the workload for, in this case, forensic statisticians and increase the mutual understanding between researchers and legal professionals.

Thursday, 29 June 2017

In 2015 the British government announced major tax reforms for individual landlords that are being introduced gradually from April 2017 and will be in full effect in tax year 2020/21. The new reforms and regulations have received much media attention, as there has been widespread belief that they are sufficiently skewed against landlords that they could signal the end of the Buy-To-Let (BTL) investment era in the UK.
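
The key change is the restriction of mortgage interest relief. As a simplified sketch (ignoring the phase-in between 2017 and 2020/21, and using invented figures): under the old rules finance costs were fully deductible from rental income before tax, whereas under the new rules a landlord receives only a basic-rate (20%) tax credit on those costs.

```python
def tax_old(rent, interest, rate=0.40):
    # Old rules: mortgage interest fully deductible from rental profit.
    return max(rent - interest, 0) * rate

def tax_new(rent, interest, rate=0.40, credit_rate=0.20):
    # New rules (fully phased in by 2020/21): interest is no longer
    # deductible; instead a basic-rate (20%) credit is given on it.
    return max(rent * rate - interest * credit_rate, 0)

# Invented example: £12,000 annual rent, £8,000 mortgage interest,
# for a landlord paying higher-rate (40%) tax.
print(tax_old(12_000, 8_000))   # 1600.0
print(tax_new(12_000, 8_000))   # 3200.0 - double the old tax bill
```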

Research by Anthony Constantinou and Norman Fenton of Queen Mary University of London has now been published that provides the first comprehensive evaluation of the impact of the reforms on the London BTL property market. The results use a novel model (based on new work in an AI method called Bayesian networks) that captures multiple uncertainties and allows investors to assess the impact of various factors of interest on their BTL investment, such as changes in interest rates, capital growth and rental growth. Additionally, the model allows for portfolio risk management through intervention between time steps, such as the effects of different re-mortgaging scenarios.

The
results show that, over a 10-year period, the overall
return-on-investment (ROI) will be reduced under the new tax measures,
but that the ROI remains good assuming a common BTL London profile.
However, there are major differences depending on the investor strategy.
For example, for risk-averse investors who choose not to expand their
portfolio, the reforms are expected to have only a marginal negative
impact, with the overall ROI reducing from 301% under the old
regulations to 290% under the new (-3.7%), and this loss comes
exclusively from a decrease in net profits from rental income (-32.2%).
However, the impact on risk-seeking investors who aim to expand their property portfolio through leveraging is much more significant, since the new tax reforms are projected to decrease ROI from 941% to 590% (-37.3%) over the same 10-year period.

The impact on net profits also poses substantial risks of loss-making returns (excluding capital gains), especially in the case of rising interest rates. While this makes it less desirable, or even non-viable, for some to continue being a landlord, based on the current status of all factors taken into consideration for simulation, investment prospects are still likely to remain good within a reasonable range of interest rate and capital growth rate variations. Further, the results also indicate that the recent trend of property prices in London increasing faster than rents will not continue for much longer: either capital growth rates will have to decrease, rental growth rates will have to increase, or we shall observe a combination of the two.

Monday, 6 March 2017

When I was presenting the BBC documentary Climate Change by Numbers and had to explain the idea of a statistical 'attribution study', I used the analogy of determining which factors most affected the performance of Premiership football teams year on year. Because it had to be done in a hurry, my colleague Dr Anthony Constantinou and I did a very crude analysis which focused on a very small number of factors and showed, unsurprisingly, that turnover (i.e. mainly spend on transfers and wages) had the most impact of these.

We weren't happy with the quality of the study and decided to undertake a much more comprehensive analysis as part of the BAYES-KNOWLEDGE project. This project is all about improved decision-making and risk assessment using a probabilistic technique called Bayesian networks. In particular, the main objective of the project is to produce useful, accurate predictions and assessments in situations where there is not a lot of data available. In such situations the currently fashionable 'big data' machine learning methods do not work; instead we use 'smart data': a method that combines the limited data available with expert causal knowledge and real-world 'facts'. The idea of predicting Premiership teams' long-term performance and identifying the key factors explaining changes was a perfect opportunity to both develop and validate the BAYES-KNOWLEDGE method, especially as we had previously done extensive work in predicting individual Premiership match results (see links at bottom).

The results of the study have now been published in one of the premier international AI journals, Knowledge-Based Systems.

The Bayesian network model in the paper enables us to predict, before a season starts, the total league points a team is expected to accumulate throughout the season (each team plays 38 games in a season, with three points per win and one per draw). The model results compare very favourably against a number of other relevant and different types of models, including some which use far more data. As hoped, the results also provide a novel and comprehensive attribution study of the factors most affecting performance (measured in terms of impact on actual points gained or lost per season). For example, although (unsurprisingly) the largest improvements in performance result from massive increases in spending on new players (an 8.49-point gain), an even greater change - a decrease of up to 16.52 points - results from involvement in the European competitions (especially the Europa League) for teams that have little previous experience of such competitions. Also, something that was very surprising, and that possibly confounds bookies - and gives punters good potential for exploitation - is that promoted teams generate (on average) a staggering increase in performance of 8.34 points relative to the relegated teams they are replacing. The results in the study also partly address and explain the widely accepted 'favourite-longshot bias' observed in bookies' odds.
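
For readers unfamiliar with the points arithmetic, here is a minimal sketch (not the model from the paper; the per-match probabilities are invented) of how per-match win/draw probabilities translate into expected season points:

```python
def expected_season_points(p_win, p_draw, games=38):
    """Expected league points over a season: 3 per win, 1 per draw."""
    return games * (3 * p_win + 1 * p_draw)

# Invented example: a mid-table profile with 40% wins and 30% draws.
print(expected_season_points(0.40, 0.30))  # 57.0 points
```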

Thursday, 9 February 2017

Causal Bayesian networks are at the heart of a major new collaborative research project led by Monash University in Australia and funded by the United States' Intelligence Advanced Research Projects Activity (IARPA). The objective is to help intelligence analysts assess the value of their information. IARPA was set up following the failure of the US intelligence agencies to properly assess the levels of threat posed by Al Qaeda in 2001 and Iraq in 2003.

The chief investigator at Monash, Kevin Korb, said in an interview in The Australian:

"..quantitative
rather than qualitative methods were crucial in judging the value of
intelligence.... more quantitative approaches could have helped contain
the ebola epidemic by making authorities appreciate the scale of the
problem months earlier. They could also build a better assessment of the
likelihood of events like gunfire between vessels in the South China
Sea, a substantial devaluation of the Venezuelan currency or a new
presidential aspirant in Egypt."

Norman Fenton and Martin Neil (both of Agena and Queen Mary University of London)
will be working on the project along with colleagues such as David
Lagnado and Ulrike Hahn at UCL. AgenaRisk will be used throughout the
project as the Bayesian network platform.
