A common ‘pragmatic’ approach is to keep doing what you normally do until you hit a snag, and (only) then to reconsider. Whereas Lamarckian evolution would lead to the ‘survival of the fittest’, with everyone adapting to the current niche, tending to yield a homogeneous population, Darwinian evolution has survival of the maximal variety of all those who can survive, with characteristics only dying out when they are not viable. This evolution of diversity makes for greater resilience, which is maybe why ‘pragmatic’ Darwinian evolution has evolved.

The products of evolution are generally also pragmatic, in that they have virtually pre-programmed behaviours which ‘unfold’ in the environment. Plants grow and procreate, while animals have a richer variety of behaviours, but still tend just to do what they do. But humans can ‘think for themselves’ and be ‘creative’, and so have the possibility of not being just pragmatic.

I was at a (very good) lecture by Alice Roberts last night on the evolution of technology. She noted that many creatures use tools, but humans seem to be unique in that at some critical population mass the manufacture and use of tools becomes sustained through teaching, copying and co-operation. It occurred to me that much of this could be pragmatic. After all, until recently development has been very slow, and so may well have been driven by specific practical problems rather than continual searching for improvements. Also, the more recent upswing of innovation seems to have been associated with an increased mixing of cultures and decreased intolerance for people who think for themselves.

In biological evolution mutations can lead to innovation, so evolution is not entirely pragmatic, but their impact is normally limited by the need to fit the current niche, so evolution typically appears to be pragmatic. The role of mutations is more to increase the diversity of behaviours within the niche, rather than innovation as such.

In social evolution there will probably always have been mavericks and misfits, but the social pressure has been towards conformity. I conjecture that such an environment has favoured a habit of pragmatism. These days, it seems to me, a better approach would be more open-minded, inclusive and exploratory, but possibly we do have a biologically-conditioned tendency to be overly pragmatic: to confuse conventions for facts and heuristics for laws of nature, and not to challenge widely-held beliefs.

The financial crash of 2008 was blamed by some on mathematics. This seems ridiculous. But the post Cold War world was largely one of growth with the threat of nuclear devastation much diminished, so it might be expected that pragmatism would be favoured. Thus powerful tools (mathematical or otherwise) could be taken up and exploited pragmatically, without enough consideration of the potential dangers. It seems to me that this problem is much broader than economics, but I wonder what the cure is, apart from better education and more enlightened public debate?

In heavy traffic, such as on motorways in rush-hour, there is often oscillation in speed and there can even be mysterious ‘emergent’ halts. The use of variable speed limits can result in everyone getting along a given stretch of road quicker.

Soros (worth reading) has written an article that suggests that this is all to do with the humanity and ‘thinking’ of the drivers, and that something similar is the case for economic and financial booms and busts. This might seem to indicate that ‘mathematical models’ were a part of our problems, not solutions. So I suggest the following thought experiment:

Suppose a huge number of identical driverless cars with deterministic control functions all try to go along the same road, seeking to optimise performance in terms of ‘progress’ and fuel economy. Will they necessarily succeed, or might there be some ‘tragedy of the commons’ that can only be resolved by some overall regulation? What are the critical factors? Is the nature of the ‘brains’ one of them?
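One way to make the thought experiment concrete is a toy simulation. The sketch below is only an illustration under stated assumptions: it uses a textbook ‘optimal velocity’ car-following rule on a circular road, which is my choice and not something taken from Soros or from any real driverless-car controller. The names simulate, optimal_speed, sensitivity and nudge are all hypothetical. Every car is identical and deterministic: each simply accelerates towards a preferred speed that depends only on the gap to the car in front.

```python
import math

def optimal_speed(headway):
    # Preferred speed for a given gap to the car ahead (assumed textbook form):
    # zero when the gap is zero, levelling off for large gaps.
    return math.tanh(headway - 2.0) + math.tanh(2.0)

def simulate(n_cars=30, road_length=60.0, sensitivity=1.0,
             nudge=0.1, steps=20000, dt=0.05):
    """Identical deterministic cars on a ring road; returns the final speeds."""
    spacing = road_length / n_cars
    pos = [i * spacing for i in range(n_cars)]
    pos[0] += nudge  # one small disturbance to otherwise uniform traffic
    speed = [optimal_speed(spacing)] * n_cars
    for _ in range(steps):
        # Gap from each car to the one in front, wrapping around the ring.
        gaps = [(pos[(i + 1) % n_cars] - pos[i]) % road_length
                for i in range(n_cars)]
        # Each car relaxes towards its preferred speed (simple Euler step).
        speed = [v + sensitivity * (optimal_speed(g) - v) * dt
                 for g, v in zip(gaps, speed)]
        pos = [(x + v * dt) % road_length for x, v in zip(pos, speed)]
    return speed

if __name__ == "__main__":
    for s in (1.0, 3.0):
        speeds = simulate(sensitivity=s)
        print(s, max(speeds) - min(speeds))
```

For this particular rule, the standard linear stability analysis says that uniform flow is stable only when the sensitivity exceeds twice the slope of the preferred-speed curve at the equilibrium spacing. Comparing the spread of final speeds for a sluggish and a responsive setting is one way to explore whether ‘mysterious’ halts need human drivers at all, or only a critical density and a slow enough response.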

Are these problems the preserve of psychologists, or does mathematics have anything useful to say?

This page considers the case that the Assad regime used gas against the rebels on 21st August 2013 from a theory of evidence perspective. For a broader account, see Wikipedia.

The JIC Assessment

The JIC concluded on 27th that it was:

highly likely that the Syrian regime was responsible.

In the covering letter (29th) the chair said:

Against that background, the JIC concluded that it is highly likely that the regime was responsible for the CW attacks on 21 August. The JIC had high confidence in all of its assessments except in relation to the regime’s precise motivation for carrying out an attack of this scale at this time – though intelligence may increase our confidence in the future.

A cynic or pedant might note the caveat:

The paper’s key judgements, based on the information and intelligence available to us as of 25 August, are attached.

Mathematically-based analysis

From a mathematical point of view, the JIC report is an ‘utterance’, and one needs to consider the context in which it was produced. Hopefully, best practice would include identifying the key steps in the conclusion and seeking out and hastening any possible contrary reports. Thus one might reasonably suppose that the letter on the 29th reflected all obviously relevant information available up to the end of the 28th, but perhaps not some other inputs, such as ‘big data’, that only yield intelligence after extensive processing and analysis.

But what is the chain of reasoning (29th)?

It is being claimed, including by the regime, that the attacks were either faked or undertaken by the Syrian Armed Opposition. We have tested this assertion using a wide range of intelligence and open sources, and invited HMG and outside experts to help us establish whether such a thing is possible. There is no credible intelligence or other evidence to substantiate the claims or the possession of CW by the opposition. The JIC has therefore concluded that there are no plausible alternative scenarios to regime responsibility.

…

The JIC had high confidence in all of its assessments except in relation to the regime’s precise motivation for carrying out an attack of this scale at this time – though intelligence may increase our confidence in the future.

The report of the 27th is more nuanced:

There is no credible evidence that any opposition group has used CW. A number continue to seek a CW capability, but none currently has the capability to conduct a CW attack on this scale.

Russia claims to have a ‘good degree of confidence’ that the attack was an ‘opposition provocation’ but has announced that they support an investigation into the incident. …

In contrast, concerning Iraqi WMD, we were told that “lack of evidence is not evidence of lack”. But mathematics is not so rigid: it depends on one’s intelligence sources and analysis. Presumably in 2003 we lacked the means to detect Iraqi CW, but now – having learnt the lesson – we would know almost as soon as any one of a number of disparate groups acquires CW. Many outside the intelligence community might not find this credible, leading to a lack of confidence in the report. Others would take the JIC’s word for it. But while the JIC may have evidence that supports their rating, it seems to me that they have not even alluded to a key part of it.

Often, of course, an argument may be technically flawed but still lead to a correct conclusion. To fix the argument one would want a much greater understanding of the situation. For example, the Russians seem to suggest that one opposition group would be prepared to gas another, presumably to draw the US and others into the war. Is the JIC saying that this is not plausible, or simply that no such group (yet) has the means? Without clarity, it is difficult for an outsider to assess the report and draw their own conclusion.

Finally, it is notable that regime responsibility for the attack of the 21st is rated ‘highly likely’, the same as their responsibility for previous attacks. Yet mathematically the rating should depend on what is called ‘the likelihood’, which one would normally expect to increase with time. Hence one would expect the rating to increase from possible (in the immediate aftermath) through likely to highly likely, as the kind of issues described above are dealt with. This unexpectedly high rating calls for an explanation, which would need to address the most relevant factors.
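To illustrate what I mean by a rating that grows as issues are dealt with, here is a minimal sketch of Bayesian updating in odds form. All the numbers are invented for illustration, the function names are hypothetical, and nothing here reflects how the JIC actually works. Each item of evidence multiplies the current odds by its likelihood ratio (how much more probable that evidence is if the regime was responsible than if it was not).

```python
from fractions import Fraction

def update_odds(prior_odds, likelihood_ratios):
    """Return the odds after each successive piece of evidence."""
    odds = prior_odds
    history = [odds]
    for ratio in likelihood_ratios:
        odds = odds * ratio
        history.append(odds)
    return history

def probability(odds):
    # Convert odds in favour into a probability.
    return odds / (1 + odds)

# Hypothetical: start undecided (odds of 1), then two items of evidence,
# each three times more probable under regime responsibility.
history = update_odds(Fraction(1), [3, 3])
ratings = [probability(o) for o in history]
print(ratings)  # 1/2, then 3/4, then 9/10
```

On these made-up figures the rating climbs step by step, and only reaches a high level once several alternatives have been weighed. A rating that starts at the top needs some other explanation.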

Anticipating the UN Inspectors

The UN weapons inspectors are expected to produce much relevant evidence. For example, it may be that even if an opposition group had CW an attack would necessarily lack some key signatures. But, from a mathematical point of view, one cannot claim that one explanation is ‘highly likely’ without considering all the alternatives and taking full account of how the evidence was obtained. It is quite true, as the PM argued, that there will always be gaps that require judgement to span. But we should strive to make the gap as slight as possible, and to be clear about what it is. While one would not want a JIC report to be phrased in terms of mathematics, it would seem that appropriate mathematics could be a valuable aid to critical thinking. Hopefully we shall soon have an assessment that genuinely rates ‘highly likely’ independently of any esoteric expertise, whether intelligence or mathematics.

Updates

30th August: US

The US assessment concludes that the attack was by Assad’s troops, using rockets to deliver a nerve agent, following their usual procedures. This ought to be confirmed or disconfirmed by the inspectors, with reasonable confidence. Further, the US claim ‘high confidence’ in their assessment, rather than very high confidence. Overall, the US assessment appears to be about what one would expect if Assad’s troops were responsible.

31st August: Blog

There is a good private-enterprise analysis of the open-source material. It makes a good case that the rockets’ payloads were not very dense, and probably a chemical gas. However, it points out that only the UN inspectors could determine if the payload was a prohibited substance, or some other substance such as is routinely used by respectable armies and police forces.

It makes no attribution of the rockets. The source material is clearly intended to show them being used by the Assad regime, but there is no discussion of whether or not any rebel groups could have made, captured or otherwise acquired them.

2nd September: France

The French have declassified a dossier. Again, it presents assertion and argumentation rather than evidence. The key points seem to be:

A ‘large’ amount of gas was used.

Rockets were probably used (presumably many).

No rebel group has the ability to fire rockets (unlike the Vietcong in Vietnam).

This falls short of a conclusive argument. Nothing seems to rule out the possibility of an anti-Assad outside agency loading up an ISO container (or a mule train) with CW (perhaps in rockets), and delivering them to an opposition group along with an adviser. (Not all the opposition groups are allies.)

4th September: Germany

Conjecture that the CW mix was stronger than intended, and hence lethal rather than temporarily disabling.

That a Hezbollah official said that Assad had ‘lost his nerve’ and ordered the attack.

It is not clear if the Hezbollah utterance was based on good grounds or was just speculation.

4th September: Experts

Some independent experts have given an analysis of the rockets that is similar in detail to that provided by Colin Powell to the UN in 2003, providing some support for the official dossiers. They assess that each warhead contained 50 litres (13 gallons) of agent. They also assess that the rebels could have constructed the rockets, but not produced the large quantity of agent.

No figure is given for the number of rockets, but I have seen a figure of 100, which seems the right order of magnitude. This would imply 5,000 litres or 1,300 gallons, if all held the agent. A large tanker truck has a capacity of about 7 times this, so it does not seem impossible that such an amount could have been smuggled in.
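The arithmetic behind those figures can be checked in a few lines. This is just a sanity check: the rocket count of 100 is the unconfirmed figure mentioned above, and I am assuming US gallons, which is what makes 50 litres come to about 13 gallons.

```python
LITRES_PER_US_GALLON = 3.785  # assumed: the experts' "13 gallons" are US gallons

rockets = 100              # unconfirmed order-of-magnitude figure
litres_per_warhead = 50    # the experts' assessment

total_litres = rockets * litres_per_warhead
total_gallons = total_litres / LITRES_PER_US_GALLON
tanker_litres = 7 * total_litres  # "about 7 times this"

print(total_litres, round(total_gallons), tanker_litres)
```

So the whole quantity, if every rocket held agent, would fill roughly a seventh of a large tanker.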

This report essentially puts a little more detail on the blog of 31st August, and is seen as being more authoritative.

5th September: G20

The UK has confirmed that Sarin was used, but seems not to have commented on whether it was of typical ‘military quality’, or more home-made.

Russia has given the UN a 100 page dossier of its own, and I have yet to see a credible debunking (early days, and I haven’t found it on-line).

6th September: Veteran Intelligence Professionals for Sanity

An alternative, unofficial narrative. Can this be shown to be incredible? Will it be countered?

9th September: German

German secret sources indicate that Assad had no involvement in the CW attack (although others in the regime might have).

9th September: FCO news conference

John Kerry, at a UK FCO news conference, gives a very convincing account of the evidence for CW use, but without indicating any evidence that the chemicals were delivered by rocket. He is asked about Assad’s involvement, but notes that all that is claimed is senior regime culpability.

UN Inspectors’ Report

21st September. The long-awaited report concludes that rockets were used to deliver Sarin. The report, at first read, seems professional and credible. It is similar in character to the evidence that Colin Powell presented to the UN in 2003, but without the questionable ‘judgments’. It provides some key details (type of rocket, trajectory) which – one hopes – could be tied to the Assad regime, especially given US claims to have monitored rocket launches. Otherwise, they appear to be of a type that the rebels could have used.

The report does not discuss the possibility, raised by the regime, that conventional rockets had accidentally hit a rebel chemical store, but the technical details do seem to rule it out. There is an interesting point here. Psychologically, the fact that the regime raised a possibility in their defence which has been shown to be false increases our scepticism about them. But mathematically, if they are innocent then we would not expect them to know what happened, and hence we would not expect their conjectures to be correct. Such a false conjecture could even be counted as evidence in their favour, particularly if we thought them competent enough to realise that such an invention would easily be falsified by the inspectors.
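The mathematical point can be shown with a single application of Bayes’ rule. The sketch below uses purely hypothetical probabilities, chosen only to show the direction of the effect, and the function name posterior is mine. The evidence E is ‘the regime offered an explanation that the inspectors then falsified’.

```python
from fractions import Fraction

def posterior(prior, p_e_if_true, p_e_if_false):
    """P(hypothesis | evidence) by Bayes' rule."""
    weight_true = prior * p_e_if_true
    weight_false = (1 - prior) * p_e_if_false
    return weight_true / (weight_true + weight_false)

# Hypothetical inputs:
prior_guilty = Fraction(9, 10)   # belief in regime responsibility beforehand
p_e_if_guilty = Fraction(1, 5)   # a competent guilty party avoids checkable stories
p_e_if_innocent = Fraction(3, 5) # an innocent party is guessing, so often wrong

after = posterior(prior_guilty, p_e_if_guilty, p_e_if_innocent)
print(after)  # 3/4
```

With these made-up inputs the belief in guilt falls from 9/10 to 3/4. The direction depends entirely on whether a false conjecture is more probable from an innocent party than from a guilty one, which is a judgement about competence and not a matter of psychology.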

Reaction

Initial formal reactions

Initial reactions from the US, UK and France are that the technical details, including the trajectory, rule out rebel responsibility. They appear to be in a good position to make such a determination, and it would normally be a conclusion that I would take at face value. But given the experience of Iraq and their previous dossiers, it seems quite possible that they would say what they said even without any specific evidence. A typical response, from US ambassador to the UN Samantha Power, was:

“The technical details of the UN report make clear that only the regime could have carried out this large-scale chemical weapons attack.”

Being just a little pedantic, this statement is literally false: one would at least have to take the technical details to a map showing rebel and regime positions, and have some idea of the range of the rockets. From the Russian comments, it would seem they have not been convinced.

Media reaction

Whether the rebels have captured these delivery systems – along with sarin gas – from government armouries is unknown. Even if they have, experts said that operating these weapons successfully would be exceptionally difficult.

“It’s hard to say with certainty that the rebels don’t have access to these delivery systems. But even if they do, using them in such a way as to ensure that the attack was successful is the bit the rebels won’t know how to do,” said Dina Esfandiary, an expert on chemical weapons at the International Institute for Strategic Studies.

The investigators had enough evidence to trace the trajectories followed by two of the five rockets. If the data they provide is enough to pinpoint the locations from which the weapons were launched, this should help to settle the question of responsibility.

John Kerry, the US secretary of state, says the rockets were fired from areas of Damascus under the regime’s control, a claim that strongly implicates Mr Assad’s forces.

This suggests that there might be a strong case against the regime. But it is not clear that the government would be the only source of weapons for the rebels, that the rebels would need sophisticated launchers (rather than sticks) or that they would lack advice. Next, given the information on type, timing and bearing it should be possible to identify the rockets, if the US was monitoring their trajectories at the time, and hence it might be possible to determine where they came from, in which case the evidence trail would lead strongly to the regime. (Elsewhere it has been asserted that one of the rockets was fired from within the main Syrian Army base, in which case one would have thought they would have noticed a rebel group firing out.)

17 September: Human Rights Watch

Human Rights Watch has marked the UN estimate of the trajectories on a map, clearly showing that they could have been fired from the Republican Guard 104 Brigade area.

Connecting the dots provided by these numbers allows us to see for ourselves where the rockets were likely launched from and who was responsible.

…

This isn’t conclusive, given the limited data available to the UN team, but it is highly suggestive and another piece of the puzzle.

This seems a reasonable analysis. The BBC has said of it:

Human Rights Watch says the document reveals details of the attack that strongly suggest government forces were behind the attack.

But this seems to exaggerate the strength of the evidence. One would at least want to see if the trajectories are consistent with the rockets having been launched from rebel-held areas (map, anyone?). It also seems a little odd that a salvo of M14 rockets appears to have been fired over the presidential palace. Was the Syrian Army that desperate? Depending on the view that one takes of these questions, the evidence could favour the rebel hypothesis. On the other hand, if the US could confirm that the only rockets fired at that time to those sites came from government areas, that would seem conclusive.

(Wikipedia gives technical details of rockets. It notes use by the Taliban, and quotes its normal maximum range as 9.8km. The Human Rights Watch analysis seems to be assuming that this will not be significantly reduced by the ad-hoc adaptation to carry gas. Is this credible? My point here is that the lack of explicit discussion of such aspects in the official dossiers leaves room for doubt, which could be dispelled if their ‘very high confidence’ is justified.)

18 September: Syrian “proof”

The BBC has reported that the Syrians have provided what they consider proof to Russia that the rebels were responsible for the CW attack, and that the Russians are evaluating it. I doubt that this will be proof, but perhaps it will reduce our confidence in the ‘very high’ likelihood that the regime was responsible. (Probably not!) It may, though, flush out more conclusive evidence, either way.

19 September: Forgery?

Assad has claimed that the materials recovered by the UN inspectors were forged. The report talks about rebels moving material, and it is not immediately clear, as the official dossiers claim, that this hypothesis is not credible, particularly if the rebels had technical support.

Putin has confirmed that the rockets used were obsolete Soviet-era ones, no longer in use by the Syrian Army.

December: US Intelligence?

Hersh claims that the US had intelligence that the Syrian rebels had chemical weapons, and that the US administration deliberately ‘adjusted’ the intelligence to make it appear much more damning of the Syrian regime. (This is disputed.)

Comment

The UN Inspectors’ report is clear about what it has found. It is careful not to make deductive leaps, but provides ample material to support further analysis. For example, while it finds that Sarin was delivered by rockets that could have been launched from a regime area, it does not rule out rebel responsibility. But it does give details of type, time and direction, such that if – as appears to be the case from their dossier – the US were monitoring the area, it should be possible to conclude that the rocket was actually fired by the regime. Maybe someone will assemble the pieces for us.

My own view is not that Assad did not do it or that we should not attack, but that any attack based on the grounds that Assad used CW should be supported by clear, specific evidence, which the dossiers prior to the UN report did not provide. Even now, we lack a complete case. Maybe the UN should have its own intelligence capability? Or could we attack on purely humanitarian grounds, not basing the justification on the possible events on 21 Aug? Or share our intelligence with the Russians and Chinese?

A feature of the debate seems to be that those who think that ‘something must be done’ tend to be critical of those who question the various dossiers, and those who object to military action tend to throw mud at the dossiers, justified or not. So maybe my main point should be that, irrespective of the validity of the JIC assessment, we need a much better quality of debate, engaging the public and those countries with different views, not just our traditional allies.

A notable exception was a private blog, which looked very credible, but fell short of claiming “high likelihood”. It gave details of two candidate delivery rockets, and hoped that the UN inspectors would get evidence from them, as they did. Neither rocket was known to have been used, but neither do they appear to be beyond the ability of rebel groups to use (with support). The comments are also interesting, e.g.:

There is compelling evidence that the Saudi terrorists operating in Syria, some having had training from an SAS mercenary working out of Dubai who is reporting back to me, are responsible for the chemical attack in the Ghouta area of Damascus.

The AIPAC derived ‘red line’ little game and frame-up was orchestrated at the highest levels of the American administration and liquid sarin binary precursors mainly DMMP were supplied by Israeli handled Saudi terrorists to a Jabhat al-Nusra Front chemist and fabricator.

Israel received supplies of the controlled substance DMMP from Solkatronic Chemicals of Morrisville, Pa.

This at least has some detail, although not such as can be easily checked.

Finally, I am beginning to get annoyed by the media’s use of scare quotes around Russian “evidence”.

The New Scientist (30 March 2013) has the following question, under the heading ‘Stupid is as stupid does’:

Jack is looking at Anne but Anne is looking at George. Jack is married but George is not. Is a married person looking at an unmarried person?

Possible answers are: “yes”, “no” or “cannot be determined”.

You might want to think about this before scrolling down.

.

.

.

.

.

.

.

It is claimed that while ‘the vast majority’ (presumably including financiers, whose thinking is being criticised) think the answer is “cannot be determined”,

careful deduction shows that the answer is “yes”.

Similar views are expressed at a learning blog and at a Physics blog, although the ‘careful deductions’ are not given. Would you like to think again?
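For what it is worth, the ‘careful deduction’ presumably intended is a case split on Anne. The sketch below is my reconstruction, not something given in the article or the blogs, and the function names are hypothetical. It checks every status Anne is allowed to have and reports “yes” only if a married person is looking at an unmarried person in all of them. The crucial assumption is what goes into the list of allowed statuses.

```python
LOOKS_AT = [("Jack", "Anne"), ("Anne", "George")]
KNOWN = {"Jack": True, "George": False}  # True = married, False = unmarried

def married_looks_at_unmarried(status):
    # True if some married person is looking at some unmarried person.
    return any(status[a] is True and status[b] is False for a, b in LOOKS_AT)

def answer(allowed_statuses_for_anne):
    outcomes = {
        married_looks_at_unmarried({**KNOWN, "Anne": s})
        for s in allowed_statuses_for_anne
    }
    if outcomes == {True}:
        return "yes"
    if outcomes == {False}:
        return "no"
    return "cannot be determined"

print(answer([True, False]))        # classical: Anne is married or she is not
print(answer([True, False, None]))  # also allow 'no determinate status'
```

With only two allowed statuses the answer comes out as “yes”. Allow a third possibility, that Anne has no determinate status, and it comes out as “cannot be determined”. So the disagreement is about the premises, not about the care taken in deducing.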

.

.

.

.

.

.

.

.

Now I have a confession to make. My first impression is that the closest of the admissible answers is ‘cannot be determined’, and having thought carefully for a while, I have not changed my mind. Am I stupid? (Based on this evidence!) You might like to think about this before scrolling down.

.

.

.

.

.

.

.

Some people object that the term ‘is married’ may not be well-defined, but that is not my concern. Suppose that one has a definition of marriage that is as complete and precise as possible. What is the correct answer? Does that change your thinking?

.

.

.

.

.

.

.

Okay, here are some candidate answers that I would prefer, if allowed:

There are cases in which the answer cannot be determined.

It is not possible to prove that there are not cases in which the answer cannot be determined. (So that the answer could actually be “yes”, but we cannot know that it is “yes”.)

Either way, it cannot be proved that there is a complete and precise way of determining the answer, but for different reasons. I lean towards the first answer, but am not sure. Which it is is not a logical or mathematical question, but a question about ‘reality’, so one should ask a Physicist. My reasoning follows … .

.

.

.

.

.

.

.

.

Suppose that Anne marries Henry who dies while out in space, with a high relative velocity and acceleration. Then to answer yes we must at least be able to determine a unique time in Anne’s time-frame in which Henry dies, or else (it seems to me) there will be a period of time in which Anne’s status is indeterminate. It is not just that we do not know what Anne’s status is; she has no ‘objective’ status.

If there is some experiment which really proves that there is no possible ‘objective’ time (and I am not sure that there is) then am I not right? Even if there is no such experiment, one cannot determine the truth of physical theories, only fail to disprove them. So either way, am I not right?

Enlightenment, please. The link to finance is that the New Scientist article says that

I agree, but it seems to me (after Keynes) that it was their use of the kind of ‘classical’ logic that is implicitly assumed in the article that is at fault. Being married is a relation, not a proposition about Anne. Anne has no state or attributes from which her marital status can be determined, any more than terms such as crash, recession, money supply, inflation, inequality, value or ‘the will of the people’ have any correspondence in real economies. Unless you know different?

I attended a conference on the mathematics of finance last week. It seems that things would have gone better in 2007/8 if only policy makers had employed some mathematicians to critique the then dominant dogmas. But I am not so sure. I think one would need to understand why people went along with the dogmas. Psychology, such as behavioural economics, doesn’t seem to help much, since although it challenges some aspects of the dogmas it fails to challenge (and perhaps even promotes) other aspects, so that it is not at all clear how it could have helped.

Here I speculate on an answer.

Finance and economics are either empirical subjects or they are quasi-religious, based on dogmas. The problems seem to arise when they are the latter but we mistake them for the former. If they are empirical then they have models whose justification is based on evidence.

Naïve inductivism boils down to the view that whatever has always (never) been the case will continue always (never) to be the case. Logically it is untenable, because one often gets clashes, where two different applications of naïve induction are incompatible. But pragmatically, it is attractive.

According to naïve inductivism we might suppose that if the evidence has always fitted the models, then actions based on the supposition that they will continue to do so will be justified. (Hence, ‘it is rational to act as if the model is true’). But for something as complex as an economy the models are necessarily incomplete, so that one can only say that the evidence fitted the models within the context as it was at the time. Thus all that naïve inductivism could tell you is that ‘it is rational’ to act as if the model is true, unless and until the context should change. But many of the papers at the mathematics of finance conference were pointing out specific cases in which the actions ‘obviously’ changed the context, so that naïve inductivism should not have been applied.

It seems to me that one could take a number of attitudes:

1. It is always rational to act on naïve inductivism.

2. It is always rational to act on naïve inductivism, unless there is some clear reason why not.

3. It is always rational to act on naïve inductivism, as long as one has made a reasonable effort to rule out any contra-indications (e.g., by considering ‘the whole’).

4. It is only reasonable to act on naïve inductivism when one has ruled out any possible changes to the context, particularly reactions to our actions, by considering an adequate experience base.

In addition, one might regard the models as conditionally valid, and hedge accordingly. (‘Unless and until there is a reaction’.) Current psychology seems to suppose (1) and hence has little to help us understand why people tend to lean too strongly on naïve inductivism. It may be that a belief in (1) is not really psychological, but simply a consequence of education (i.e., cultural).

The recent conviction of six seismologists and a public official for reassuring the public about the risk of an earthquake when there turned out to be one raises many issues, mostly legal, but I want to focus on the scientific aspects, specifically the assessment and communication of uncertainty.

A recent paper by O’Hagan notes that there is “wide recognition that the appropriate representation for expert judgements of uncertainty is as a probability distribution for the unknown quantity of interest …”. This conflicts with UK best practice, as described by Spiegelhalter at understanding uncertainty. My own views have been formed by experience of potential and actual crises where evaluation of uncertainty played a key role.

From a mathematical perspective, probability theory is a well-grounded theory depending on certain axioms. There are plausible arguments that these axioms are often satisfied, but these arguments are empirical and hence should be considered at best as scientific rather than mathematical or ‘universally true’. O’Hagan’s arguments, for example, start from the assumption that uncertainty is nothing but a number, ignoring Spiegelhalter’s ‘Knightian uncertainty’.

Thus, it seems to me, that where there are rare critical decisions with a lack of evidence to support a belief in the axioms, one should recognize the attendant non-probabilistic uncertainty, and that failure to do so is a serious error, meriting some censure. In practice, one needs relevant guidance such as the UK is developing, interpreted for specific areas such as seismology. This should provide both guidance (such as that at understanding uncertainty) to scientists and material to be used in communicating risk to the public, preferably with some legal status. But what should such guidance be? Spiegelhalter’s is a good start, but needs developing.

My own view is that one should have standard techniques that can put reasonable bounds on probabilities, so that one has something that is relatively well peer-reviewed, ‘authorised’ and ‘scientific’ to inform critical decisions. But in applying any methods one should recognize any assumptions that have been made to support the use of those methods, and highlight them. Thus one may say that according to the usual methods, ‘the probability is p’, but that there are various named factors that lead you to suppose that the ‘true risk’ may be significantly higher (or lower). But is this enough?

Some involved in crisis management have noted that scientists generally seem to underestimate risk. If so, then even the above approach (and the similar approach of understanding uncertainty) could tend to understate risk. So do scientists tend to understate the risks pertaining to crises, and why?

It seems to me that one cannot be definitive about this, since there are, from a statistical perspective – thankfully – very few crises or even near-crises. But my impression is that there could be something in it. Why?

As at Aquila, human and organisational factors seem to play a role, so that some answers seem to need more justification than others. Any ‘standard techniques’ would need to take account of these tendencies. For example, I have often said that the key to good advice is to have a good customer, who desires an adequate answer – whatever it is – who fully appreciates the dangers of misunderstanding arising, and is prepared to invest the time in ensuring adequate communication. This often requires debate and perhaps role-playing, prior to any crisis. This was not achieved at Aquila. But is even this enough?

Here I speculate even more. In my own work, it seems to me that where a quantity such as P(A|B) is required and scientists/statisticians only have a good estimate of P(A|B’) for some B’ that is more general than B, then P(A|B’) will be taken as ‘the scientific’ estimate for P(A|B). This is so common that it seems to be a ‘rule of pragmatic inference’, albeit one that seems to be unsupported by the kind of arguments that O’Hagan supports. My own experience is that it can seriously underestimate P(A|B).
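A minimal sketch of how this substitution can mislead, in Python. The counts are invented purely for illustration and the names are hypothetical: B' stands for a broad reference class with plenty of data, and B for the small subset of it that actually matches the case in hand.

```python
from fractions import Fraction


def relative_frequency(events: int, cases: int) -> Fraction:
    """Relative frequency of A among the given cases."""
    return Fraction(events, cases)


# Hypothetical counts, not real data.
broad_cases, broad_events = 10_000, 100      # B': the broad reference class
specific_cases, specific_events = 200, 40    # B: the subset matching this case

p_a_given_broad = relative_frequency(broad_events, broad_cases)
p_a_given_specific = relative_frequency(specific_events, specific_cases)

# How far the 'pragmatic' estimate P(A|B') understates P(A|B) here.
understatement_factor = p_a_given_specific / p_a_given_broad

print("P(A|B') =", float(p_a_given_broad))
print("P(A|B)  =", float(p_a_given_specific))
print("P(A|B) / P(A|B') =", float(understatement_factor))
```

With these made-up counts the broad-class figure is twenty times too low for the specific case. Nothing forces the error to run in that direction in general; the point is only that P(A|B') constrains P(A|B) very weakly when B is a small part of B'.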

The facts of the Aquila case are not clear to me, but I suppose that the scientists made their assessment based on the best available scientific data. To put it another way, they would not have taken account of ad-hoc observations, such as amateur observations of radon gas fluctuations. Part of the Aquila problem seems to be that the amateur observations provided a warning which the population were led to discount on the basis of ‘scientific’ analysis. More generally, in a crisis, one often has a conflict between a scientific analysis based on sound data and non-scientific views verging on divination. How should these diverse views inform the overall assessment?

In most cases one can make a reasonable scientific analysis based on sound data and ‘authorised assumptions’, taking account of recognized factors. I think that one should always strive to do so, and to communicate the results. But if that is all that one does then one is inevitably ignoring the particulars of the case, which may substantially increase the risk. One may also want to take a broader decision-theoretic view. For example, if the peaks in radon gas levels were unusual then taking them as a portent might be prudent, even in the absence of any relevant theory. The only reason for not doing so would be if the underlying mechanisms were well understood and the gas levels were known to be simply consequent on the scientific data, thus providing no additional information. Such an approach is particularly indicated where – as I think is the case in seismology – even the best scientific analysis has a poor track record.
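The decision-theoretic point can be sketched as follows, again in Python. All of the losses and probabilities are hypothetical, chosen only to show the shape of the reasoning: an action that looks best under the 'standard' probability can cease to be best once a plausible upper bound on the risk is considered.

```python
def expected_loss(p_event: float, loss_if_event: float, loss_if_no_event: float) -> float:
    """Expected loss of an action when the event has probability p_event."""
    return p_event * loss_if_event + (1 - p_event) * loss_if_no_event


# Hypothetical losses in arbitrary units: (loss if event occurs, loss if not).
ACTIONS = {
    "reassure": (1000.0, 0.0),    # no precautions: heavy loss if the event occurs
    "precaution": (100.0, 10.0),  # precautions cost something either way
}


def best_action(p_event: float) -> str:
    """Action with the lowest expected loss at the given probability."""
    return min(ACTIONS, key=lambda name: expected_loss(p_event, *ACTIONS[name]))


p_standard = 0.005  # hypothetical figure from 'the usual methods'
p_upper = 0.05      # hypothetical upper bound given the particulars of the case

print(p_standard, best_action(p_standard))
print(p_upper, best_action(p_upper))
```

With these numbers the preferred action switches from reassurance to precaution somewhere between the two probabilities, so reporting only the standard figure would hide exactly the information the decision turns on.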

The bottom line, then, is that I think that one should always provide ‘the best scientific analysis’ in the sense of an analysis that gives a numeric probability (or probability range etc) but one needs to establish a best practice that takes a broader view of the issue in question, and in particular the limitations and potential biases of ‘best practice’.

The O’Hagan paper quoted at the start says – of conventional probability theory – that “Alternative, but similarly compelling, axiomatic or rational arguments do not appear to have been advanced for other ways of representing uncertainty.” This overlooks Boole, Keynes, Russell and Good, for example. It may be timely to reconsider the adequacy of the conventional assumptions. It might also be that ‘best scientific practice’ needs to be adapted to cope with messy real-world situations. Aquila was not a laboratory.

Actually, the judge was a bit more considered than my title suggests. In my defence the Guardian says:

“Bayes’ theorem is a mathematical equation used in court cases to analyse statistical evidence. But a judge has ruled it can no longer be used. Will it result in more miscarriages of justice?”

The case involved Nike trainers and appears to be the same as that in a recent appeal judgment, although it doesn’t actually involve Bayes’ rule. It just involves the likelihood ratio, not any priors. An expert witness had said:

“… there is at this stage a moderate degree of scientific evidence to support the view that the [appellant’s shoes] had made the footwear marks.”

The appeal hinged around the question of whether this was a reasonable representation of a reasonable inference.

According to Keynes, Knight and Ellsberg, probabilities are grounded on either logic, statistics or estimates. Prior probabilities are – by definition – never grounded on statistics and in practical applications rarely grounded on logic, and hence must be estimates. Estimates are always open to challenge, and might reasonably be discounted, particularly where one wants to be ‘beyond reasonable doubt’.
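To see why the status of the prior matters, here is a minimal sketch using the odds form of Bayes' rule (posterior odds = likelihood ratio × prior odds). The likelihood ratio and the priors below are hypothetical, not taken from the case.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability from a prior and a likelihood ratio (odds form of Bayes' rule)."""
    prior_odds = prior / (1 - prior)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1 + posterior_odds)


LIKELIHOOD_RATIO = 100.0  # hypothetical strength of evidence

# The same evidence combined with a range of estimated priors.
for prior in (0.001, 0.01, 0.1, 0.5):
    posterior = posterior_probability(prior, LIKELIHOOD_RATIO)
    print(f"prior {prior:<5} -> posterior {posterior:.3f}")
```

The same evidence leaves the posterior below 0.1 for the smallest prior and above 0.9 for the larger ones, so any doubt about the estimated prior carries straight through to the conclusion.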

Likelihood ratios are typically more objective and hence more reliable. In this case they might have been based on good quality relevant statistics, in which case the judge supposed that it might be reasonable to state that there was a moderate degree of scientific evidence. But this was not the case. Expert estimates had supplied what the available database had lacked, so introducing additional uncertainty. This might have been reasonable, but the estimate appears not to have been based on relevant experience.

My deduction from this is that where there is doubt about the proper figures to use, that doubt should be acknowledged and the defendant given the benefit of it. As the judge says:

“… it is difficult to see how an opinion … arrived at through the application of a formula could be described as ‘logical’ or ‘balanced’ or ‘robust’, when the data are as uncertain as we have set out and could produce such different results.”

This case would seem to have wider implications:

“… we do not consider that the word ‘scientific’ should be used, as … it is likely to give an impression … of a degree of precision and objectivity that is not present given the current state of this area of expertise.”

My experience is that such estimates are often used by scientists, and the result confounded with ‘science’. I have sometimes heard this practice justified on the grounds that some ‘measure’ of probability is needed and that if an estimate is needed it is better that it should be given by an independent scientist or analyst than by an advocate or, say, politician. Maybe so, but perhaps we should indicate when this has happened, and the impact it has on the result. (It might be better to follow the advice of Keynes.)

Royal Statistical Society

“There is a long history and ample recent experience of misunderstandings relating to statistical information and probabilities which have contributed towards serious miscarriages of justice. … forensic scientists and expert witnesses, whose evidence is typically the immediate source of statistics and probabilities presented in court, may also lack familiarity with relevant terminology, concepts and methods.”

“Guide No 1 is designed as a general introduction to the role of probability and statistics in criminal proceedings, a kind of vade mecum for the perplexed forensic traveller; or possibly, ‘Everything you ever wanted to know about probability in criminal litigation but were too afraid to ask’. It explains basic terminology and concepts, illustrates various forensic applications of probability, and draws attention to common reasoning errors (‘traps for the unwary’).”

The guide is clearly much needed. It states:

“The best measure of uncertainty is probability, which measures uncertainty on a scale from 0 to 1.”

This statement is nowhere supported by any evidence whatsoever. No consideration is given to alternatives, such as those of Keynes, or to the legal concept of “beyond reasonable doubt.”

“The type of probability that arises in criminal proceedings is overwhelmingly of the subjective variety, …”

There is no consideration of Boole and Keynes’ more logical notion, or any reason to take notice of the subjective opinions of others.

But how do we determine whether those axioms are appropriate to the situation at hand? The reader is not told whether the term axiom is to be interpreted in its mathematical or lay sense: as something to be proved, or as something that may be assumed without further thought. The first example given is:

“Consider an unbiased coin, with an equal probability of producing a ‘head’ or a ‘tail’ on each coin-toss. …”

Probability here is mathematical. Considering the probability of an untested coin of unknown provenance would be more subjective. It is the handling of the subjective component that is at issue, an issue that the example does not help to address. More realistically:

“Assessing the adequacy of an inference is never a purely statistical matter in the final analysis, because the adequacy of an inference is relative to its purpose and what is at stake in any particular context in relying on it.”

“… an expert report might contain statements resembling the following:
* “Footwear with the pattern and size of the sole of the defendant’s shoe occurred in approximately 2% of burglaries.” …
It is vital for judges, lawyers and forensic scientists to be able to identify and evaluate the assumptions which lie behind these kinds of statistics.”

This is good advice, which the appeal judge took. However, while I have not read and understood every detail of the guidance, it seems to me that the judge’s understanding went beyond the guidance, including its ‘traps for the unwary’.

The statistical guidance cites the following guidance from the forensic scientists’ professional body:

“Logic: The expert will address the probability of the evidence given the proposition and relevant background information and not the probability of the proposition given the evidence and background information.”

This seems sound, but needs supporting by detailed advice. In particular none of the above guidance explicitly takes account of the notion of ‘beyond reasonable doubt’.

Forensic science view

“Our concern is that the judgment will be interpreted as being in opposition to the principles of logical interpretation of evidence. We re-iterate those principles and then discuss several extracts from the judgment that may be potentially harmful to the future of forensic science.”

The full article is behind a pay-wall, but I would like to know what principles it is referring to. It is hard to see how there could be a conflict, unless there are some extra principles not in the RSS guidance.

Criminal Law Review

“The strict ratio of R. v T is that existing data are legally insufficient to permit footwear mark experts to utilise probabilistic methods involving likelihood ratios when writing reports or testifying at trial. For the reasons identified in this article, we hope that the Court of Appeal will reconsider this ruling at the earliest opportunity. In the meantime, we are concerned that some of the Court’s more general statements could frustrate the jury’s understanding of forensic science evidence, and even risk miscarriages of justice, if extrapolated to other contexts and forms of expertise. There is no reason in law why errant obiter dicta should be permitted to corrupt best scientific practice.”

In this account it is clear that the substantive issues are about likelihoods rather than probabilities, and that consideration of ‘prior probabilities’ is not relevant here. This is different from the Royal Statistical Society’s account, which emphasises subjective probability. However, in considering the likelihood of the evidence conditioned on the suspect’s innocence, it is implicitly assumed that the perpetrator is typical of the UK population as a whole, or of people at UK crime scenes as a whole. But suppose that women are most often murdered by men that they are or have been close to, and that such men are likely to be more similar to each other than people randomly selected from the population as a whole. Then it is reasonable to suppose that the likelihood that the perpetrator is some other male known to the victim will be significantly greater than the likelihood of it being some random man. The use of an inappropriate likelihood introduces a bias.

My advice: do not get involved with people who mostly get involved with people like you, unless you trust them all.

The Appeal

Prof. Jamieson, an expert on the evaluation of evidence whose statements informed the appeal, said:

“It is essential for the population data for these shoes be applicable to the population potentially present at the scene. Regional, time, and cultural differences all affect the frequency of particular footwear in a relevant population. That data was simply not … . If the shoes were more common in such a population then the probative value is lessened. The converse is also true, but we do not know which is the accurate position.”

Thus the professor is arguing that the estimated likelihood could be too high or too low, and that the defence ought to be given the benefit of the doubt. I have argued that using a whole population likelihood is likely to be actually biased against the defence, as I expect such traits as the choice of shoes to be clustered.
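The effect of the choice of population can be sketched as follows. The frequencies are hypothetical, and the sketch makes the simplifying assumption that the mark would certainly match if the defendant's shoes had made it, so that the likelihood ratio is just the reciprocal of the pattern's frequency among the alternative sources considered.

```python
def likelihood_ratio(p_match_if_defendant: float, pattern_frequency: float) -> float:
    """P(evidence | defendant's shoes made the mark) / P(evidence | other shoes did)."""
    return p_match_if_defendant / pattern_frequency


P_MATCH_IF_DEFENDANT = 1.0  # simplifying assumption, not a finding in the case

# Hypothetical frequencies of the sole pattern in two candidate populations.
whole_population_frequency = 0.02  # e.g. a national database
local_cluster_frequency = 0.20     # e.g. people plausibly present at the scene

lr_whole = likelihood_ratio(P_MATCH_IF_DEFENDANT, whole_population_frequency)
lr_local = likelihood_ratio(P_MATCH_IF_DEFENDANT, local_cluster_frequency)

print("LR using whole population:", lr_whole)
print("LR using local cluster:   ", lr_local)
```

If shoe choice is clustered in this way, the whole-population figure overstates the strength of the evidence tenfold in this made-up example, which is the direction of bias against the defence suggested above.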

Science and Justice

This argues against an unthinking application of likelihood ratios, noting:

That the defence may reasonably not be able to explain the evidence, so that there may be no reliable source for an innocent hypothesis.

That assessment of likelihoods will depend on experience, the basis for which should be disclosed and open to challenge.

If there is doubt as to how to handle uncertainty, any method ought to be tested in court and not dictated by armchair experts.

On the other hand, when it says “Accepting that probability theory provides a coherent foundation …” it fails to note that coherence is beside the point: is it credible?

Comment

The current situation seems unsatisfactory, with the best available advice both too simplistic and not simple enough. In similar situations I have co-authored a large document which has then been split into two: guidance for practitioners and justification. It may not be possible to give comprehensive guidance for practitioners, in which case one should aim to give ‘safe’ advice, so that practitioners are clear about when they can use their own judgment and when they should seek advice. This inevitably becomes a ‘legal’ document, but that seems unavoidable.

In my view it should not be simply assumed that the appropriate representation of uncertainty is ‘nothing but a number’. Instead one should take Keynes’ concerns seriously in the guidance and explicitly argue for a simpler approach avoiding ‘reasonable doubt’, where appropriate. I would also suggest that any proposed principles ought to be compared with past cases, particularly those which have turned out to be miscarriages of justice. As the appeal judge did, this might usefully consider foreign cases to build up an adequate ‘database’.

My expectation is that this would show that the use of whole-population likelihoods as in R v T is biased against defendants who are in a suspect social group.

More generally, I think that any guidance ought to apply to my growing uncertainty puzzles, even if it only cautions against a simplistic application of any rule in such cases.