* Explaining SLA

In a previous post about linguistic knowledge, I mentioned Gregg’s opinion, which is that a satisfactory theory of SLA must describe the knowledge involved in the various competencies finally attained by learners (a property theory), and also explain how learners acquire them (a transition theory).

I tried to argue against Gregg’s position (Jordan, 2004), saying that I didn’t think a property theory was necessary. Part of the article dealt with explanation. Here’s that part:

Theories often appeal to things that cannot be observed in order to explain things that we do observe. These unobserved things include actual entities, like Neptune (claimed to exist in order to explain the movements of Uranus before it was eventually observed), and electrons (which have never been observed), and forces, such as gravity (which defy observation). Closer to home, nobody has ever observed a parameter being set. If there are rival theories of non-observable phenomena, then we often use the “inference to the best explanation” argument (explanatory power is a reason for belief) to help decide among them. Gregg (1993) observes that logically we cannot allow this inference (the phenomena, if not observable, are only inferred by virtue of the explanation they are part of), but does not explain why scientists still use it. The reason is that it is unlikely that a particular theoretical account can fit the facts so well by pure coincidence. As Hacking says: “it would be an absolute miracle if for example the photo-electric effect went on working while there were no photons. The explanation of the persistence of this phenomenon – the one by which television information is converted from pictures into electrical impulses to be turned into electromagnetic waves in turn picked up by the home receiver – is that photons do exist” (Hacking, 1983: 54).

Moreover, many postulated entities for which there was initially no observable evidence have since been observed: Neptune, microbes, genes, and molecules among them. The inference to the best explanation is thus pragmatically judged as a useful tool.

Gregg’s account of deductive explanations is confusing, as highlighted by his unfruitful discussion of Hempel’s covering laws (Gregg 1993). Let me try to clarify. Suppose I park my motorbike and come back to find it has a buckled front wheel. How did it happen? My friend confesses that he borrowed the bike and hit a curb at speed. Well, that explains it! The event C (buckled wheel) was preceded by event A (my friend borrowed my bike) and event B (he hit a curb at speed). This explanation rests in turn on a causal law that hard objects will damage softer ones on impact (the explanans). Deductive explanations take the form of adducing a general law or laws, some set of initial conditions, and deducing from these the statement describing the event to be explained (the explanandum). But deductive explanations, while logically impeccable, are not always available; often we have no general laws, we make unwarranted inferences, we use inductive arguments, we ignore anomalies, and so on.

Many explanations begin with low level theories: generalisations such as that gases expand when heated. Next we may make some general statement about the relationship between two observable events, for example, Boyle’s law, which showed that reducing the volume of gas to one half doubled its pressure. This leads to the empirical law that the pressure of a gas is inversely proportional to the volume (Asimov, 1975). An empirical law generalises about a kind of event, not about a particular experiment, and is applicable to different events – other types of gases and/or cylinders. But we still do not know why decreased volume causes increased pressure; for this we need a theory of gases.
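
To make the empirical law concrete, here is a minimal Python sketch of the inverse proportionality between pressure and volume at constant temperature. The function name and the numerical values are hypothetical, chosen only for illustration; nothing here comes from Asimov's text.

```python
def pressure_after_compression(p1: float, v1: float, v2: float) -> float:
    """Boyle's law at constant temperature: p1 * v1 == p2 * v2.

    Returns the new pressure p2 once the gas occupies volume v2.
    """
    if v1 <= 0 or v2 <= 0:
        raise ValueError("volumes must be positive")
    return p1 * v1 / v2


# Illustrative values only (assumed, not measured): 100 kPa in 2 litres.
p_initial = 100.0
v_initial = 2.0

# Halving the volume doubles the pressure.
p_final = pressure_after_compression(p_initial, v_initial, v_initial / 2)
print(p_final)  # 200.0
```

The sketch describes the regularity but, like the empirical law itself, says nothing about why it holds.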

The kinetic theory of gases (Asimov, 1975) sees gas as a collection of moving molecules which collide with each other and with the walls of any container. Newtonian mechanics described molecular motion and allowed scientists to calculate the pressure on the walls of a cylinder by determining how many molecules are colliding with the walls at each instant, and the strength of each collision. So pressure rises when surface area is reduced because there are more collisions, and, by extension, gases expand when heated because heat causes more violent motion of the gas particles. This general theory has wider application, answers more questions (changes in pressure and expansions in volume have common underlying causes) and provides a more complete picture.
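
A rough sketch of this argument in symbols may help. The notation is assumed here for illustration and is not taken from Asimov: N is the number of molecules, m the mass of one molecule, V the volume of the container, and ⟨v²⟩ the mean squared molecular speed. For an ideal gas, the standard textbook result is:

```latex
P = \frac{1}{3}\,\frac{N}{V}\,m\,\langle v^{2}\rangle
\quad\Longrightarrow\quad
PV = \frac{1}{3}\,N\,m\,\langle v^{2}\rangle
```

At a fixed temperature ⟨v²⟩ is constant, so the product PV is constant and halving V doubles P, which is the empirical law. Heating increases ⟨v²⟩, so at constant pressure V must increase, which is the low-level generalisation about expansion.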

Boyle did not simply happen upon a J tube with mercury in it, and observe that the trapped air in the closed end on the short side of the J shrank as he poured in more mercury; he started with a problem – the density of air – and he designed his experiment to refute the paradigm theory that the atmosphere was evenly dense all the way up. Boyle offered no property theory, or any transition theory either. Neither physics nor SLA needs to distinguish property theories from transition theories, nor do they need one to make sense of the other. An explanation is an answer to a problem. The problem sets the scene, explanations take a myriad of forms, and the rational assessment of explanations is what counts.

………………..
Gregg replied:

Jordan finds my discussion of the Deductive-Nomological model in Gregg (1993: 540–1) ‘confusing’ and ‘unfruitful’, although he doesn’t give any example of such confusion or sterility. In any case, I find it hard to think that his motorbike anecdote could be an improvement. Most importantly, he fails to note that the model has been generally discarded, for reasons that I indicated. One major reason that it has been discarded is that scientific inferences typically are not deductive. A valid deduction entails its conclusion, which is to say, it guarantees the truth of the conclusion. Notoriously, empirical inferences do not; that, of course, is why, whereas any rational person will reject an invalid deduction, scientists do not automatically abandon a hypothesis when confronted with falsifying data.

Scientific inference is inference to the best explanation, which is not any form of deduction. (As I note elsewhere (Gregg 1993), forcing it into the form of a deduction results in affirming the consequent.) Jordan allows that ‘we often use’ inference to the best explanation (p. 540); ‘always’ would be nearer the mark. Nor is he quite right as to why scientists use it. Jordan cites Hacking’s (1983: 54) argument that it would be a miracle if, say, the photo-electric effect existed but photons did not. This is a compelling argument, but not an argument for why scientists use inference to the best explanation. For one thing, of course, Hacking’s argument itself is an inference to the best explanation, which would make it circular if it were a justification for using inference to the best explanation. Hacking is arguing for the reality of (some) unobservable entities; here, photons. The reason for using inference to the best explanation, on the other hand, is, as Jordan suggests, that it is ‘a useful tool’ (p. 540). Actually, ‘useful tool’ doesn’t come close; inference to the best explanation is what scientists do when they infer from data to explanations, including, by the way, the explanation of why Jordan’s motorbike has a buckled wheel.

……………..

Gregg’s right. Inference to the best explanation is, indeed, the way science tends to couch its arguments. While I observed that this methodology is useful, and very often fruitful, I implied later that I had grave reservations about it. Gregg points out that my reservations fly in the face of facts about how scientists build theories, and points out, again rightly, that my own explanation of the buckled wheel is an example of the methodology I had intended to question. The problem, nevertheless, is that inference to the best explanation is logically indefensible, because it is a form of inductive reasoning, and thus succumbs to Hume’s devastating criticism of induction, which states that you cannot logically go from the particular to the general. No matter how many white swans you observe, you can’t conclude that all swans are white. This prompted Popper to propose his “Conjectures and Refutations” view of theory construction.

As the entry on abduction in the Stanford Encyclopedia of Philosophy makes clear, the distinction between deduction, on the one hand, and induction and inference to the best explanation (or abduction, as it’s also known), on the other, is the distinction between necessary and non-necessary inferences. In deductive inferences, what is inferred is necessarily true if the premises from which it is inferred are true; that is, the truth of the premises guarantees the truth of the conclusion. A familiar type of example is inferences instantiating the schema

All As are Bs.
a is an A.
Hence, a is a B.
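
For readers who like to see the guarantee spelled out, the schema can be checked mechanically. The following is a small Lean 4 sketch; the names `A`, `B`, `a`, `all_A_are_B` and `a_is_A` are placeholders of my choosing that mirror the schema, not anything from the Stanford entry.

```lean
-- A and B are arbitrary properties of things of some type α.
-- From "all As are Bs" and "a is an A", the conclusion "a is a B" follows
-- with no further assumptions: the premises guarantee the conclusion.
example {α : Type} (A B : α → Prop) (a : α)
    (all_A_are_B : ∀ x, A x → B x) (a_is_A : A a) : B a :=
  all_A_are_B a a_is_A
```

No comparable proof can be given for the non-necessary inferences discussed next.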

But consider the inference of “John speaks English” from “John lives in Plymouth” and “97% of people living in Plymouth speak English”. Here, the truth of the first sentence is not guaranteed (but only made likely) by the joint truth of the second and third sentences. The truth of the premises doesn’t guarantee the truth of the conclusion, and the fact that the inference is based on statistical data is not enough to classify it as an explanation. In contrast, if while on a study trip to Africa, you observe many gray elephants and no non-gray ones, and infer from this that all elephants are gray, because that would provide the best explanation for why you have observed so many gray elephants and no non-gray ones, this would be an instance of an inference to the best explanation. In the latter case there is an implicit or explicit appeal to explanatory considerations, whereas in induction there is not; in induction, there is only an appeal to observed frequencies or statistics.

Inference to the best explanation can be stated thus:

Given evidence E and candidate explanations H1,…, Hn of E, infer the truth of that Hi which best explains E.
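
Written out as a procedure, the rule looks something like the Python sketch below. Everything in it is hypothetical: the function name, the candidate hypotheses, and above all the numerical scores, which are invented for illustration and stand in for whatever "explains best" is supposed to mean.

```python
from typing import Callable, Sequence


def infer_best_explanation(
    evidence: str,
    candidates: Sequence[str],
    score: Callable[[str, str], float],
) -> str:
    """Return the candidate hypothesis that scores highest on the evidence.

    Both the candidate list and the scoring function must be supplied by
    the caller; the rule itself says nothing about where they come from.
    """
    if not candidates:
        raise ValueError("at least one candidate explanation is required")
    return max(candidates, key=lambda h: score(h, evidence))


# Invented scores for the motorbike example (assumed, not derived).
illustrative_scores = {
    "friend hit a curb at speed": 0.9,
    "wheel buckled spontaneously": 0.1,
}

best = infer_best_explanation(
    "buckled front wheel",
    list(illustrative_scores),
    lambda h, e: illustrative_scores[h],
)
print(best)  # friend hit a curb at speed
```

Note that the function simply takes `candidates` and `score` as given.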

But there are (at least) two fundamental problems here: we have to presuppose the notions of candidate explanation and best explanation, both of which are difficult.

1. How many candidates do we allow? How do we know that we are judging all “good” candidates? How can we give any assurance that the best explanation is among the candidate explanations we consider?

2. What are the criteria (the so-called theoretical virtues), like simplicity, generality, and coherence, by which we judge the best explanation? This second objection has led to adaptations of inference to the best explanation. First:

Given evidence E and candidate explanations H1,…, Hn of E, infer the truth of that Hi which explains E best, provided Hi is satisfactory/good enough qua explanation.

Needless to say, this needs supplementing by a criterion for the satisfactoriness of explanations, or their being good enough, which, however, we are still lacking.

A second variation sanctions, given a comparative premise, only a comparative conclusion:

Given evidence E and candidate explanations H1,…, Hn of E, if Hi explains E better than any of the other hypotheses, infer that Hi is closer to the truth than any of the other hypotheses.

Clearly, this requires an account of closeness to the truth, which we are also lacking.

So, as the Stanford Encyclopedia of Philosophy says, “Even if it is true that we routinely rely on abductive reasoning, it may still be asked whether this practice is rational. For instance, experimental studies have shown that when people are able to think of an explanation for some possible event, they tend to overestimate the likelihood that this event will actually occur…. More telling still, Tania Lombrozo (2007) shows that, in some situations, people tend to grossly overrate the probability of simpler explanations compared to more complicated ones. Although these studies are not directly concerned with abduction in any of the forms discussed so far, they nevertheless suggest that taking into account explanatory considerations in one’s reasoning may not always be for the better”.

Most research in SLA does not attempt a causal explanation, but if we are to have a satisfactory theory of SLA, a causal explanation must be provided. Such an explanation seems to depend on defending the validity of inference to the best explanation, and nobody has done that yet. Which is why, despite Gregg’s demolition of my 2004 paper, I still prefer to see explanation in terms of universally applicable hypotheses or theories which are tested deductively and judged by the criteria I suggest in the page on this website “Guidelines for the construction of a theory of SLA”.

Gregg, K. (2005) ‘A response to Jordan’s (2004) “Explanatory Adequacy and Theories of Second Language Acquisition”’. Applied Linguistics 26/1: 121–124.