Tuesday, December 29, 2015

Here is a fascinating data visualization experiment by moovel lab testing a piece of ancient wisdom, "All roads lead to Rome" (link). The experiment is discussed in the CityLab blog of the Atlantic. It is not a full map of the auto routes of Europe; instead, it is a construction of the routes that exist from every grid point on the map of Europe to the destination of Rome. So properly speaking, it doesn't confirm that "all roads lead to Rome"; instead it demonstrates that "you can get to Rome from virtually every point in Europe through a dense system of tributaries". It's an amazing representation of the capillaries of transportation throughout the continent.

Imagine what the system would look like if the destination were Stockholm instead. I imagine that the coverage of the map would be equally complete; "you can get to Stockholm from every point in Europe through a dense system of tributaries". But I also imagine that there would be some important structural differences in the two maps, with a different set of most-travelled primary capillaries.
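The thought experiment can be made concrete on a toy network. The post does not describe moovel lab's actual method, so what follows is only a sketch under my own assumptions: a shortest-path tree computed with Dijkstra's algorithm from the chosen destination, followed by a count of how many starting points route over each road segment. The six-city graph and its weights are made up for illustration and are not real distances; `routes_to` and `road_usage` are hypothetical names.

```python
import heapq
from collections import Counter


def routes_to(destination, graph):
    """Dijkstra from the destination over an undirected weighted graph.

    Returns {node: next node on its shortest route toward the destination}.
    """
    dist = {destination: 0.0}
    next_hop = {}
    queue = [(0.0, destination)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, length in graph[node].items():
            nd = d + length
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                next_hop[neighbour] = node
                heapq.heappush(queue, (nd, neighbour))
    return next_hop


def road_usage(destination, graph):
    """Count how many starting points travel over each road segment."""
    next_hop = routes_to(destination, graph)
    usage = Counter()
    for start in next_hop:
        node = start
        while node != destination:
            step = next_hop[node]
            usage[frozenset((node, step))] += 1
            node = step
    return usage


# Toy network; weights are arbitrary units, NOT real road distances.
graph = {
    "Rome": {"Milan": 5},
    "Milan": {"Rome": 5, "Munich": 4, "Paris": 8},
    "Munich": {"Milan": 4, "Berlin": 5, "Paris": 7},
    "Paris": {"Milan": 8, "Munich": 7, "Berlin": 10},
    "Berlin": {"Munich": 5, "Paris": 10, "Stockholm": 9},
    "Stockholm": {"Berlin": 9},
}

to_rome = road_usage("Rome", graph)
to_stockholm = road_usage("Stockholm", graph)

for label, usage in (("Rome", to_rome), ("Stockholm", to_stockholm)):
    print(label)
    for road, count in usage.most_common():
        print("  ", " - ".join(sorted(road)), count)
```

In this toy case both destinations are reachable from everywhere, but the heavily used trunk segments differ, and some segments appear in one map and not in the other. That is the kind of structural difference I have in mind.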

What about it, moovel lab folks -- is this an experiment that could be readily performed?

Here is a Google map of the Roman Empire prepared by the Pelagios Project demonstrating a much more reduced system of roads (link):

It appears visually that it is possible to align the two maps. Major roads in ancient Europe seem to follow the same course today.

It has sometimes been observed that, for the Romans, it might not have been such a good thing that all roads lead to Rome. This same system of roads served as conduits of invasion by waves of Germanic armies.

Here is a video by Mary Beard on the historical importance of the Roman road system.

Monday, December 28, 2015

A short recent article in the Journal of Artificial Societies and Social Simulation by Venturini, Jensen, and Latour lays out a critique of the explanatory strategy associated with agent-based modeling of complex social phenomena (link). (Thanks to Mark Carrigan for the reference via Twitter; @mark_carrigan.) Tommaso Venturini is an expert on digital media networks at Sciences Po (link), Pablo Jensen is a physicist who works on social simulations, and Bruno Latour is -- Bruno Latour. Readers who recall recent posts here on the strengths and weaknesses of agent-based models as a basis for explaining social conflict will find the article interesting (link). VJ&L argue that agent-based models -- really, all simulations that proceed from the micro to the macro -- are both flawed and unnecessary. They are flawed because they unavoidably resort to assumptions about agents and their environments that reduce the complexity of social interaction to an unacceptable denominator; and they are unnecessary because it is now possible to trace directly the kinds of processes of social interaction that simulations are designed to model. The "big data" available concerning individual-to-individual interactions permits direct observation of most large social processes, they appear to hold.

Here are the key criticisms of ABM methodology that the authors advance:

Most of them, however, partake of the same conceptual approach in which individuals are taken as discrete and interchangeable 'social atoms' (Buchanan 2007) out of which social structures emerge as macroscopic characteristics (viscosity, solidity...) emerge from atomic interactions in statistical physics (Bandini et al. 2009). (1.2)

most simulations work only at the price of simplifying the properties of micro-agents, the rules of interaction and the nature of macro-structures so that they conveniently fit each other. (1.4)

micro-macro models assume by construction that agents at the local level are incapable to understand and control the phenomena at the global level. (1.5)

And here is their key claim:

Empirical studies show that, contrarily to what most social simulations assume, collective action does not originate at the micro level of individual atoms and does not end up in a macro level of stable structures. Instead, actions distribute in intricate and heterogeneous networks than fold and deploy creating differences but not discontinuities. (1.11)

These criticisms parallel some of my own misgivings about simulation models, though I am somewhat more sympathetic to their use than VJ&L. Here are some of the concerns raised in earlier posts about the validity of various ABM approaches to social conflict (link, link):

Simulations often produce results that appear to be artifacts rather than genuine social tendencies.

Simulations leave out important features of the social world that are prima facie important to outcomes: for example, quality of leadership, quality and intensity of organization, content of appeals, differential pathways of appeals, and variety of political psychologies across agents.

The factor of the influence of organizations is particularly important and non-local.

Simulations need to incorporate actors at a range of levels, from individual to club to organization.

And here is the conclusion I drew in that post:

But it is very important to recognize the limitations of these models as predictors of outcomes in specific periods and locations of unrest. These simulation models probably don't shed much light on particular episodes of contention in Egypt or Tunisia during the Arab Spring. The "qualitative" theories of contention that have been developed probably shed more light on the dynamics of contention than the simulations do at this point in their development.

But the confidence expressed by VJ&L in the new observability of social processes through digital tracing seems excessive to me. They offer a few good examples that support their case -- opinion change, for example (1.9). Here they argue that it is possible to map or track opinion change directly through digital footprints of interaction (Twitter, Facebook, blogging), and this is superior to abstract modeling of opinion change through social networks. No doubt we can learn something important about the dynamics of opinion change through this means.

But this is a very special case. Can we similarly "map" the spread of new political ideas and slogans during the Arab Spring? No, because the vast majority of those present in Tahrir Square were not tweeting and texting their experiences. Can we map the spread of anti-Muslim attitudes in Gujarat in 2002 leading to massive killings of Muslims in a short period of time? No, for the same reason: activists and nationalist gangs did not do us the historical courtesy of posting their thought processes in their Twitter feeds either. Can we study the institutional realities of the fiscal system of the Indonesian state through its digital traces? No. Can we study the prevalence and causes of official corruption in China through digital traces? Again, no.

In other words, there is a huge methodological problem with the idea of digital traceability, deriving from the fact that most social activity leaves no digital traces. There are problem areas where the traces are more accessible and more indicative of the underlying social processes; but this is a far cry from the utopia of total social legibility that appears to underlie the viewpoint expressed here.

So I'm not persuaded that the tools of digital tracing provide the full alternative to social simulation that these authors assert. And this implies that social simulation tools remain an important component of the social scientist's toolbox.

In this book I explore the possibility that this [classical physics] foundational assumption of social science is a mistake, by re-reading social science "through the quantum." More specifically, I argue that human beings and therefore social life exhibit quantum coherence -- in effect, that we are walking wave functions. (3)

A keystone of Wendt's argument is what he regards as the credibility and predictive niceness of "quantum decision theory". The foundational text in this field is Busemeyer and Bruza, Quantum Models of Cognition and Decision. Busemeyer and Bruza argue here, and elsewhere, that the mathematics and concepts of quantum mechanics in physics have seemingly relevant application to the field of cognition and judgment as well. For example, the idea of "wave function collapse" appears to be analogous to the resolution of uncertainty into decision by a human cognitive agent. Busemeyer and Bruza offer six fundamental analogies between quantum mechanics and cognition:

judgments are based on indefinite states

judgments create rather than record

judgments disturb each other, introducing uncertainty

judgments do not always obey classic logic

judgments do not obey the principles of unicity

cognitive phenomena may not be decomposable

For these and related reasons Busemeyer and Bruza argue that the mathematics, logic, and concepts of quantum mechanics may allow us to reach better traction with respect to the processes of belief acquisition and judgment that constitute human cognition. So far so good -- there may be a mathematical homology between quantum states in the micro-physical world and states of knowledge acquisition at the level of cognition.

However, Busemeyer and Bruza are entirely explicit in saying that they regard this solely as a formal analogy -- not a hypothesis about the real underlying structure of human thought. They explicitly deny that they find evidence to support the idea that consciousness is a quantum phenomenon at the sub-molecular level. They are "agnostic toward the so-called 'quantum mind' hypothesis" (kl 156). Their use of the mathematics of quantum mechanics is formal rather than substantive -- more akin to using the mathematics of fluid dynamics to represent flow through a social network than arriving at a theory of the real constitution of a domain as a basis for explaining its characteristics.

This book is not about quantum physics per se, but instead it explores the application of the probabilistic dynamic system created by quantum theory to a new domain – the field of cognition and decision making. (kl 245)

So the application is heuristic rather than realistic:

We motivate the use of quantum models as innovative abstractions of existing problems. That is all. These abstractions have the character of idealizations in the sense there is no claim as to the validity of the idealization “on the ground.” (kl 171)

Instead [our theory] turns to quantum theory as a fresh conceptual framework for explaining empirical puzzles, as well as a rich new source of alternative formal tools. To convey the idea that researchers in this area are not doing quantum mechanics, various modifiers have been proposed to describe this work, such as quantum-like models of cognition, cognitive models based on quantum structure, or generalized quantum models. (kl 156)

Given the key role this body of research plays in Wendt's arguments about the social sciences, it is worth considering how it has been received in the relevant academic communities. H. Van Dyke Parunak reviews the work in Computing Reviews (link). Parunak emphasizes the point made here, that the book is explicit in declaring that it does not provide support for the idea of "quantum cognition" as a manifestation of underlying quantum physical processes. He observes that "a more accurate title, but much less exciting, would be Hilbert space models of cognition and decision," emphasizing the purely formal and mathematical nature of their arguments. Quantum mechanics provides a computational model for cognition based on quantum probability theory in their work, not an ontology of the cognitive process. Here is a short piece by Trueblood, Pothos, and Busemeyer in Frontiers in Psychology that spells out the mathematical assumptions that are invoked here (link).

What is perhaps less known is that the ingenious physicists who developed quantum mechanics also invented a new theory of probability, since classical probability (CP) theory was inconsistent with their bold new theory of the physical world. QP theory refers to the rules for assigning probabilities to events from quantum mechanics, without the physics. QP theory is potentially applicable to any area where there is a need to compute probabilities. ("Quantum probability theory as a common framework for reasoning and similarity")
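To see what "the rules for assigning probabilities ... without the physics" amounts to, here is a minimal sketch of my own; it is not taken from Busemeyer and Bruza or from Trueblood et al. I assume a two-dimensional real state space, in which a belief state and each "yes" answer are unit vectors and the probability of a "yes" is the squared projection of the state onto the answer vector. The angles are arbitrary illustrative choices. The point is that the order in which two questions are asked changes the joint probability, which is one way of reading the claim that "judgments disturb each other".

```python
import math


def unit(theta_deg):
    """Unit vector in a 2-D real state space at the given angle (degrees)."""
    t = math.radians(theta_deg)
    return (math.cos(t), math.sin(t))


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def prob_yes(state, answer):
    """Probability of 'yes': squared projection of the state onto the answer vector."""
    return dot(state, answer) ** 2


def prob_yes_then_yes(state, first, second):
    """Probability of 'yes' to `first` and then 'yes' to `second`.

    After the first 'yes' the state is replaced by the `first` vector,
    so the second judgment starts from a disturbed state.
    """
    return prob_yes(state, first) * prob_yes(first, second)


psi = unit(0)   # hypothetical initial belief state
A = unit(30)    # 'yes' direction for question A (arbitrary)
B = unit(75)    # 'yes' direction for question B (arbitrary)

p_ab = prob_yes_then_yes(psi, A, B)   # ask A first, then B
p_ba = prob_yes_then_yes(psi, B, A)   # ask B first, then A

print(f"P(A yes, then B yes) = {p_ab:.4f}")
print(f"P(B yes, then A yes) = {p_ba:.4f}")
```

In classical probability the joint probability of two "yes" answers does not depend on the order of asking; here it does, because the first answer changes the state from which the second is computed. Nothing in this calculation refers to the brain or to physics, which is exactly the formal character that these authors emphasize.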

Here is a review article that proposes a series of tests of "quantum-like" models of judgment (link). Here is how the authors describe the field of quantum-like models of cognition:

Recently, a research field that rely on so-called “quantum” or “quantum-like” models has developed to account for such behaviors. The qualifier “quantum” is used to indicate that the models exploit the mathematics of a contemporary physical theory, quantum mechanics. Note that only some mathematical tools of quantum mechanics are employed, and that the claim is not that these models are justified by an application of quantum physics to the brain. For that reason, we shall prefer to call them “quantum-like” models. Such models put into question two classical characteristics recalled above: they abandon Bayesian probabilities for others which are similar to probabilities in quantum mechanics, and they allow for preferences or attitudes to be undetermined. Quantum-like models have received much interest from psychologists, physicists, economists, cognitive scientists and philosophers. For example, new theoretical frameworks have been proposed in decision theory and bounded rationality (Danilov and Lambert-Mogiliansky 2008 and 2010, Yukalov and Sornette 2011). (2)

This description too emphasizes the purely formal nature of this theory; it is an attempt to apply some of the mathematical models and constructs of quantum theory to the empirical problems of cognition and judgment. They go beyond this observation, however, by attempting to assess the ability of the mathematics to fit the data. Their overall judgment is dubious about the applicability of these mathematical tools to the available data on specific aspects of belief formation (22). "After performing the test against available data, the result is quite clear: non-degenerate models are not an option, being not empirically adequate or not needed."

This is all relevant to a discussion of Wendt's work, because Wendt's premise is solidly realist: he wants to seriously consider the possibility or likelihood of "quantum consciousness". This is the idea that thought and mental activity are the manifestations of subatomic quantum effects.

Quantum brain theory takes known effects at the sub-atomic level and scales them upward to the macroscopic level of the brain. (31)

Hence the central question(s) of this book: (a) how might a quantum theoretic approach explain consciousness and by extension intentional phenomena, and thereby unify physical and social ontology, and (b) what are some implications of the result for contemporary debates in social theory? (29)

For the price of the two claims of quantum consciousness theory – that the brain is a quantum computer and that consciousness inheres in matter at the fundamental level – we get solutions to a host of intractable problems that have dogged the social sciences from the beginning. These claims are admittedly speculative, but neither is precluded by what we currently know about the brain or quantum physics, and given the classical materialist failure to make progress on the mind–body problem, at this point they look no more speculative than the orthodoxy – and the potential pay-off is huge. (35)

These are tantalizing ideas. It is clear that they are intended as substantive, not merely formal or mathematical. We are asked to take seriously, as an empirical hypothesis, the idea that the brain is a quantum machine and its gross behavior (memory, belief, judgment) is substantively influenced by that quantum substrate. But it is fundamentally unclear whether the findings of Busemeyer and Bruza or other practitioners of quantum probability in the field of cognition provide any support at all for the substantive quantum-consciousness hypothesis.

Friday, December 18, 2015

After World War II John von Neumann became interested in the central nervous system as a computing organ. Ironically, more was probably known about neuroanatomy than about advanced digital computing in the 1940s; that situation has reversed, of course. Now we know a great deal about calculating, recognizing, searching, and estimating in silicon; but relatively less about how these kinds of processes work in the setting of the central nervous system. At the time of his final illness von Neumann was preparing a series of Silliman Lectures at Yale University that focused on the parallels that exist between the digital computer and the brain; these were published posthumously as The Computer and the Brain (CB) in 1958. This topic also comes in for substantial discussion in Theory of Self-Reproducing Automata (TSRA) (edited and published posthumously by Arthur Burks in 1966). It is very interesting to see how von Neumann sought to analyze this problem on the basis of the kinds of information available to him in the 1950s.

Much of CB takes the form of a rapid summary of the state of knowledge about digital computing machines that existed in the 1950s, from Turing to ENIAC. Almost all computers today possess the "von Neumann" architecture along these lines.

Alan Turing provided some of the mathematical and logical foundations of modern digital computing (link). He hypothesized a very simple computing device that consisted of a tape of indefinite length, a tape drive mechanism that permitted moving the tape forwards or backwards one space, and a read-write mechanism that could read the mark in a tape location or erase and re-write the mark in that location. Here is a diagram of a Turing machine:

(Fascinatingly, here is a photo of a working model of a Turing machine (link):)

Turing's fundamental thesis (now usually called the Church-Turing thesis) is that any function that is computable at all is computable on a Turing machine; so a Turing machine is a universal computing machine. The von Neumann architecture and the computing machines that it spawned -- ENIAC and its heirs -- are implementations of a universal computing machine.
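The device is simple enough to simulate in a few lines. Here is a minimal sketch, assuming a tape stored as a dictionary from position to symbol and a rule table mapping (state, symbol) to (symbol to write, direction to move, next state). The function name, the blank symbol `_`, and the example rule table for adding two numbers written in unary are my own illustrative choices, not anything drawn from Turing or von Neumann.

```python
BLANK = "_"


def run_turing_machine(rules, tape, state="start", halt="halt", max_steps=10_000):
    """Run a one-tape Turing machine and return the non-blank tape contents.

    rules: {(state, symbol): (symbol_to_write, "L" or "R", next_state)}
    """
    cells = dict(enumerate(tape))   # sparse tape: position -> symbol
    head = 0
    steps = 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        symbol = cells.get(head, BLANK)               # read
        write, move, state = rules[(state, symbol)]   # look up the rule
        cells[head] = write                           # erase and re-write
        head += 1 if move == "R" else -1              # move one space
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, BLANK) for i in range(lo, hi + 1)).strip(BLANK)


# Unary addition: "11+111" means 2 + 3. Replace "+" with "1",
# run to the end of the tape, then erase one trailing "1".
unary_add = {
    ("start", "1"): ("1", "R", "start"),
    ("start", "+"): ("1", "R", "seek_end"),
    ("seek_end", "1"): ("1", "R", "seek_end"),
    ("seek_end", BLANK): (BLANK, "L", "erase"),
    ("erase", "1"): (BLANK, "L", "halt"),
}

result = run_turing_machine(unary_add, "11+111")
print(result, len(result))
```

All of the "intelligence" of the machine lives in the rule table; the mechanism itself only reads, writes, and moves one space at a time.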

From the time of Frege it has been understood that mathematical operations can be built up as compounds of several primitive operations -- addition, subtraction, etc.; so, for example, multiplication can be defined in terms of a sequence of additions. Programming languages and libraries of subroutines take advantage of this basic logic: new functions are defined as series of more elementary operations embodied in machine states. As von Neumann puts the point in CB:

More specifically: any computing machine that is to solve a complex mathematical problem must be “programmed” for this task. This means that the complex operation of solving that problem must be replaced by a combination of the basic operations of the machine. Frequently it means something even more subtle: approximation of that operation—to any desired (prescribed) degree—by such combinations. (5)
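Both halves of this passage can be illustrated in a few lines. The sketch below is my own, with hypothetical function names: it builds multiplication out of repeated addition, builds exponentiation out of that multiplication, and then approximates an operation the "machine" does not have (the square root) to a prescribed tolerance using only basic arithmetic. The use of Newton's iteration for the square root is my choice of method; von Neumann does not specify one in the passage quoted.

```python
def multiply(a, b):
    """Multiply two non-negative integers using only repeated addition."""
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative integers only")
    total = 0
    for _ in range(b):
        total = total + a
    return total


def power(base, exponent):
    """Raise base to a non-negative integer exponent using only multiply()."""
    result = 1
    for _ in range(exponent):
        result = multiply(result, base)
    return result


def approx_sqrt(x, tolerance=1e-9, max_iterations=100):
    """Approximate sqrt(x) to a prescribed tolerance with basic arithmetic.

    Newton's iteration: repeatedly average the guess with x / guess.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    guess = max(float(x), 1.0)
    for _ in range(max_iterations):
        if abs(guess * guess - x) <= tolerance:
            break
        guess = (guess + x / guess) / 2.0
    return guess


print(multiply(6, 7))     # 42
print(power(2, 10))       # 1024
print(approx_sqrt(2.0))
```

Tightening `tolerance` buys more accuracy at the cost of more basic operations, which is the trade-off von Neumann has in mind with "any desired (prescribed) degree".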

Key questions about the capacities of a computing machine, either electro-mechanical or biological, have to do with estimating its dimensionality: how much space does it occupy, how much energy does it consume, and how much time does it take to complete a given calculation? And this is where von Neumann's analysis took its origin. Von Neumann sought to arrive at realistic estimates of the size and functionality of the components of these two kinds of computation machines. The differences in scale are enormous, whether we consider speed, volume, or energy consumption. Fundamentally, neurons are more numerous by orders of magnitude (10^10 versus 10^4); slower by orders of magnitude (5 msec vs. 10^-3 msec); less energy-intensive by orders of magnitude (10^-3 ergs vs. 10^2 ergs); and computationally less precise by orders of magnitude. (Essentially he estimates that a neural circuit, either analog or digital, is capable of precision of only about 1%.) And yet von Neumann concludes that brains accomplish computational problems faster than digital computers because of their massively parallel structure -- in spite of the comparative slowness of the individual elements of computation (neurons). This implies that the brain embodies a structurally different architecture than sequential digital computing embodied in the von Neumann model.

Von Neumann takes the fundamental operator of the brain to be the neuron, and he represents the neuron as a digital device (in spite of its evident analog electrochemical properties). A neuron transmits a pulse. "The nervous pulses can clearly be viewed as (two-valued) markers.... The absence of a pulse then represents one value (say, the binary digit 0), and the presence of one represents the other (say, the binary digit 1)" (42). "The nervous system has a prima facie digital character" (44).

In their introduction to the second edition of CB the Churchlands summarize von Neumann's conclusion somewhat differently by emphasizing the importance of the analog features of the brain: "If the brain is a digital computer with a von Neumann architecture, it is doomed to be a computational tortoise by comparison... [But] the brain is neither a tortoise nor a dunce after all, for it was never a serial, digital machine to begin with: it is a massively parallel analog machine" (kl 397). However, it appears to me that they overstate the importance of analog neural features in von Neumann's account. Certainly vN acknowledges the analog electro-chemical features of neural activity; but I don't find him making a strong statement in this book to the effect that analog features contribute to the better-than-expected computational performance of the brain. This seems to correspond more to a view of the Churchlands than to von Neumann's analysis in the 1950s. Here is their view as expressed in "Could a Machine Think?" in Scientific American in 1990:

First, nervous systems are parallel machines, in the sense that signals are processed in millions of different pathways simultaneously. The retina, for example, presents its complex input to the brain not in chunks of eight, 16 or 32 elements, as in a desktop computer, but rather in the form of almost a million distinct signal elements arriving simultaneously at the target of the optic nerve (the lateral geniculate nucleus), there to be processed collectively, simultaneously and in one fell swoop. Second, the brain’s basic processing unit, the neuron, is comparatively simple. Furthermore, its response to incoming signals is analog, not digital, inasmuch as its output spiking frequency varies continuously with its input signals. Third, in the brain axons projecting from one neuronal population to another are often matched by axons returning from their target population. These descending or recurrent projections allow the brain to modulate the character of its sensory processing. (link, 35)

In considering the brain von Neumann reached several fundamental observations. First, the enormous neural network of the central nervous system is itself a universal computing machine. Von Neumann worked on the assumption that the CNS could be "programmed" to represent the fundamental operations of arithmetic and logic; and therefore it has all the power of a universal computational machine. But second, von Neumann believes his analysis demonstrates that its architecture is fundamentally different from the standard von Neumann architecture. This observation is the more fundamental. It derives from von Neumann's estimates of the base speed rate of calculation available to neurons in comparison to vacuum tubes; a von Neumann machine with components of this time scale would take eons to complete the calculations that the brain performs routinely. And so this underlines the importance of the massively parallel computing that is accomplished by the biological neural network. Ironically, however, it has proven challenging to emulate massively parallel neural nets in digital computing environments; here is an interesting technical report by Paul Fox that identifies communication bandwidth as being the primary limiting factor for such emulations (link).

Wednesday, December 9, 2015

John von Neumann was one of the genuine mathematical geniuses of the twentieth century. A particularly interesting window onto von Neumann's scientific work is provided by George Dyson in his book, Turing's Cathedral: The Origins of the Digital Universe. The book is as much an intellectual history of the mathematics and physics expertise of the Princeton Institute for Advanced Study as it is a study of any one individual, but von Neumann plays a key role in the story. His contribution to the creation of the general-purpose digital computer helped to lay the foundations for the digital world in which we now all live.

There are many interesting threads in von Neumann's intellectual life, but one aspect that is particularly interesting to me is the early application of the new digital computing technology to the problem of simulating large complex physical systems. Modeling weather and climate were topics for which researchers sought solutions using the computational power of first-generation digital computers, and the research needed to understand and design thermonuclear devices had an urgent priority during the war and post-war years. Here is a description of von Neumann's role in the field of weather modeling in designing the early applications of ENIAC (P. Lynch, "From Richardson to early numerical weather prediction"; link):

John von Neumann recognized weather forecasting, a problem of both great practical significance and intrinsic scientific interest, as ideal for an automatic computer. He was in close contact with Rossby, who was the person best placed to understand the challenges that would have to be addressed to achieve success in this venture. Von Neumann established a Meteorology Project at the Institute for Advanced Study in Princeton and recruited Jule Charney to lead it. Arrangements were made to compute a solution of a simple equation, the barotropic vorticity equation (BVE), on the only computer available, the ENIAC. Barotropic models treat the atmosphere as a single layer, averaging out variations in the vertical. The resulting numerical predictions were truly ground-breaking. Four 24-hour forecasts were made, and the results clearly indicated that the large-scale features of the mid-tropospheric flow could be forecast numerically with a reasonable resemblance to reality. (Lynch, 9)
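The passage names the equation without writing it out. For readers who want to see it, here is the form in which the barotropic vorticity equation is usually given in textbooks (this is the standard statement, not a quotation from Lynch). The symbols are the conventional ones: $\zeta$ is the relative vorticity of the flow, $f$ the Coriolis parameter, $\psi$ the streamfunction, $\mathbf{v}$ the horizontal wind, and $\mathbf{k}$ the vertical unit vector.

```latex
\frac{\partial \zeta}{\partial t} + \mathbf{v}\cdot\nabla\,(\zeta + f) = 0,
\qquad \zeta = \nabla^{2}\psi,
\qquad \mathbf{v} = \mathbf{k}\times\nabla\psi
```

In words: the absolute vorticity $\zeta + f$ is carried along unchanged by the single-layer flow. A forecast consists of stepping this one equation forward in time on a grid, which is what made it tractable for a machine of ENIAC's capacity.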

A key innovation in the 1950s in the field of advanced computing was the invention of Monte Carlo simulation techniques to assist in the invention and development of the hydrogen bomb. Thomas Haigh, Mark Priestley, and Crispin Rope describe the development of the software supporting Monte Carlo simulations in the ENIAC machine in a contribution to the IEEE Annals of the History of Computing (link). Peter Galison offers a detailed treatment of the research communities that grew up around these new computational techniques (link). Developed first as a way of modeling nuclear fission and nuclear explosives, these techniques proved to be remarkably powerful for allowing researchers to simulate and calculate highly complex causal processes. Here is how Galison summarizes the approach:

Christened "Monte Carlo" after the gambling mecca, the method amounted to the use of random numbers (a la roulette) to simulate the stochastic processes too complex to calculate in full analytic glory. But physicists and engineers soon elevated the Monte Carlo above the lowly status of a mere numerical calculation scheme; it came to constitute an alternative reality--in some cases a preferred one--on which "experimentation" could be conducted. (119)

At Los Alamos during the war, physicists soon recognized that the central problem was to understand the process by which neutrons fission, scatter, and join uranium nuclei deep in the fissile core of a nuclear weapon. Experiment could not probe the critical mass with sufficient detail; theory led rapidly to unsolvable integro-differential equations. With such problems, the artificial reality of the Monte Carlo was the only solution--the sampling method could "recreate" such processes by modeling a sequence of random scatterings on a computer. (120)

The approach that Ulam, Metropolis, and von Neumann proposed to take for the problem of nuclear fusion involved fundamental physical calculations and statistical estimates of interactions between neutrons and surrounding matter. They proposed to calculate the evolution of the states of a manageable number of neutrons as they traveled from a central plutonium source through spherical layers of other materials. The initial characteristics and subsequent interactions of the sampled neutrons were assigned using pseudo-random numbers. A manageable number of sampled spaces within the unit cube would be "observed" for the transit of a neutron (127) (10^4 observations). If the percentage of fission calculated in the sampled spaces exceeded a certain value, then the reaction would be self-sustaining and explosive. Here is how the simulation would proceed:

Von Neumann went on to specify the way the simulation would run. First, a hundred neutrons would proceed through a short time interval, and the energy and momentum they transferred to ambient matter would be calculated. With this "kick" from the neutrons, the matter would be displaced. Assuming that the matter was in the middle position between the displaced position and the original position, one would then recalculate the history of the hundred original neutrons. This iteration would then repeat until a "self-consistent system" of neutron histories and matter displacement was obtained. The computer would then use this endstate as the basis for the next interval of time, delta t. Photons could be treated in the same way, or if the simplification were not plausible because of photon-matter interactions, light could be handled through standard diffusion methods designed for isotropic, black-body radiation. (129)
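To give a feel for the sampling idea, here is a deliberately toy Monte Carlo sketch in Python. It is not a reconstruction of the ENIAC program or of von Neumann's scheme: there is no matter displacement, no time-stepping, and no energy dependence. Neutrons start at the centre of a uniform sphere, travel random exponentially distributed distances, and at each collision either cause a fission, are captured, or scatter in a new random direction. Every parameter value is a made-up placeholder of mine, not physical data, and the function names are hypothetical.

```python
import math
import random


def random_direction(rng):
    """Isotropic unit vector in three dimensions."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def follow_neutron(rng, radius, mean_free_path, p_fission, p_capture, nu):
    """Follow one neutron from the centre of the sphere.

    Returns the number of new neutrons it produces (0 if it escapes
    or is captured, nu if it causes a fission).
    """
    pos = (0.0, 0.0, 0.0)
    while True:
        step = rng.expovariate(1.0 / mean_free_path)   # distance to next collision
        direction = random_direction(rng)
        pos = tuple(p + step * c for p, c in zip(pos, direction))
        if math.sqrt(sum(p * p for p in pos)) > radius:
            return 0.0                                  # escaped the sphere
        u = rng.random()
        if u < p_fission:
            return nu                                   # fission
        if u < p_fission + p_capture:
            return 0.0                                  # captured
        # otherwise the neutron scatters and the loop continues


def estimate_multiplication(radius, n_neutrons=20_000, mean_free_path=1.0,
                            p_fission=0.3, p_capture=0.1, nu=2.5, seed=1):
    """Average number of next-generation neutrons per sampled neutron."""
    rng = random.Random(seed)
    produced = sum(
        follow_neutron(rng, radius, mean_free_path, p_fission, p_capture, nu)
        for _ in range(n_neutrons)
    )
    return produced / n_neutrons


for radius in (0.5, 1.0, 2.0, 5.0):
    k = estimate_multiplication(radius)
    print(f"radius {radius:>4}: estimated multiplication {k:.2f}")
```

If the estimated multiplication exceeds one, each generation of neutrons is larger than the last and the toy reaction is self-sustaining; below one it dies out. In this sketch the estimate rises with the radius, because fewer neutrons escape before colliding. The estimate comes entirely from following sampled histories, with no equation solved for the assembly as a whole.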

Galison argues that there were two fairly different views in play of the significance of Monte Carlo methods in the 1950s and 1960s. According to the first view, they were simply a calculating device permitting the "computational physicist" to calculate values for outcomes that could not be observed or theoretically inferred. According to the second view, Monte Carlo methods were interpreted realistically. Their statistical underpinnings were thought to correspond exactly to the probabilistic characteristics of nature; they represented a stochastic view of physics.

King's view--that the Monte Carlo method corresponded to nature (got "back of the physics of the problem") as no deterministic differential equation ever could--I will call stochasticism. It appears in myriad early uses of the Monte Carlo, and clearly contributed to its creation. In 1949, the physicist Robert Wilson took cosmic-ray physics as a perfect instantiation of the method: "The present application has exhibited how easy it is to apply the Monte Carlo method to a stochastic problem and to achieve without excessive labor an accuracy of about ten percent." (146)

This is a very bold interpretation of a simulation technique. Rather than looking at the model as an abstraction from reality, this interpretation looks at the model as a digital reproduction of that reality. "Thus for the stochasticist, the simulation was, in a sense, of a piece with the natural phenomenon" (147).

One thing that is striking in these descriptions of the software developed in the 1950s to implement Monte Carlo methods is the very limited size and computing power of the first-generation general-purpose computing devices. Punch cards represented "the state of a single neutron at a single moment in time" (Haigh et al link 45), and the algorithm used pseudo-random numbers and basic physics to compute the next state of this neutron. The basic computations used third-order polynomial approximations (Haigh et al link 46) to compute future states of the neutron. The simulation described here resulted in the production of one million punched cards. It would seem that today one could use a spreadsheet to reproduce the von Neumann Monte Carlo simulation of fission, with each line being the computed result from the previous line after application of the specified mathematical functions to the data represented in the prior line. So a natural question to ask is -- what could von Neumann have accomplished if he had Excel in his toolkit? Experts -- is this possible?

Friday, December 4, 2015

Think of the following matrix of explanatory possibilities of social and historical phenomena:

Vertically the matrix divides between historical and sociological explanations, whereas horizontally it distinguishes general explanations and particular explanations. A traditional way of understanding the distinction between historical and sociological explanations was to maintain that sociological explanations provide generalizations, whereas historical explanations provide accounts for particular and unique situations. Windelband and the historicist school referred to this distinction as that between nomothetic and idiographic explanations (link). It was often assumed, further, that the nomothetic / idiographic distinction corresponded as well to the distinction between causal and interpretive explanations.

On this approach, only two of the cells would be occupied: sociological / general and historical / particular. There are no general historical explanations and no particular sociological explanations.

This way of understanding social and historical explanations no longer has a lot of appeal. "Causal" and "nomological" no longer have the affinity with each other that they once had, and "idiographic" and "interpretive" no longer seem to mutually imply each other. Philosophers have come to recognize that the deductive-nomological model does a poor job of explicating causation, and that we are better served by the idea that causal relationships are established by discovering discrete causal mechanisms. And the interpretive approach doesn't line up uniquely with any particular mode of explanation.

So historical and sociological explanations no longer bifurcate in the way once imagined. All four quadrants invoke both causal mechanisms and interpretation as components of explanation.

In fact it is straightforward to identify candidate explanations in the two "vacant" cells -- particular sociological explanations and general historical explanations. In Fascists Michael Mann asks a number of moderately general questions about the causes of European fascism; but he also asks about historically particular instances of fascism. Historical sociology involves both singular and general explanations. But likewise, historians of the French Revolution or the English Revolution often provide general hypotheses even as they construct a particular narrative leading to the storming of the Bastille (Pincus, Soboul).

There seem to be two important grounds of explanation that cut across all these variants of explanations of human affairs. It is always relevant to ask about the meanings that participants attribute to actions and social events, so interpretation is a resource for both historical and sociological explanations. But likewise, causal mechanisms are invoked in explanations across the spectrum of social and historical explanation, and are relevant to both singular and general explanations. Or in other words, there is no difference in principle between sociological and historical explanatory strategies.

How do the issues of generalization and particularity arise in the context of causal mechanisms? In several ways. First, explanations based on social mechanisms can take place in both a generalizing and a particular context. We can explain a group of similar social outcomes by hypothesizing the workings of a common causal mechanism giving rise to them; and we can explain a unique event by identifying the mechanisms that produced it in the given unique circumstances. Second, a social-mechanism explanation relies on a degree of lawfulness; but it refrains from the strong commitments of the deductive-nomological method. There are no high-level social regularities. Third, we can refer both to particular individual mechanisms and to a class of similar mechanisms. For example, the situation of "easy access to valuable items along with low probability of detection" constitutes a mechanism leading to pilferage and corruption. We can invoke this mechanism to explain a particular instance of corrupt behavior -- a specific group of agents in a business who conspire to issue false invoices -- or a general fact -- the logistics function of a large military organization is prone to repeated corruption. (Sergeant Bilko, we see you!) So mechanisms support a degree of generalization across instances of social activity; and they also depend upon a degree of generalization across sequences of events.

And what about meanings? Human actions proceed on the basis of subjective understandings and motivations. There are some common features of ordinary human experience that are broadly shared. But the variations across groups, cultures, and individuals are very wide, and there is often no substitute for detailed hermeneutic research into the mental frameworks of the actors in specific historical settings. Here again, then, explanations can take the form of either generalized statements or accounts of particular and unique outcomes.

We might say that the most basic difference between historical and sociological explanation is a matter of pragmatics -- intellectual interest rather than fundamental logic. Historians tend to be more interested in the particulars of a historical setting, whereas sociologists -- even historical sociologists -- tend to be more interested in generalizable patterns and causes. But in each case the goal of explanation is to discover an answer to the question, why and how does the outcome occur? And this typically involves identifying both causal mechanisms and human meanings.

About Me

I am a philosopher of social science with a strong interest in Asia. I
have written books on social explanation, Marx, late imperial China,
the philosophy of history, and the ethics of economic development.
Topics having to do with racial justice in the United States have become
increasingly important to me in recent years. All these topics involve
the complexities of social life and social change. I have come to see
that understanding social processes is in many ways more difficult than
understanding the natural world. Take the traditional dichotomy between
structure and agency as an example. It turns out that social actions
and social structures are reciprocal and inseparable. As Marx believed,
“people make their own histories, but not in circumstances of their own
choosing.” So we cannot draw a sharp separation between social structure
and social agency. I think philosophers need to interact seriously and
extensively with working social researchers and theorists if they are to
be able to help achieve a better understanding of the social world.

Open source philosophy

This site addresses a series of topics in the philosophy of social science. What is involved in "understanding society"? The blog is an experiment in thinking, one idea at a time. Look at it as "open-source philosophy" -- a web-based, dynamic monograph on the philosophy of social science and some foundational issues about the nature of the social world.

Digital editions of Varieties of Social Explanation

Digital editions of Varieties of Social Explanation are now available on Kindle and iBooks for iPad. This edition contains the original text of the 1991 edition along with an extensive new introduction, "Philosophy and Social Knowledge."