June 3, 2006

Causation is a topic that bears on many areas of philosophical and scientific discourse; in the philosophy of mind, for example, mental causation is the source of much debate. Unfortunately the nature of causality itself remains hard to pin down, even though people asked for the cause of a particular event can usually find something they agree on. Our confusion about causality partly stems from the way in which we associate agency with causality. In general we regard the active elements of the world as the things that can be at fault, and so we tend to describe them as causes as well. For example, if a baseball is thrown against my window and the window breaks, our intuition is to say that the moving baseball is the cause of the broken window. However, if we distance ourselves from the association between agency and causality, it should seem reasonable to say that the disposition of the window to break is also part of the cause of the broken window; after all, if the window had been stronger the ball would simply have bounced off.

Another source of confusion when studying causality is time. At different times different events might be seen as the cause of the current state of affairs. For example, earlier I described the ball as a cause of the window being broken. However, if we look further back in time it would seem that the person who threw the ball was a cause of the broken window, and further back than that their parents are the cause of the person’s existence, and thus a cause of the broken window, and ultimately the big bang is the cause of everything. So if our use of causality is to make any sense it must be tied to a specific time, so that we can ask “what properties at this time were the causes of the result in question?”

Of course I am not the first person to ask such questions about causation. A common definition of causality is that X is the cause of Y if it instantiates a causal law. On the surface this might seem like a good definition, but there are three major problems with it. One problem is that the causal laws, as defined by physics, don’t operate on large-scale objects, only on individual particles. There are no laws in physics dictating that in general balls thrown at windows will cause them to break (nor could there be, because in some cases the ball will bounce off). A second problem is that this definition doesn’t tell us at what time we should be looking for events that fit into causal laws. If any earlier event that could instantiate such a law counts as a cause, then we end up with an overabundance of causes (i.e. the ball one second ago, the ball two seconds ago, …, the industrial revolution, …, the big bang). Finally, this definition fails to tell us what counts as a causal law. Intuitively we might say that a causal law is one that defines cause and effect, but of course to define it in this way would be circular.

2: A Partial Solution

To fix these problems let me propose, roughly, a new definition of what a cause is, which will be refined in the course of this essay. I propose that if we are looking for causes for event X at time T then Y is a cause of X if and only if removing Y from the universe at time T would result in the failure of X to occur. Obviously this is a little dense, so let me illustrate with an example. Once again let us consider the broken window. If we are looking for the cause of the broken window, say 2 seconds before it broke, then we might reason as follows: the ball counts as a cause of the broken window, because if we removed the ball from existence before it hit the window the window would have failed to break. Likewise the window is a cause of the current broken window, because if the window had been removed from existence there would be no broken shards of glass on the ground.
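The rough test just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `window_breaks` and `is_rough_cause`, and the extra "tree" object, are hypothetical. The predicate `window_breaks` stands in for "let the world run forward and check whether there is a broken window".

```python
def window_breaks(state):
    # Assumed toy rule: a broken window results only if both the ball
    # and the window are present at the chosen time.
    return "ball" in state and "window" in state


def is_rough_cause(y, state, outcome):
    # Y is a cause if the outcome occurs with Y present
    # and fails to occur once Y is removed from the world.
    return outcome(state) and not outcome(state - {y})


# A hypothetical world two seconds before the window breaks.
world = frozenset({"ball", "window", "tree"})

for thing in sorted(world):
    print(thing, is_rough_cause(thing, world, window_breaks))
```

Under these assumptions the ball and the window each pass the test, while the bystanding tree does not.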

We can build on this definition to recover our use of causation in everyday language. I think that most people would agree that we should consider the person who threw the ball to be the “cause” of my broken window (in the sense of agency). We can arrive at this conclusion using the definition above by simply considering the cause of the broken window at earlier and earlier times until we find a person. The person that we find we label as the “cause”. This works in more cases than the one presented above. For example, consider a slight modification to the story I have been telling. In this story the hole in the side of my house has no pane of glass in it (it has yet to be installed). Seeing that there is nothing in its way, the neighbor’s boy decides to throw his ball through the hole. Mid-flight, however, I pick up a pane of glass and fit it into the hole. The ball then hits the glass and breaks it. Now when we search for a “cause” the first person we encounter is me (the glass is a cause of the broken window, and my actions are a cause of the glass being in its current location), and thus the definition above agrees with our intuition that in this case I am the “cause” of the broken window.

3: Cases to Consider

Of course the examples I have been giving are simple. How should our definition capture unusual cases, such as when two events combine to cause something that neither alone could accomplish? To get our intuitions flowing I have created four different situations involving different colored particles interacting.

Case 1:
This example is like the case of the broken window. If we asked “what is the cause at time T1 of the current position of the red particle at time T3” we would have to answer that the red particle is the cause and the blue particle is the cause, because with either one missing the red particle would end up somewhere else (or go missing).

Case 2:
In this example the following rules are in place: When a green particle interacts with a red particle it turns the red particle green. Green particles have no effect on green particles. If we ask what the cause at time T1 of the green particle at time T5 is, we can’t say that the cause is either of the green particles individually, because even with one of them removed the outcome is unchanged. However, it does make sense to say that the two of them together are a cause.

Case 3:
In this example the following rules are in place: When a light blue particle interacts with a red particle it turns the red particle yellow. When a light blue particle interacts with a yellow particle it turns it red. In this case either light blue particle can individually be seen as the cause of the red particle at time T5, but both of them together are not a cause (the opposite of case 2).

Case 4:
In this example the following rules are in place: When a purple particle interacts with a red particle it turns the red particle orange. When a purple particle interacts with an orange particle it turns it yellow. When a purple particle interacts with a yellow particle nothing happens. In this case none of the purple particles can individually be seen as the cause of the yellow particle at time T5, but any two of them can be.

4: Formalization

Now I will attempt to formalize these concepts mathematically, which will require some set theory. If you don’t care about the formalization I suggest you skip down to the possible objections below.

First let us define a function T that operates on two parameters. The first parameter is an amount of time and the second parameter is a set that contains all the basic components of the universe at a single instant in time. The result is a set that contains all the basic components of the universe that many units of time in the future. The exact operation of the T function could be defined using the laws of physics, but here its exact results are irrelevant.

For the following definitions w1 is a set of the basic components of the universe at some moment in time and n is a number that represents an amount of time. Also, z is the state or property that we are searching for a cause for.

Now we define the predicate DC, which we might think of as direct causation as follows:
This embodies the requirement for being a cause that we described roughly above: q is a cause of z n time units in the future only if z wouldn’t exist when we remove q from the initial state.

Next is to define the set C(n, z), which is the set of all causes of z n units of time into the past. We can define this set by defining which elements count as members as follows:

This says that q will be an element of C(n, z) if it meets all of several requirements.
One requirement is that:
This means that q must be a member of the power set of w1. The power set is defined as the set of all possible combinations of elements of a set. Thus q is some combination of elements that are part of w1. q could be a set containing a single element, or it could contain many elements, but those elements are all part of w1.
The second requirement is that:
This means that q must contain at least one element. (Nothing is never a cause of anything.)
The third requirement is that:
This means that q must be a direct cause of z, so that if we took all the elements in q out of w1 z would not occur.
Finally q must meet one of two requirements. q must either be:
This means that any combination of the elements that make up q (so long as that arrangement contains at least one element, and that it does not contain all the elements of q) must also be one of the causes of z. I call this the compound cause requirement, meaning that we can consider a group of “particles” a cause so long as it as a whole is a cause and all its members are causes.
Or alternatively q must:
This means that any combination of the elements that make up q (as long as it does not contain all the elements of q) must not be one of the causes of z. I call this the minimal cause requirement, meaning that we can consider a group of “particles” to be a cause so long as no smaller part of that group could also be considered a cause.
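Collected in one place, the definitions of this section might be written as follows. This is a sketch reconstructed from the prose descriptions and is not necessarily the original notation. It assumes that "z occurs" can be written as membership in the future state, that removing q from w1 is set difference, and the labels A (compound cause requirement) and B (minimal cause requirement) are introduced here only for readability.

```latex
\begin{align*}
DC(q, n, z) &\iff z \in T(n, w_1) \;\wedge\; z \notin T(n, w_1 \setminus q) \\
q \in C(n, z) &\iff q \in \mathcal{P}(w_1) \;\wedge\; q \neq \emptyset \;\wedge\; DC(q, n, z) \;\wedge\; \bigl( A(q) \vee B(q) \bigr) \\
A(q) &\iff \forall s \in \mathcal{P}(q) :\; (s \neq \emptyset \wedge s \neq q) \rightarrow s \in C(n, z) \\
B(q) &\iff \forall s \in \mathcal{P}(q) :\; s \neq q \rightarrow s \notin C(n, z)
\end{align*}
```

The definition of C(n, z) refers to itself, but only for strictly smaller subsets of q, so it is well founded. A set with a single element satisfies both A and B trivially, which means that every single-element direct cause is a member of C(n, z).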

5: Application to Cases

Now that all the math is out of the way let’s apply our definitions to the cases I presented earlier. Case 1 is relatively simple:
Just as we would expect, both the red and blue particles are causes, as well as the combination of red and blue taken together.

Case 2:
This also agrees with our intuitions: the red particle is a cause, as are the two green particles taken together, but neither green particle individually is a cause.

Case 3:
This may look a bit odd, but it makes sense when you think about it. The light blue particles individually are causes, but both of them together are not a cause.

Case 4:
This is a lengthy case, but it agrees perfectly with our intuitions. No one purple particle is a cause (nor are all three together, since that set isn’t minimal), but any two purple particles are. The red particle of course is a cause as well, as usual.
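The results for Cases 2 through 4 can be checked mechanically. The following Python sketch is one possible reading of the definitions from section 4, not a definitive implementation. The particle names and the predicates `case2`, `case3` and `case4` are hypothetical: each one collapses "apply T for n time units, then look for z" into a single test, and each assumes that every particle still present at T1 interacts with the red particle exactly once.

```python
from itertools import combinations


def proper_nonempty_subsets(q):
    """Yield every non-empty subset of q that leaves out at least one element."""
    items = sorted(q)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)


def causes(world, occurs):
    """Return C(n, z) as a set of frozensets.

    occurs(state) is an assumed stand-in for "z is found in T(n, state)".
    """
    world = frozenset(world)
    memo = {}

    def direct_cause(q):
        # DC: z occurs in the full world but not once q is removed.
        return occurs(world) and not occurs(world - q)

    def is_cause(q):
        if q not in memo:
            if not direct_cause(q):
                memo[q] = False
            else:
                verdicts = [is_cause(s) for s in proper_nonempty_subsets(q)]
                compound = all(verdicts)     # every proper part is itself a cause
                minimal = not any(verdicts)  # no proper part is a cause
                memo[q] = compound or minimal
        return memo[q]

    items = sorted(world)
    found = set()
    for size in range(1, len(items) + 1):  # size >= 1, so q is never empty
        for combo in combinations(items, size):
            q = frozenset(combo)
            if is_cause(q):
                found.add(q)
    return found


def case2(state):
    # z: the originally red particle is green at T5 (one green hit is enough).
    return "red" in state and ("green1" in state or "green2" in state)


def case3(state):
    # z: the red particle is red at T5; each light blue hit flips red <-> yellow.
    hits = sum(p in state for p in ("lightblue1", "lightblue2"))
    return "red" in state and hits % 2 == 0


def case4(state):
    # z: the particle is yellow at T5; red -> orange -> yellow needs two purple hits.
    hits = sum(p in state for p in ("purple1", "purple2", "purple3"))
    return "red" in state and hits >= 2


c2 = causes({"red", "green1", "green2"}, case2)
c3 = causes({"red", "lightblue1", "lightblue2"}, case3)
c4 = causes({"red", "purple1", "purple2", "purple3"}, case4)

for label, result in (("Case 2", c2), ("Case 3", c3), ("Case 4", c4)):
    print(label, sorted(sorted(q) for q in result))
```

Under these assumptions the sketch reproduces the verdicts described above. In Case 2 only the red particle and the pair of green particles qualify. In Case 3 each light blue particle qualifies alone (and in combination with red), but the pair does not. In Case 4 the red particle and each pair of purple particles qualify, but no single purple particle and not all three together.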

6: Objections

Now that I have presented what I think is a good description of causation it is time to consider some objections. Physicists may object to this description because current physical laws are not deterministic. When we apply the laws we don’t get a fixed description of the future state of the universe, instead we get many possible futures, all of which have a different probability of occurring. Fitting indeterminism into the picture is not actually that complicated. Instead of defining the DC predicate by what can be removed to make z not happen we would define DC in terms of what we can remove to make z less likely. The rest of the definitions, and our conclusions, can remain unchanged.
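One way to write the probabilistic variant is sketched below. It is an assumption about how the modification could be formalized, not a formula from the original definitions: the name DC_p is hypothetical, and T is treated as yielding a probability distribution over future states rather than a single state.

```latex
DC_{p}(q, n, z) \iff \Pr\bigl[\, z \in T(n, w_1 \setminus q) \,\bigr] < \Pr\bigl[\, z \in T(n, w_1) \,\bigr]
```

In the deterministic case both probabilities are either 0 or 1, and the condition reduces to the original requirement that z occurs with q present and fails to occur with q removed.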

Still some might object, claiming that this definition of causality is conventional. Real causation is universally true, in the sense that people’s beliefs do not affect what is and is not a cause. However, because T is defined in terms of physical laws, and physical laws may not be universally accepted, different people would identify different objects as the cause of an event. The answer to this of course is that T should not be defined in terms of the physical laws we know but in terms of the real laws that determine how the universe actually changes over time. Of course we don’t currently know what those laws are, so when we use our current understanding of the universe to define T the results we get are only an approximation. Thus the early Greeks would have claimed that prayers to Zeus caused the famine to end, as a result of how they defined T, but we would not, and we are probably right not to, since our definition of T is closer to the real laws.

Finally physicists may object to my definition because I rely on a set that contains all the components of the world at a fixed time. However we know from special relativity that observers moving at different speeds will disagree about which events are simultaneous, and thus will define such a set differently. I say that this is perfectly acceptable. Yes it is possible then that observers in different frames of reference will attribute different causes to the same event. This result is in perfect agreement with modern physics, which arrives at the same conclusion. So in reality this objection was no objection at all.

7: General Application

Of course my definition of causality has only been shown to apply to the fundamental components of reality, and any practical definition of causation should allow us to draw conclusions about macroscopic objects. We go about this in the same way that physics draws conclusions about macroscopic objects from the smallest fundamental components, which is to say that we reduce macroscopic claims to microscopic ones. Let us go back to the example with the ball and the broken window. What we are hoping to conclude is that C(few seconds, broken-window) = {{window}, {ball}, {window, ball}}. First we must decompose our large-scale objects into sets of component particles. We can’t just assign one set to each object, though; we must define an object as a combination of sets, each of which is made of some subset of the complete description of the object in terms of fundamental particles and would behave the same way as the complete object. When cast in this light we can indeed conclude that C(3, {bw1, bw2, …}) = {w1, w2, …, b1, … b2, …, {b1, w1}, {b1, w2}, …, {b2, w1}, {b2, w2}, …}. However, just as physicists rarely break the macroscopic into the microscopic in order to analyze it, so would we rarely consider this long description when thinking about everyday causation.

Now that we have a theory of causation, let’s apply it to some real-life situations. For example, we can consider the straw that broke the camel’s back. If we go back to immediately before the last straw was added, our analysis reveals that C(just before, broken back) = {burdened camel, last straw, {burdened camel, last straw}}, meaning that the last straw is a cause of the broken back. If we go back before any of the straws were added, though, our analysis reveals that C(way before, broken back) = {camel, {straw1, straw2, …}}; that is, the straws as a whole (and the camel) broke the camel’s back, not any individual straw.

This method can also resolve some ethical dilemmas. For example, three men are wandering in the desert. One night the first man poisons the third man’s water. Later that night the second man drills a hole in the third man’s canteen. Later the third man dies from thirst. Which of these men is the cause of his death? C(before the night, dead man) = {third man, {first man, second man}}. Since we wouldn’t blame the third man, as he didn’t do anything ethically wrong (needing water to live isn’t a fault), we should consider the first man and the second man to be equally the cause of his death.

Finally let’s consider a famous question in physics, namely what caused the big bang? If we assume that there was literally nothing before the big bang (no space, no time, no energy, no matter), as most physicists do, then our definition of causation reveals that it is meaningless to ask about the cause of the big bang, because to have a cause there must be a previous state of the universe which we can consider removing elements from. Since there isn’t such a state we can’t say that the big bang has a cause.