Mathematicians think of probabilities as numbers in the closed interval from 0 to 1 assigned to "events" whose occurrence or failure to occur is random. Probabilities are assigned to events according to the probability axioms.

The probability that an event A occurs given the known occurrence of an event B is the conditional probability of A given B, written P(A | B); its numerical value is P(A ∩ B) / P(B) (as long as P(B) is nonzero). If the conditional probability of A given B is the same as the ("unconditional") probability of A, then A and B are said to be independent events. That this relation between A and B is symmetric may be seen more readily by realizing that it is the same as saying that P(A ∩ B) = P(A) P(B) when A and B are independent events.
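As a small illustration of conditional probability and independence, the following sketch enumerates the outcomes of two fair dice and computes both quantities with exact fractions. The dice, the uniform measure, and the names `prob`, `cond_prob` and `independent` are assumptions chosen for the example; they do not come from the text above.

```python
from fractions import Fraction
from itertools import product

# Assumed example: all ordered outcomes of two fair six-sided dice.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return Fraction(len(event & omega), len(omega))

def cond_prob(a, b):
    """P(A | B) = P(A ∩ B) / P(B), defined only when P(B) is nonzero."""
    if prob(b) == 0:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")
    return prob(a & b) / prob(b)

def independent(a, b):
    """Symmetric test for independence: P(A ∩ B) == P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

first_is_even = {w for w in omega if w[0] % 2 == 0}
second_is_six = {w for w in omega if w[1] == 6}
sum_is_twelve = {w for w in omega if w[0] + w[1] == 12}

# Knowing the second die shows six does not change the chance that the first is even.
print(cond_prob(first_is_even, second_is_six), prob(first_is_even))
print(independent(first_is_even, second_is_six))

# Knowing the second die shows six does change the chance that the sum is twelve.
print(cond_prob(sum_is_twelve, second_is_six), prob(sum_is_twelve))
print(independent(sum_is_twelve, second_is_six))
```

Because the independence test is written as a product, swapping the two arguments gives the same answer, which is the symmetry noted above.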

A somewhat more abstract view of probability

Mathematicians usually take probability theory to be the study of probability spaces and random variables, an approach introduced by Kolmogorov in the 1930s. A probability space is a triple (Ω, F, P), where

Ω is a non-empty set, sometimes called the "sample space", each of whose members is thought of as a potential outcome of a random experiment. For example, if 100 voters are to be drawn randomly from among all voters in California and asked whom they will vote for as governor, then the set of all sequences of 100 Californian voters would be the sample space Ω.

F is a σ-algebra of subsets of Ω whose members are called "events". For example, the set of all sequences of 100 Californian voters in which at least 60 will vote for Schwarzenegger is identified with the "event" that at least 60 of the 100 chosen voters will so vote. To say that F is a σ-algebra means by definition that it contains Ω, that the complement of any event is an event, and that the union of any (finite or countably infinite) sequence of events is an event.

P is a probability measure on F, that is, a countably additive function that assigns to each event a number in the interval from 0 to 1, with P(Ω) = 1.

If Ω is countable (finite or denumerable) we almost always define F as the power set of Ω, i.e. F = 2^Ω, which is trivially a σ-algebra and the largest one that can be formed from subsets of Ω.
In a discrete space we can therefore omit F and just write (Ω, P) to define it. If on the other hand Ω is non-denumerable and we use F = 2^Ω, we get into trouble defining our probability measure P because F is too 'huge', i.e. there will be sets to which it is impossible to assign a measure consistently, as illustrated by the Banach–Tarski paradox. So we have to use a smaller σ-algebra F (e.g. the Borel algebra of Ω, which is the smallest σ-algebra that makes all open sets measurable).
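For a finite sample space the definitions above can be checked mechanically. The sketch below builds the power set of a four-element set, tests the σ-algebra closure properties, and defines a probability measure from point masses. The particular set, the masses, and the helper names `power_set`, `is_sigma_algebra` and `P` are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import chain, combinations

def power_set(omega):
    """Return the set of all subsets of omega, each as a frozenset."""
    items = list(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

def is_sigma_algebra(family, omega):
    """Check the closure properties for a finite family of subsets of omega.

    In a finite family, closure under pairwise unions already implies
    closure under unions of any countable sequence of members.
    """
    omega = frozenset(omega)
    if omega not in family:
        return False
    if any(omega - a not in family for a in family):
        return False
    return all(a | b in family for a in family for b in family)

# Assumed discrete sample space with assumed point masses summing to 1.
omega = frozenset({1, 2, 3, 4})
F = power_set(omega)
mass = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}

def P(event):
    """Probability of an event as the sum of the masses of its outcomes."""
    return sum((mass[w] for w in event), Fraction(0))

# A smaller family that is still a sigma-algebra, and one that is not.
coarse = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), omega}
broken = {frozenset(), frozenset({1}), omega}  # complement of {1} is missing

print(len(F), is_sigma_algebra(F, omega))
print(is_sigma_algebra(coarse, omega), is_sigma_algebra(broken, omega))
print(P(omega), P({1, 3}))
```

In the discrete case the point masses alone determine P on every subset, which is why the σ-algebra can be left implicit there.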

A random variable X is a measurable function on Ω. For example, the number of voters who will vote for Schwarzenegger in the aforementioned sample of 100 is a random variable.

If X is any random variable, the notation P(X ≥ 60), for example, is shorthand for P({ω ∈ Ω : X(ω) ≥ 60}), so that "X ≥ 60" is an "event".
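The idea that a condition on a random variable is shorthand for a set of outcomes can be made concrete on a scaled-down version of the voter example. The sketch below assumes only 3 voters, records just each voter's choice ("S" for Schwarzenegger, "O" for anyone else), and treats all sequences of choices as equally likely; these simplifications and the names `X`, `P` and `event_X_at_least` are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed toy sample space: every sequence of 3 votes, each "S" or "O".
omega = set(product("SO", repeat=3))

def P(event):
    """Uniform probability measure on omega (an assumption of this sketch)."""
    return Fraction(len(event), len(omega))

def X(outcome):
    """Random variable: the number of "S" votes in the outcome."""
    return outcome.count("S")

def event_X_at_least(k):
    """The event written "X >= k": the set of outcomes w with X(w) >= k."""
    return {w for w in omega if X(w) >= k}

# "P(X >= 2)" is the measure of the set of outcomes where X is at least 2.
print(sorted(event_X_at_least(2)))
print(P(event_X_at_least(2)))
```

Here X is an ordinary function on outcomes, and the probability statement is evaluated by first collecting the outcomes that satisfy the condition and then measuring that set.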

Philosophy of application of probability

Some statisticians will assign probabilities only to events that are random, that is, to outcomes of actual or theoretical experiments; those are frequentists. Others assign probabilities to propositions that are uncertain, according either to subjective degrees of belief in their truth or to logically justifiable degrees of belief in their truth. Such persons are Bayesians.

A Bayesian may assign a probability to the proposition that 'there was life on Mars a billion years ago,' since that is uncertain, whereas a frequentist would not assign probabilities to statements at all. A frequentist cannot give such uses of the probability concept a technical interpretation, even though 'probability' is often used in this way in colloquial speech. Frequentists assign probabilities only to outcomes of well-defined random experiments, that is, experiments with a sample space as defined above. For another illustration of the differences see the two envelopes problem.