Some pondering leads me to the question below, which prevents me from the reckless calculation of conditional probability...

As defined, conditional probability is:

$$
P(A|B)=P(A\cap B)/P(B).
$$

So conditional probability is the ratio of two probabilities. If we also regard conditional probability as a probability, we are literally saying that a quantity describes the same kind of thing as a ratio of such quantities. This seems bizarre: when we measure length, we can say that one thing is 2 meters long and another is 5 meters long, but we cannot treat 2/5 or 5/2 as the same kind of thing as 2 or 5. As I understand it, the ratio 2/5 or 5/2 is another level of comparison, while 2 or 5 is merely a comparison to the unit.

So why do we still treat conditional probability just as the ordinary probability?

I wish someone could shed some light on this. Or are there any other examples like this besides probability theory?

ADD 1

(Thanks for so many responses.)

Almost all of the responses so far try to convince me that probability is itself a ratio, and that both a ratio and a ratio of ratios are ratios.

Personally, I have not yet found any contradiction in the definition of conditional probability as far as the ratio interpretation alone is concerned. And it seems that ratio is the only realm where probability is mathematically possible. So I take it that people generalize this conclusion from ratios to a much broader realm where probability questions also arise but no apparent mathematics is applicable (such as the degree-of-belief scenario). The only things I can find to support this, for now, are the Principle of the Permanence of Equivalent Forms and the boldness of human nature in mathematical generalization.

(I will not close this question as of now and more opinions are appreciated.)

I'm guessing that for you $\;AB=A\cap B\;$...right?
–
DonAntonioJan 31 '14 at 14:11

Yes, indeed. I don't know how to input that symbol. Now I can copy from you. ;)
–
smwikipediaJan 31 '14 at 14:12

I don't really understand the nature of the question. Are you asking how it's possible for two things of a common type (probabilities) to be divided and produce a quotient that is of the same type as the divisor and dividend? I also have no idea what your update means. Are you saying you're not yet convinced that probability is a ratio? (It's defined that way...)
–
Kyle StrandJan 31 '14 at 23:06

Yes, my doubt is about the quotient being of the same type as the divisor and dividend. I can think of ratio as the ultimate/subconscious origin of probability, but in some contexts the ratio is not calculable, such as the degree-of-belief scenario. Yet we just generalize our theory to all scenarios.
–
smwikipediaFeb 1 '14 at 14:19

10 Answers

As the other respondents have already noted, probability is already defined as a ratio. But here is another thing to think about.

Normally when you take the ratio of two quantities, the result is measured in the units of the ratio of the units. For example, if you travel 10 meters in 5 seconds, your average velocity is 10/5 meters per second.

But because probability is defined as a ratio of two quantities that share the same unit, it is dimensionless. The probability of tossing a head (on one toss of a fair coin) is one half. It is not one half of toss, or one half of a prob (or 500 milliprobs, for that matter). It is just one half.

(Think of it as the long-run ratio: the proportion of heads will be about $n/2$ heads-up tosses per $n$ tosses. Or, if you prefer a more subjective approach, the fair price for a ticket that will win one dollar if the coin comes up heads is half a dollar. The probability is thus defined as half a dollar per dollar, which again is just one half.)
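The long-run reading can be illustrated with a small simulation. This is only a sketch: the function name `heads_proportion`, the seed, and the number of tosses are arbitrary choices made here, not part of the answer above.

```python
import random

def heads_proportion(n_tosses, seed=0):
    """Simulate n_tosses fair coin tosses and return heads / tosses."""
    rng = random.Random(seed)  # local generator, so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    # Tosses divided by tosses: the result is a bare number with no unit.
    return heads / n_tosses

proportion = heads_proportion(100_000)
print(proportion)  # should be close to 0.5
```

The returned value is a count of tosses divided by a count of tosses, which is why no unit survives.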

It is automatically the case that the ratio of probabilities is also dimensionless. So, the ratio of probabilities has the same unit (i.e., nothing) as a single probability. This is in stark contrast to, say, distance, where the ratio of two distances is dimensionless and so evidently a very different thing from a single distance.

The formal definition of probability helps to clear up this confusion. Think of it as a measure of the subsets of a set, scaled so that the whole set measures up to one. More formally, these are the axioms of a probability:
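For reference, the standard (Kolmogorov) axioms for a probability $P$ on a sample set $S$ are usually stated as below. The notation ($S$ for the sample set, $A_i$ for events) is an assumption chosen to match the rest of this answer.

$$
\begin{aligned}
&\text{(1)}\quad P(A) \ge 0 \ \text{ for every event } A \subseteq S,\\
&\text{(2)}\quad P(S) = 1,\\
&\text{(3)}\quad P\Big(\bigcup_{i} A_i\Big) = \sum_{i} P(A_i) \ \text{ for pairwise disjoint events } A_1, A_2, \dots
\end{aligned}
$$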

So you can understand conditional probability as simply changing your sample set S to a smaller one. We will create a new probability distribution that shall be proportional to the old one, but the sample set shall now be B. How should that new probability distribution be defined? Well, obviously the natural answer is:
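One way to write that natural answer, consistent with the definition given in the question, is sketched below. The name $P'$ for the new distribution is only a label, and $P(B) > 0$ is assumed.

$$
P'(A) = \frac{P(A\cap B)}{P(B)}, \qquad\text{so that}\qquad P'(B) = \frac{P(B\cap B)}{P(B)} = 1.
$$

Dividing by $P(B)$ is exactly what rescales the old distribution so that the new sample set $B$ measures up to one.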

I agree that we need to change the sample set S to a smaller one specified by the condition. But after that, we should get the probability P'(A) the same way as we get the P(A), rather than simply calculate the ratio of P(A∩B)/P(B).
–
smwikipediaJan 31 '14 at 14:51

We get a probability P'(A). We just decided to call it P(A|B) instead, to make it more meaningful. The way we decided to calculate this new distribution is just so that it would have certain interesting properties. That this calculation involves a division is hardly relevant, in my understanding.
–
ZadoJan 31 '14 at 15:16

Conditional probabilities are very natural probabilities. We ask, given that some event $B$ has occurred, what is the probability that $A$ happens as well? The answer is $\frac{P(A\cap B)}{P(B)} = P(A|B)$.

So why is the conditional probability a probability in the first place? Because the function $P(-|B)$ defines a probability measure.
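On a finite sample space this claim can be checked by brute force. The sketch below assumes a hypothetical example, one roll of a fair die with $B$ = "the roll is even", and uses helper names (`prob`, `cond_prob`, `all_events`) invented here for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical example: one roll of a fair six-sided die.
OMEGA = frozenset(range(1, 7))

def prob(event):
    """Uniform probability of an event (a subset of OMEGA)."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given); needs P(given) > 0."""
    return prob(event & given) / prob(given)

def all_events():
    """Yield every subset of OMEGA."""
    items = sorted(OMEGA)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

B = frozenset({2, 4, 6})  # conditioning event: the roll is even
events = list(all_events())

# Non-negativity: P(A | B) >= 0 for every event A.
nonnegative = all(cond_prob(A, B) >= 0 for A in events)
# Normalization: the whole space has conditional probability one.
normalized = cond_prob(OMEGA, B) == 1
# Finite additivity: conditional probabilities of disjoint events add up.
additive = all(
    cond_prob(A1 | A2, B) == cond_prob(A1, B) + cond_prob(A2, B)
    for A1 in events
    for A2 in events
    if not (A1 & A2)
)

print(nonnegative, normalized, additive)
```

Exact fractions are used so that the additivity check is not disturbed by floating-point rounding.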

I highly recommend that you read "Probability with Martingales" by David Williams. After about three pages of Chapter 9 you will have a solid understanding of what conditional probability is, why we need it, and why it is called "conditional".

Thanks for giving me a hint. I will check that book and get back to you later.
–
smwikipediaJan 31 '14 at 15:42

I just took a brief look at Ch. 9 of that book. It seems I need to read it from the start to comprehend it. It could take a while. Could you summarize the main point a bit?
–
smwikipediaJan 31 '14 at 15:51

I strongly suggest reading PWM from the start. The book is small, the font is large, and the language is congenial although the insights it provides are deep, so no minute you spend reading it (and solving the exercises) will be lost.
–
DidFeb 1 '14 at 21:42

This is one of those fun areas where the logic of mathematics crosses back into the arena of philosophy from which it spawned. We can look at this many ways, and here are a couple of the more common ways that come to mind:

A part of a part of an apple is a part of an apple.

A probability, as commonly calculated, is the logical part of an observed set that is seen to be in common with itself and not in common with others in the observed set. We represent the probability mathematically as a proportion, which strips it of all units. The probability instead becomes a knife with which we separate one part from another part. Any multiplication or division done to the probability, is another separation done by the knife and does not change the nature of the knife nor of the whole that the knife cuts. The conditional probability then is merely two separations represented in a single statement, and the only change is in the size and shape of the resulting part of the whole.

A portion of a hole is a complete hole.

A probability, as commonly applied, becomes binary. Choices are weighed, and the likelihood of specific outcomes is considered. Regardless of probabilities theoretically adding up to a whole, each choice is assigned a binary condition of likelihood, either as likely or as unlikely. The probability itself contains nothing and means nothing when in isolation from context, and so a conditional probability contains an equal portion of nothing and means an equal amount of nothing.

The Kolmogorov axiomatization of probability ("Foundations of the Theory of Probability", 1933, originally in German), which is the most widely used formal foundation for probability today (the measure-theoretic approach), needs to be extended/generalized in order to accommodate conditional probabilities, and I mean generalized in a formal mathematical way, not "to boldly go where others fear to tread". This is because conditional probability involves unbounded measures (although it itself remains bounded in $[0,1]$).

When a dead end is encountered, we should look back to see whether the starting point is incorrect or too limited. This seems to be a promising answer. I will read that article when I am back home. Thanks!
–
smwikipediaFeb 2 '14 at 5:48

I haven't read Renyi's article yet. But indeed, I did consider that conditional probability should be more general, and that probability is inherently a relative concept, since it is human nature to consider things in a relative way. When we estimate the probability of something, we are actually (implicitly) comparing the possibility of that thing to the possibility of something we are certain about. It is through such comparison that we get some feeling/estimation of the degree of possibility/belief.
–
smwikipediaFeb 2 '14 at 12:38


E.T. Jaynes' "Probability: The Logic of Science" (2003) is a third important source. On the surface Jaynes was always thought of as a "Bayesian". But at the end of this posthumous book (where, to put it informally, he "re-establishes" the foundations of probability from primitive consistency principles), he mentions how much his approach is "in agreement" with that of Kolmogorov (while also saying that essentially our inference is always "conditional on the information available"), thus linking to Renyi's axiomatization.
–
Alecos PapadopoulosFeb 2 '14 at 13:59

I much appreciate your additions. I will take some time to digest them. (BTW, is there any free download of Renyi's thesis?)
–
smwikipediaFeb 2 '14 at 15:03

I held the same opinion before. And I almost lived with it. But the ratio interpretation is just one of the many incompatible interpretations of probability. And it cannot fit into the degree-of-belief scenario where probability is also often used.
–
smwikipediaJan 31 '14 at 14:29


"Most" probabilities are not ratios in any canonical way. They are the images of measurable sets under probability measures.
–
nomenJan 31 '14 at 14:30

First of all, to have a concept of probability, you need what is called a sample space. A sample space is (roughly speaking) the set of individual outcomes that can happen.

Second of all, a probability, unlike a length or a period of time, is explicitly defined to be a ratio, so it has no units. When you say $P(A)$, the probability of $A$, what you really mean is
$$
\frac{\text{Number of ways } A \text{ can occur}}{\text{Total number of things that can occur}}
$$
You can see that whatever the units are, they cancel. You have some number of ways, divided by a different number of ways, and--just like with meters divided by meters or seconds divided by seconds--the units disappear in the final result.

So, what does this have to do with conditional probability? Conditional probability uses a different sample space than what you were using before. $P(A)$ takes as a sample space the set of all possible outcomes. But $P(A|B)$ takes as a sample space only those outcomes where $B$ occurs. So,
$$
P(A|B) = \frac{\text{Number of ways } A \text{ and } B \text{ can occur}}
{\text{Total number of ways } B \text{ can occur}}
$$

is exactly the same calculation we did to compute $P(A)$, except that this time, we are only counting the outcomes where $B$ occurs.
That is, $P(A)$, $P(A|B)$, $P(A|C)$, etc. are really the same concept--the $|B$ or $|C$ is just a specification to let the reader know what sample space you're talking about.
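The counting argument above can be checked on a small example. The sketch below assumes a hypothetical setup that is not in the answer: two fair dice, with $A$ = "the sum is 8" and $B$ = "the first die is even". It computes $P(A|B)$ once by counting inside the restricted sample space and once as $P(A\cap B)/P(B)$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space: all ordered rolls of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def is_A(outcome):
    """Event A: the two dice sum to 8."""
    return outcome[0] + outcome[1] == 8

def is_B(outcome):
    """Event B: the first die shows an even number."""
    return outcome[0] % 2 == 0

# Method 1: shrink the sample space to B, then count as for any probability.
restricted = [o for o in outcomes if is_B(o)]
by_counting = Fraction(sum(is_A(o) for o in restricted), len(restricted))

# Method 2: ratio of two probabilities taken in the original sample space.
p_A_and_B = Fraction(sum(is_A(o) and is_B(o) for o in outcomes), len(outcomes))
p_B = Fraction(len(restricted), len(outcomes))
by_ratio = p_A_and_B / p_B

print(by_counting, by_ratio)
```

Both methods agree because the total number of outcomes cancels when $P(A\cap B)$ is divided by $P(B)$.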

In summary,

Probabilities are defined as ratios, so by definition they can't have units.

$P(A)$ and $P(A|B)$ differ only in that they use a different sample space.

Where's the ratio when I say "The chance of my sister getting married this year is 80%"? Thanks, but your answer is similar to the other one regarding the ratio interpretation of probability.
–
smwikipediaJan 31 '14 at 14:43

@smwikipedia When a weather person says "the chances it will rain tomorrow are 80%", there is an explicit ratio going on behind the scenes. They have looked at all the times in the past when the conditions and events leading up to a day were similar to the conditions today, and then they have calculated that on 4 days out of 5, in those past cases, it proceeded to rain on the next day.
–
6005Jan 31 '14 at 14:53

The sister example is not necessarily a good one because it is dubious whether you are talking about a real, mathematical probability when you say that. However, if you were talking about a mathematical probability, it would be the same as with the weather person: in 4 cases out of 5, a person in such and such a situation has gotten married in the past. Cases divided by cases, which is a ratio.
–
6005Jan 31 '14 at 14:56

Thanks. By the sister example, I am referring to the degree-of-belief scenario where probability also arises.
–
smwikipediaJan 31 '14 at 15:11

@smwikipedia To elaborate on Goos' comment, % is (usually) read as "percent", or more clearly, "per cent", where cent = 100. Thus, it defines an implicit ratio, in this case 80 out of 100 (which of course simplifies to the 4 out of 5 Goos mentions). In short, $P(X) = (100 * P(X))\%$.
–
JABJan 31 '14 at 16:46

Conditional probability is still a probability, as it is simply the chance that something happens given that something else has happened.

As an example let's consider an election, where you can either vote "Yes" or "No", and we categorize the voters based on gender. Imagine that we have the following probabilities (where each cell represents the percentage of the total votes):

|   | M   | F   |
|---|-----|-----|
| Y | 10% | 45% |
| N | 30% | 15% |

In this case, if we select a random vote there would be a $45\%$ probability that it would be "No". However, if we are told that it is from a male voter, the probability would suddenly rise to $75\%$, because we restrict ourselves to look at the $40\%$ males. In other words, what we are actually asking is: "How large a proportion of the male voters voted No?", which is exactly $\frac{30\%}{40\%}=75\%$. I hope this makes it a bit clearer.
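The arithmetic in this example can be reproduced in a few lines. This is a minimal sketch: the dictionary `joint` copies the four cells of the table, and the helper names are chosen here for illustration.

```python
from fractions import Fraction

# Share of all votes in each cell of the table (vote, gender).
joint = {
    ("Y", "M"): Fraction(10, 100),
    ("Y", "F"): Fraction(45, 100),
    ("N", "M"): Fraction(30, 100),
    ("N", "F"): Fraction(15, 100),
}

def marginal_vote(vote):
    """P(vote): sum the row for this vote over both genders."""
    return sum(p for (v, _), p in joint.items() if v == vote)

def marginal_gender(gender):
    """P(gender): sum the column for this gender over both votes."""
    return sum(p for (_, g), p in joint.items() if g == gender)

def conditional(vote, gender):
    """P(vote | gender) = P(vote and gender) / P(gender)."""
    return joint[(vote, gender)] / marginal_gender(gender)

p_no = marginal_vote("N")                # 45% of all votes are "No"
p_male = marginal_gender("M")            # 40% of all votes come from men
p_no_given_male = conditional("N", "M")  # 30% / 40% = 75%

print(p_no, p_male, p_no_given_male)
```

The conditional value is again a share of votes, only measured against the male votes instead of all votes.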

Thanks. This is indeed a vivid/reasonable example. It can be seen as changing to a new universe when we consider the condition. But how can we use a value from the old universe in the new universe?
–
smwikipediaFeb 1 '14 at 14:37

@smwikipedia: We can compare the two probabilities and check if there is a correlation between gender and vote. In the example the overall percentage of "No"-votes is $45$, whereas it is considerably higher for men. We conclude that men are more negative towards the subject of the vote when compared to the rest of the population. On the other hand, if the probabilities had been the same, our conclusion would be that gender does not seem to affect a voter's decision.
–
René B. ChristensenFeb 1 '14 at 20:46