Note: This is answered by user974514 below, but there was some discussion outside of the "answer", so I paraphrased the final answers inline here.

I've asked around for the exact usages of the terms "quantile" and "percentile" and "rank" and I'm getting conflicting answers from my colleagues. I'll just pose a set of questions to narrow down the exact terminology.

1) Given: a probability CDF. $\mathbb{P}(X \leq A) = B$. $X$ is the random variable. $B$ is obviously a real number in the interval $[0, 1]$, and $A$ can be any real number (or an element of another totally-ordered set). Is "quantile" or "percentile" applicable to either $A$ or $B$ (which one)? If $A$ is called a "quantile" (or "percentile"), then what is $B$ called - a "[quantile/percentile] rank", or just "cumulative probability"?

A: $A$ is a quantile. $B$ is the cumulative probability. If $B$ is a multiple of 1/100, then the quantile is a percentile (a 100-quantile). When discussing $n$-quantiles in the CDF for various $n$ that have the same value $A$, it's helpful to be explicit with the term "value", i.e. $A$ is the "quantile value".
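As a small illustration of the $A$/$B$ relationship, here is a sketch that assumes a standard normal distribution (my choice for the example, not part of the question) and uses Python's `statistics.NormalDist`. The inverse CDF maps the cumulative probability $B$ to the quantile value $A$:

```python
from statistics import NormalDist

# Assumption for illustration only: X is standard normal.
X = NormalDist(mu=0.0, sigma=1.0)

B = 0.75              # cumulative probability, here 3/4
A = X.inv_cdf(B)      # quantile value: the 3rd quartile, also the 75th percentile

# Round trip: evaluating the CDF at the quantile value recovers B.
print(A, X.cdf(A))
```

The same $A$ would be returned whether you think of 0.75 as "3 out of 4" or "75 out of 100"; only the name of the quantile changes.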

2) Is the term "quantile" only applicable for equally-sized intervals in a CDF? Or can it be for any arbitrary $A$ (or $B$; depending on the answer to the above question) such that $\mathbb{P}(X \leq A) = B$? For example, what if the probability $B$ is an irrational number, so that it's impossible for it to be an interval boundary of equally-sized intervals in a CDF?

A: For a CDF, quantiles must be points on equally-sized intervals of that CDF.

3) Given: a list of $N$ sample datums (of a totally-ordered set). If I want a value $A$ such that $B$ datums are at most $A$, and assuming that $A$ is a value in the $N$ sample datums, where do the terms "percentile" and "quantile" and "rank" fit in here? For example, if I want the value such that 50% of the datums are at most this value, can that value be called any of: 50th percentile, 2nd quartile, median? Or does it have to be in terms of $N$-quantiles, e.g. 3rd $N$-quantile?

(I'm aware that if $A$ were not in the list of sample datums, there would have to be some rounding or interpolation, but that's not important to these questions.)

A: Rephrased: given $N$ sample datums, if $A$ is the $k$-th $N$-quantile, then $A$ is not less than a fraction $\frac{k}{N}$ of the $N$ values (that is, $k$ of them). Assuming the samples are taken randomly, $\frac{k}{N}$ is the cumulative probability. For $n$-quantiles, $\frac{k}{n} \cdot N$ (with rounding) is the rank, e.g. $\frac{k}{100} \cdot N$ is the percentile rank.
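To make the rank formula concrete, here is a minimal sketch. It assumes the "nearest-rank" convention (round $\frac{k}{n} \cdot N$ up), which is only one of several rounding conventions; the function name `nearest_rank_quantile` is hypothetical.

```python
import math

def nearest_rank_quantile(samples, k, n):
    """Return (rank, value) for the k-th n-quantile of the samples.

    Assumes the nearest-rank convention: rank = ceil(k/n * N), 1-based.
    """
    ordered = sorted(samples)
    N = len(ordered)
    rank = max(1, math.ceil(k * N / n))  # clamp so k = 0 still maps to rank 1
    return rank, ordered[rank - 1]

data = [15, 20, 35, 40, 50]
print(nearest_rank_quantile(data, 40, 100))  # 40th percentile -> (2, 20)
print(nearest_rank_quantile(data, 2, 4))     # 2nd quartile    -> (3, 35)
```

With a different rounding rule (floor, or interpolation between neighbours) the rank and value can differ for small $N$, which is the caveat mentioned in the parenthetical note above.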

3a) Can I define percentiles in terms of fractions, and if so, what is that fraction called? e.g. in the above example, if the 50th percentile is called the "percentile", then what is the 0.5 called? This is analogous to what I called "cumulative probability" in the CDF case.

A: Assuming samples are taken randomly, the "fraction" (see above) is the cumulative probability.

4) Along the same lines, when I have a value that is called the "P90", what exactly is that - the 90th percentile? Can it be called the 9th decile too? How about 0.9 [something]?

A: Except for "0.9", they are all equivalent - just different $n$-quantiles with the same quantile value (and cumulative probability). The 0.9 is the cumulative probability.
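A quick check of this equivalence, sketched with Python's `statistics.quantiles`. The sample (the integers 1 to 100) is made up for illustration, and the interpolation rule is the library's default ("exclusive") method, not something prescribed by the discussion here:

```python
from statistics import quantiles

# Made-up sample for illustration: the integers 1..100.
data = list(range(1, 101))

deciles = quantiles(data, n=10)        # 9 cut points
percentiles = quantiles(data, n=100)   # 99 cut points

d9 = deciles[8]         # 9th decile (cut points are 0-indexed in the list)
p90 = percentiles[89]   # 90th percentile, i.e. "P90"

# Same cumulative probability 0.9, same quantile value, different names.
print(d9, p90, d9 == p90)
```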

5) Is it valid to have a non-integral quantile/percentile? e.g. 55.55th percentile or 2.5th quartile?

A: Yes, but given non-integral $k$ and given $k$-th $n$-quantile, you'd typically scale both to some $(k \cdot m)$-th $(n \cdot m)$-quantile, where $(k \cdot m)$ is integral. For example, 2.5-th quartile is the same as the 5-th 8-quantile. A common exception would be the "Px" notation, where things like "P99.9" are common.
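The rescaling can be sketched with exact rational arithmetic. The helper name `as_integral_quantile` is hypothetical, and reducing $\frac{k}{n}$ to lowest terms is just one way of picking the multiplier $m$:

```python
from fractions import Fraction

def as_integral_quantile(k, n):
    """Rewrite the k-th n-quantile as an equivalent (k', n') with integral k'.

    k may be non-integral (e.g. 2.5); going through str(k) avoids
    binary floating-point artifacts when building the fraction.
    """
    p = Fraction(str(k)) / Fraction(n)   # cumulative probability k/n, exact
    return p.numerator, p.denominator

print(as_integral_quantile(2.5, 4))      # (5, 8): the 5th 8-quantile
print(as_integral_quantile(99.9, 100))   # (999, 1000): "P99.9"
```

Note that because the fraction is reduced, an already-integral input such as the 2nd quartile comes back as (1, 2), i.e. the median.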

1) Quantiles are what you defined as $A$. If you want to find the third quartile, you have to calculate $A$ such that $P(X\leq A) = 0.75=\frac{3}{4}$; in this case $B$ is fixed as well. I don't know of any quantile-related name for $B$, but finding that $A$ is at least as large as $0.75=\frac{3}{4}$ of all data makes it the 3rd quartile (the 3rd 4-quantile), or equivalently the 75th percentile (the 75th 100-quantile). So $B$ actually determines $A$: if $P(X\leq A)= B = \frac{k}{n}$, then $A$ is the $k$-th $n$-quantile.

2) Yes, only equally-sized intervals. Concerning your example, suppose $B$ equals $\frac{\sqrt{2}}{2}$. Then you would have a 2-quantile, but its part would exceed the remaining part, i.e. $\frac{\sqrt{2}}{2}\geq 1 - \frac{\sqrt{2}}{2}$, which contradicts the intervals being equally sized.

3) OK, so suppose you have a sample of $N$ datums. The $k$-th $n$-quantile is a value $A$ that is not less than a $\frac{k}{n}$ part of all data. In your case, that is $\lfloor \frac{k}{n} \cdot N \rfloor$ values that are less than or equal to $A$. Now line up all of your data in non-descending order, from the smallest to the biggest data point, and number them from $1$ to $N$. That is the so-called ranking procedure. If $N$ is odd, you just take the $\left(\lfloor \frac{N}{2} \rfloor +1\right)$-th datum; it is bigger than half of all data, and it is the so-called median. The same goes for the 50th percentile. So 50th percentile = 2nd quartile = median. That 50% value you're talking about is called the median, and this is the most valid way to call it (rather than calling it "half" or "50%"), unless I have misunderstood the question.
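For the odd-$N$ case, the ranking procedure can be sketched in a few lines. The sample values are made up for illustration, and the comparison with `statistics.median` is only a sanity check:

```python
import statistics

# Made-up sample with an odd number of datums (N = 5).
data = [7, 1, 5, 3, 9]

ordered = sorted(data)          # ranking: ranks 1..N in non-descending order
N = len(ordered)
median_rank = N // 2 + 1        # floor(N/2) + 1, as a 1-based rank
median = ordered[median_rank - 1]

# The value at that rank agrees with the library's median for odd N.
print(median_rank, median, median == statistics.median(data))
```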

5) Non-integer quantiles may be defined, but they make no sense in terms of the equally-sized definition. You can, however, convert them to proper quantiles: the $2.5$-th quartile means $\frac{2.5}{4}=\frac{5}{8}$, so it is the 5th 8-quantile.

I hope my answer makes these things clearer; if not, feel free to ask.

Thanks, this answer helps a lot. Regarding the fractions, I'm asking whether I can define percentiles in terms of fractions, and if so, what is that fraction called? e.g. if the value at the 90th percentile is called the "percentile", then what is the 0.9 called? (I'll also clarify the question.)
– Maian, Jun 20 '12 at 19:24

(Ugh, I keep pressing enter to create another paragraph for another question, but that instead finishes the edit, so I'm adding another comment instead.) Are you sure that "there's no quantile related definition of B that I would know"? I've been calling it "cumulative probability" for a while - is that actually valid? Likewise, for the list of datums case, what would the analogous k/n (= rank/N) be called?
– Maian, Jun 20 '12 at 19:35

I don't quite get the first question. What do you mean by "can a percentile be defined in terms of fractions"? In this case $0.9$ is just the value of $B$. If you have the 90th percentile (the 90th 100-quantile) or any other $k$-th $n$-quantile, the 0.9 comes from $\frac{k}{n}$; there is no other name for it. Either way, if you're asking whether a percentile can be defined by the fraction $0.9$, I would guess not, because $0.9$ also defines the 9th 10-quantile, the 90th 100-quantile, the 900th 1000-quantile and so on...
– user974514, Jun 20 '12 at 23:55

Regarding the second question: yes, it's called cumulative probability, that's correct; it's either the integral (for continuous data) or a sum (for discrete data). But I meant a definition that has the word "quantile" in it. $k/n$ doesn't have any meaning other than the one in my previous comment. If you have more questions, feel free to ask, but try checking the wiki page as well.
– user974514, Jun 21 '12 at 0:06

Ah, en.wikipedia.org/wiki/Quantile#Quantiles_of_a_population does state that the k/n could be replaced with a probability p. That "fraction" I'm talking about is the cumulative probability. One more question to verify: in either the continuous case (CDF) or the discrete case (a set of N sample datums), I can call this "fraction" (or the B value in the examples) the "cumulative probability", right?
– Maian, Jun 21 '12 at 4:58