Sorry for a silly question, but it seems like only you can answer it.
What is the concept of a probability distribution, and what is the meaning behind this term? Why do we need a distribution function if we already have the pdf (probability density function) and the pmf (probability mass function)? What new information does the probability distribution provide us with? And why is its definition $P(X<a)$, with a less-than sign?

3 Answers

If $X$ is a random variable, then the (cumulative) distribution function $F_X(x)$ of $X$ is ordinarily defined by
$$F_X(x)=P(X\le x).$$

In the discrete case, if you know the probability mass function, you can compute the cumulative distribution function by adding. In the continuous case, if you know the density function, you can get the cumulative distribution function by integrating.
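As a small sketch of both directions (the die pmf, the exponential rate, and the integration step count are my own illustrative choices, not from the answer): summing a pmf gives the discrete c.d.f., and numerically integrating a pdf gives the continuous one.

```python
import math

# Discrete case: pmf of a fair six-sided die.
pmf = {k: 1 / 6 for k in range(1, 7)}

def cdf_discrete(x):
    """F(x) = P(X <= x), obtained by summing the pmf."""
    return sum(p for k, p in pmf.items() if k <= x)

# Continuous case: pdf of an Exponential(rate = 2) variable.
rate = 2.0

def pdf(t):
    return rate * math.exp(-rate * t) if t >= 0 else 0.0

def cdf_continuous(x, steps=100_000):
    """F(x) = integral of the pdf up to x (midpoint rule on [0, x])."""
    if x <= 0:
        return 0.0
    h = x / steps
    return sum(pdf((i + 0.5) * h) for i in range(steps)) * h
```

For the exponential, the numeric integral can be checked against the closed form $F(x) = 1 - e^{-2x}$.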

In more advanced work, generalizations of the cumulative distribution function are more natural than generalizations of the probability mass function or the density function. Indeed, in some situations, a suitable generalization of these is impossible. But I will try to give a partial answer to your question in more or less familiar terms, without dragging in advanced matters.

Suppose, for example, that we are trying to make a probability model of the distribution of weights of people, or of the waiting time between consecutive buses. It is common and natural to use a continuous distribution as a model.

We are essentially never interested in the probability that the weight of a person is exactly $p$ pounds, where $p$ is a real number, like $50\pi$. For one thing, according to the model, this probability is $0$!

What we are typically interested in is the probability that the weight $W$ lies in a certain range, say greater than $200$ pounds, or between $140$ and $160$ pounds. That is information that can be easily obtained from the cumulative distribution function.
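As a sketch of this, assuming a hypothetical Normal$(170, 30)$ model for weight in pounds (the parameters are illustrative, not from the answer), both range probabilities come straight from the c.d.f.:

```python
import math

def normal_cdf(x, mu=170.0, sigma=30.0):
    """C.d.f. of an assumed Normal(mu, sigma) weight model, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# P(W > 200) = 1 - F(200)
p_over_200 = 1.0 - normal_cdf(200)

# P(140 < W <= 160) = F(160) - F(140)
p_140_160 = normal_cdf(160) - normal_cdf(140)
```

Note that neither probability requires evaluating or integrating the density by hand; two c.d.f. evaluations suffice.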

Even in the discrete case, we are often not all that interested, at the practical level, in the probability mass function. For instance, we are seldom interested in the probability that out of a population of $1000$, exactly $339$ are in favour of a certain candidate. We are more typically interested in the probability that a random variable lies in a certain range. The cumulative distribution function enables us to calculate this easily.
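A sketch of the discrete case, with an assumed Binomial$(1000, 0.35)$ model for the poll (the success probability $0.35$ is hypothetical): the probability of exactly $339$ is tiny, while the probability of a whole range is the kind of quantity we actually want, and it falls out of the c.d.f.

```python
import math

n, p = 1000, 0.35  # hypothetical: each of 1000 people favours the candidate w.p. 0.35

def binom_pmf(k):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k):
    """F(k) = P(X <= k), by summing the pmf."""
    return sum(binom_pmf(j) for j in range(k + 1))

# P(exactly 339 in favour): small and rarely of direct interest
p_exact = binom_pmf(339)

# P(330 <= X <= 370) = F(370) - F(329): the kind of range probability we care about
p_range = binom_cdf(370) - binom_cdf(329)
```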

In later calculations, you will be interested, for example, in the distribution of a sum of two or more random variables. The cumulative distribution function will be very useful in this work.
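As a sketch of this point (the two-dice setup is my own illustration, not from the answer): the pmf of a sum of independent discrete random variables is the convolution of their pmfs, and the c.d.f. of the sum then follows by summing.

```python
# Distribution of the sum of two independent fair dice, via convolution of pmfs.
pmf_die = {k: 1 / 6 for k in range(1, 7)}

pmf_sum = {}
for a, pa in pmf_die.items():
    for b, pb in pmf_die.items():
        pmf_sum[a + b] = pmf_sum.get(a + b, 0.0) + pa * pb

def cdf_sum(x):
    """P(X1 + X2 <= x) for the sum of the two dice."""
    return sum(p for s, p in pmf_sum.items() if s <= x)
```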

Finally, let's make an analogy from basic physics. If we know the acceleration at all times, and the initial velocity, we can in principle compute the velocity at all times. So, one might ask, what's the point of having the concept of velocity? The analogy
is actually fairly close, since if we know the density function, then in principle we know the cumulative distribution function, and part of your question kind of asks what's the point of having the concept of cumulative distribution function.

One answer, for velocity and also for the cumulative distribution function, is that these notions are important both conceptually and at the practical level.

For a real-valued random variable $X$, let $P_X$ denote the probability distribution of $X$, meaning that $P_X (B) = P(X \in B)$, for any Borel set $B$ of $\mathbb{R}$. Particularly important is the case when $B$ is of the form $B = (-\infty, x]$, $x \in \mathbb{R}$; then $P_X (B) = P(X \in (-\infty,x]) = P(X \leq x)$. The function $F_X$ defined by $F_X (x) = P(X \leq x)$, $x \in \mathbb{R}$, is called the distribution function of $X$. Regarding your last question, it is important to note that the distribution function of $X$ is sometimes defined as the probability of $X < x$, but this is less common.

It is clear from the definitions that any random variable has a probability distribution and a distribution function. A random variable $X$ has a probability density function (pdf) if and only if $X$ is absolutely continuous; $X$ has a probability mass function (pmf) if and only if $X$ is discrete. However, there are random variables which are neither absolutely continuous nor discrete. As a simple example, let $X$ be defined as follows: $X=U$ with probability $1/2$, where $U$ is a uniform$(0,1)$ random variable, and $X=1$ with probability $1/2$. Moreover, there are even continuous random variables which are not absolutely continuous (continuous singular random variables).

For $X$ which is neither absolutely continuous nor discrete, the probability distribution and distribution function of $X$ are clearly essential. For example, the expectation of such an $X$ can be written in terms of $P_X$ as ${\rm E}(X) = \int_\mathbb{R} {xP_X (dx)}$ and in terms of $F_X$ as ${\rm E}(X) = \int_{ - \infty }^\infty {x\,dF_X(x)}$ (provided that the integrals converge; these formulas hold for any random variable $X$). The formula ${\rm E}(X) = \int_{ - \infty }^\infty {x f_X (x)}\,dx$, where $f_X$ is the pdf of $X$, applies only to absolutely continuous random variables, and the formula ${\rm E}(X) = \sum\nolimits_k {x_k p_k }$, where $p_k = P(X=x_k)$, applies only to discrete random variables.
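For the mixed example above, the Stieltjes integral $\int x\,dF_X(x)$ splits into a continuous part (density $1/2$ on $(0,1)$) plus an atom of mass $1/2$ at $1$. A small sketch of that computation (the Monte Carlo check is my own addition):

```python
import random

random.seed(0)

# X = U (uniform on (0,1)) with probability 1/2, X = 1 with probability 1/2:
# neither discrete nor absolutely continuous.
def sample_x():
    return random.random() if random.random() < 0.5 else 1.0

# E(X) = integral of x dF(x): continuous part plus the atom at x = 1.
continuous_part = 0.5 * 0.5  # (1/2) * E(U) = (1/2) * (1/2)
atom_part = 0.5 * 1.0        # mass 1/2 sitting at the point 1
expected = continuous_part + atom_part  # = 3/4

# Sanity check by simulation.
monte_carlo = sum(sample_x() for _ in range(100_000)) / 100_000
```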

There are probability distributions that have neither a p.d.f. nor a p.m.f. The easiest example of this would be a random variable that has both discrete and continuous behaviour simultaneously, such as a random variable with the rectified Gaussian distribution. However, every probability distribution on $\mathbb{R}$ does have a c.d.f., and every c.d.f. uniquely determines a probability distribution.
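A sketch of the rectified-Gaussian example, taking $X = \max(0, Z)$ with $Z$ standard normal (this particular rectification is my choice of concrete instance): the c.d.f. has a jump of size $1/2$ at $0$, so there is no pdf, yet the continuous part on $(0,\infty)$ rules out a pmf; the c.d.f. itself is perfectly well defined.

```python
import math

def phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def rectified_cdf(x):
    """C.d.f. of X = max(0, Z), Z ~ N(0,1): an atom at 0 plus a continuous
    part on (0, inf), so X has neither a pdf nor a pmf."""
    return 0.0 if x < 0 else phi(x)

# The jump of the c.d.f. at 0 is exactly P(X = 0) = P(Z <= 0) = 1/2.
mass_at_zero = rectified_cdf(0) - rectified_cdf(-1e-12)
```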

Let's make this precise. First, we should establish some properties common to all cumulative distribution functions. Let $X$ be a real-valued random variable. Its c.d.f. is the function $F : \mathbb{R} \to [0, 1]$ given by
$$F(x) = \mathbb{P}[X \le x]$$
By the axioms for a probability measure, it is immediate that $F$ is a non-decreasing function with
$$\begin{align} \lim_{x \searrow -\infty} F(x) & = 0 & \lim_{x \nearrow +\infty} F(x) & = 1 \end{align}$$
It is also straightforward to show that $\mathbb{P}[a < X \le b] = F(b) - F(a)$. It is clear that $\lim_{b \searrow a} \mathbb{P}[a < X \le b] = 0$ while $\lim_{a \nearrow b} \mathbb{P}[a < X \le b] = \mathbb{P}[X = b]$, so we conclude that
$$\begin{align} \lim_{x \searrow a} F(x) & = F(a) & \lim_{x \nearrow b} F(x) & \text{exists} \end{align}$$
In other words, $F$ is right-continuous with left limits everywhere. Such a function is called càdlàg (which, despite appearances, is French, and short for continue à droite, limite à gauche). Conversely, every non-decreasing càdlàg function $F : \mathbb{R} \to [0, 1]$ with the above limits at $\pm\infty$ is the c.d.f. of some real-valued random variable.

Proof. For convenience, let us extend $F$ to be a function $\mathbb{R} \cup \{ -\infty \} \to [0, 1]$, with $F(-\infty) = 0$. Then, by the adjoint functor theorem, there is a function $G : [0, 1] \to \mathbb{R} \cup \{ -\infty \}$ which is left adjoint to $F$, that is, for all $z \in [0, 1]$ and $x \in \mathbb{R} \cup \{ -\infty \}$,
$$G(z) \le x \text{ if and only if } z \le F(x)$$
and $G$ is a non-decreasing left-continuous function. [Exercise: Prove this directly without using a sledgehammer.]

Now let $Z$ be a random variable uniformly distributed on $[0, 1]$, and set $X = G(Z)$. This is well-defined, since $G$ is measurable. We then have
$$\begin{align*} \mathbb{P}[X \le x] & = \mathbb{P}[G(Z) \le x] \\
& = \mathbb{P}[Z \le F(x)] = F(x) \end{align*}$$
In other words, the c.d.f. of $X$ is precisely $F$, as required. (Note that $\mathbb{P}[X = -\infty] = 0$, so $X$ really is a real-valued random variable.)
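This construction is exactly inverse transform sampling. A sketch with an Exponential$(2)$ target (the rate and sample size are my own choices); here $F$ is continuous and strictly increasing on $(0,\infty)$, so the adjoint $G$ is simply $F^{-1}$:

```python
import math
import random

random.seed(1)

rate = 2.0

def F(x):
    """Target c.d.f.: Exponential(rate)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def G(z):
    """Generalised inverse (quantile function) of F; here just F^{-1}."""
    return -math.log(1.0 - z) / rate

# X = G(Z) with Z uniform on [0, 1) has c.d.f. exactly F.
samples = [G(random.random()) for _ in range(100_000)]

# Empirical estimate of F(1) from the samples.
empirical = sum(1 for s in samples if s <= 1.0) / len(samples)
```

Since `random.random()` returns values in $[0, 1)$, the logarithm is always defined.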