Statistics - Basic Concepts

Bayes' theorem

Bayes' theorem is a fundamental concept in statistics. It may seem trivial once you understand it, but it underpins solutions to many complex problems in various domains, including Machine Learning, so it is worth grasping well. Bayes' theorem is used to calculate conditional probability: the probability of an event 'A' occurring given that a related event 'B' has already occurred, written P(A|B). For example, suppose a clinic wants to assess the risk of cancer among its patients. In formal statistical terms, let A represent the event "Person has cancer" and let B represent the event "Person is a smoker". The clinic wishes to calculate the probability that a patient who smokes has cancer. To do so, use Bayes' theorem (also known as Bayes' rule), which is as follows:

P(A|B) = P(B|A) * P(A) / P(B)

That is, the probability of A given B equals the probability of B given A, multiplied by the probability of A and divided by the probability of B.
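
The rule follows directly from the definition of conditional probability. A short derivation is sketched here, writing P(A ∩ B) for the probability that both events occur (notation introduced only for this sketch) and assuming P(A) > 0 and P(B) > 0:

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)}, \qquad P(B \mid A) = \frac{P(A \cap B)}{P(A)} \\
\Rightarrow\quad P(A \cap B) &= P(B \mid A)\,P(A) \\
\Rightarrow\quad P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)}
\end{align*}
```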

In our case, we can use this to find the probability that a smoker has cancer. From the data available in the clinic, one can estimate the probability that a patient has cancer, P(A), and the probability that a patient is a smoker, P(B). From the data, we also know P(B|A), that is, the probability that a cancer patient is a smoker. With these three values, Bayes' theorem gives P(A|B), the probability that a smoker has cancer.
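
To make the calculation concrete, here is a minimal Python sketch of the clinic example. The three input probabilities are hypothetical figures invented purely for illustration (the text gives no actual numbers), and the function name `bayes_posterior` is likewise just a label chosen for this sketch.

```python
def bayes_posterior(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if not 0 < p_b <= 1:
        raise ValueError("P(B) must be in the interval (0, 1]")
    return p_b_given_a * p_a / p_b


# Hypothetical clinic figures (assumptions, not real data)
p_cancer = 0.05               # P(A): a patient has cancer
p_smoker = 0.20               # P(B): a patient is a smoker
p_smoker_given_cancer = 0.60  # P(B|A): a cancer patient is a smoker

# P(A|B): the probability that a smoker has cancer
p_cancer_given_smoker = bayes_posterior(p_smoker_given_cancer, p_cancer, p_smoker)
print(f"P(cancer | smoker) = {p_cancer_given_smoker:.2f}")  # prints 0.15
```

With these made-up inputs, a smoker's probability of having cancer (0.15) is three times the overall rate of 0.05, which shows how knowing B changes the estimate for A.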

This is the essence of Bayes' theorem. As mentioned, it seems simple once it is understood, but the principle is the basis for many other results in statistics, Machine Learning and Data Analytics.


Binomial Distribution

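The binomial distribution gives the probability of each possible number of successes in an experiment with a defined number of trials. It is relevant only for discrete random variables (random variables that take distinct, countable values, not a continuous range). The criteria are as follows: the number of trials is fixed in advance; each trial has only two possible outcomes, success or failure; each trial is independent of the others; and the probability of success is the same in every trial.

All the above should be satisfied for a distribution to qualify as binomial.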

For example, consider the number of heads obtained in 20 tosses of a coin. This meets all the above criteria. The number of tosses is fixed. Each toss has only two possible outcomes: heads (success) or tails (failure). The probability of heads is the same on every toss. And each toss is independent: succeeding or failing once does not affect the next attempt in any way.
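
The coin example can be worked through numerically. The sketch below uses the standard binomial probability formula, C(n, k) * p^k * (1 - p)^(n - k), which the text above does not state explicitly. It also assumes a fair coin (p = 0.5), and `binomial_pmf` is a name chosen here for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_tosses = 20
p_heads = 0.5  # assumption: the coin is fair

# Probability of exactly 10 heads in 20 tosses
p_exactly_10 = binomial_pmf(10, n_tosses, p_heads)
print(f"P(exactly 10 heads in 20 tosses) = {p_exactly_10:.4f}")  # prints 0.1762

# The probabilities over all possible head counts (0 to 20) add up to 1
total = sum(binomial_pmf(k, n_tosses, p_heads) for k in range(n_tosses + 1))
print(f"Sum over all outcomes = {total:.4f}")
```

Even the most likely single outcome, 10 heads, occurs less than 18% of the time, because the probability is spread across all 21 possible head counts.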