Probability distributions are central to machine learning and data analysis. Data scientists, as well as researchers and scientists from many other fields, deal with probability distributions on a day-to-day basis. Put simply, a probability distribution is a function that tells us how likely each of the possible values of a random variable is.

For example, suppose you walk down the lane where you live and record the height of every building as you go along. What you are actually doing is taking random samples and building up an empirical probability distribution, which can be very useful going forward. It tells us which heights are more likely, how much the heights vary, and many other things. A probability distribution can be discrete or continuous.
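The building-height example can be sketched in a few lines of Python. The heights below are hypothetical values invented purely for illustration, and the variable names are assumptions rather than anything from a real survey. The sketch turns the recorded heights into relative frequencies and summarises them with a mean and a sample variance.

```python
import statistics
from collections import Counter

# Hypothetical building heights in metres, invented for illustration only.
heights = [12, 15, 12, 18, 15, 15, 21, 12, 18, 15]

# Relative frequency of each recorded height: an empirical distribution.
counts = Counter(heights)
empirical = {h: c / len(heights) for h, c in sorted(counts.items())}

most_likely = max(empirical, key=empirical.get)  # height seen most often
mean_height = statistics.fmean(heights)          # average height
variance = statistics.variance(heights)          # sample variance of heights

for height, prob in empirical.items():
    print(f"{height} m: {prob:.1f}")
```

With more recorded buildings, the relative frequencies would give a progressively better picture of which heights are likely on that lane.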

Put simply, a discrete probability distribution takes values from a countable set, while a continuous probability distribution takes values from a continuous range. Physicists, mathematicians, and engineers favour one continuous distribution in particular, widely known as the Gaussian distribution, which surfaces in our day-to-day life and in nature as well. The other name for the Gaussian distribution is the normal distribution. It is so named because it occurs so widely that it came to be regarded as the typical, or "normal", case.

**The Gaussian Distribution**

The normal distribution is also known to many as the bell curve. The Gaussian distribution is a two-parameter family of curves. Its probability density function is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

Here $\mu$ is the mean and $\sigma^2$ is the variance. The parameter $\mu$ determines the location of the distribution, while the standard deviation $\sigma$ determines the width of the bell curve. The normal distribution with mean 0 and standard deviation 1 is called the standard normal distribution. A random variable with the standard normal distribution is called a standard normal random variable and is denoted by $Z$.
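As a minimal sketch, the Gaussian density with mean mu and standard deviation sigma, f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)), can be evaluated directly in Python and cross-checked against `statistics.NormalDist` from the standard library. The function names `gaussian_pdf` and `standardize` and the parameter values are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x, written out from the formula."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def standardize(x, mu, sigma):
    """Map a value of X ~ N(mu, sigma^2) onto the standard normal scale Z."""
    return (x - mu) / sigma


# Hypothetical parameters, chosen only for illustration.
mu, sigma = 10.0, 2.0

peak = gaussian_pdf(mu, mu, sigma)            # the density is highest at the mean
library_peak = NormalDist(mu, sigma).pdf(mu)  # cross-check against the stdlib
z = standardize(14.0, mu, sigma)              # 14 lies two sigmas above the mean
```

Changing `mu` slides the curve left or right, while a larger `sigma` lowers the peak and widens the bell.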

**The Central Limit Theorem**

Technically speaking, the Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size gets larger. This is an astonishing and counterintuitive result. It holds regardless of the shape of the population distribution, provided the population has a finite variance. A common rule of thumb is that the approximation becomes good at a sample size of about 30 or more. Hence, when we repeatedly draw samples from the population and take their means, the distribution of those means looks more and more like a normal distribution as the samples get larger. This unexpected appearance of the Gaussian (normal) distribution is what makes it so special and explains why it shows up in so many phenomena.
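The theorem can be checked with a small simulation. The sketch below assumes an exponential population with rate 1 (mean 1, standard deviation 1), which is strongly skewed and clearly not bell-shaped, and draws many samples of size 30. The choice of population, the seed, and the helper name `sample_means` are all assumptions made for illustration.

```python
import math
import random
import statistics


def sample_means(n, trials, rng):
    """Draw `trials` samples of size `n` from Exp(1) and return their means."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 30
means = sample_means(n, 5000, rng)

mean_of_means = statistics.fmean(means)  # should be close to the population mean, 1
sd_of_means = statistics.stdev(means)    # should be close to 1 / sqrt(n)

# Share of sample means within one standard error of the population mean.
# For a normal distribution this share is about 0.68.
standard_error = 1.0 / math.sqrt(n)
within_one_se = sum(abs(m - 1.0) <= standard_error for m in means) / len(means)

print(mean_of_means, sd_of_means, within_one_se)
```

Plotting a histogram of `means` for increasing `n` is a natural follow-up: the skew of the exponential population fades and the bell shape emerges.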

Let us look at an application of the Central Limit Theorem. Suppose a man decides to travel through the desert and his car runs out of fuel. He dials the emergency number to contact government services. He happens to be at the edge of the cell range, so his voice is noisy and can't be heard clearly by the executive trying to help him on the other end of the call. It would be great if the executive could clean up the noise using the signals received by about 100 nearby towers.

The signals can be denoted by $X_1, \dots, X_{100}$, where $X_i = S + Y_i$.

Here $S$ is the true signal being sent to the towers, and $Y_i$ is the noise in the signal received at tower $i$.

Here we can assume that the noises $Y_1, \dots, Y_{100}$ are independent and identically distributed, with mean 0 and variance $\sigma^2$. We may also assume that the noise is normally distributed, although the Central Limit Theorem does not require this. The executive can clean up the signal by simply averaging:

$$\bar{X} = \frac{X_1 + \cdots + X_{100}}{100} = S + \frac{Y_1 + \cdots + Y_{100}}{100}$$

By the Central Limit Theorem,

$$\frac{Y_1 + \cdots + Y_{100}}{100} \ \text{is approximately}\ N\left(0, \frac{\sigma^2}{100}\right)$$

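Hence, by understanding the nature of the noise, we can reduce it considerably: averaging over 100 towers cuts the noise variance by a factor of 100, and so its standard deviation by a factor of 10.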
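The effect of averaging can be sketched with a short simulation. The true signal value, the noise standard deviation, and the seed below are hypothetical numbers chosen for illustration, and the noise is assumed to be Gaussian with mean 0. The sketch repeats the 100-tower average many times and compares the observed spread with the predicted one.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

S = 5.0        # hypothetical true signal value
sigma = 2.0    # hypothetical noise standard deviation at a single tower
towers = 100


def averaged_reading(rng):
    """One cleaned-up reading: the mean of S + Y_i over all towers."""
    return statistics.fmean(S + rng.gauss(0.0, sigma) for _ in range(towers))


# Repeat the experiment to see how much the averaged reading still fluctuates.
readings = [averaged_reading(rng) for _ in range(2000)]

observed_sd = statistics.stdev(readings)
predicted_sd = sigma / towers ** 0.5  # sqrt(sigma^2 / 100)

print(observed_sd, predicted_sd)
```

A single tower would report the signal with a spread of `sigma`, whereas the averaged reading fluctuates roughly ten times less.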

**Gaussian Distributions Can Be Used To Solve Common Problems**

As mentioned earlier, scientists in many fields use Gaussian distributions to solve commonly occurring scientific problems. In physics, the Gaussian is the distribution that maximises entropy for a given mean and variance, where the variance plays the role of a fixed average energy. This is why each velocity component of a particle in a bottle of gas at a certain temperature follows a Gaussian distribution.

There are many operations on Gaussians that give interesting results. For example:

- The Fourier transform of a Gaussian function is a Gaussian
- The sum of two independent Gaussian random variables is Gaussian
- The convolution of a Gaussian with another Gaussian is a Gaussian
- The product of two Gaussian density functions is, up to a scaling constant, a Gaussian
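Two of these properties can be checked numerically with `statistics.NormalDist`, which supports adding two independent normal variables. The sketch below uses hypothetical parameters; the helper `product_of_densities` is an assumption written for this illustration, based on the standard result that the product of two Gaussian densities is proportional to a Gaussian whose precision (inverse variance) is the sum of the two precisions.

```python
import math
from statistics import NormalDist

# Hypothetical parameters, chosen only for illustration.
a = NormalDist(mu=1.0, sigma=3.0)
b = NormalDist(mu=2.0, sigma=4.0)

# Sum of two independent Gaussian variables: means add and variances add.
total = a + b  # mean 1 + 2 = 3, sigma sqrt(9 + 16) = 5


def product_of_densities(p, q):
    """Gaussian that is proportional to p.pdf(x) * q.pdf(x)."""
    var = 1.0 / (1.0 / p.variance + 1.0 / q.variance)
    mu = var * (p.mean / p.variance + q.mean / q.variance)
    return NormalDist(mu, math.sqrt(var))


prod = product_of_densities(a, b)

# If the product really is a scaled Gaussian, this ratio is the same at every x.
ratios = [a.pdf(x) * b.pdf(x) / prod.pdf(x) for x in (-2.0, 0.0, 1.5, 4.0)]

print(total, prod, ratios)
```

The constant ratio is the scaling factor mentioned above: the product of two densities is Gaussian in shape but does not integrate to 1 until it is renormalised.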

In Fourier analysis, the Gaussian function is an eigenfunction of the Fourier transform, which means that the Fourier transform of a Gaussian is again a Gaussian. Outside physics and signal processing, the blood pressure of adult humans is also often modelled as approximately Gaussian.
