Decision making and storytelling are two important facets of a data scientist's job. Models can be tweaked and computational power can be scaled up, but the choice of a particular test or method has lasting implications for the product lifecycle. From cost-cutting to life-saving applications, hypothesis testing is pervasive in statistics, and with the rise of statistical machine learning these tests have become more accessible through Python's ever-growing collection of task-specific libraries.

Statistical tests are commonly classified as parametric or non-parametric. Parametric tests are conducted under the assumption that the data follows a Gaussian distribution. If this assumption fails, non-parametric tests are used for hypothesis testing instead.

Here we list a few widely used statistical tests (parametric and non-parametric) available in Python:

### Chi-Squared Test

The chi-squared test is well known even to those just starting out with statistical machine learning. It is used to check whether two categorical variables are related or independent, under the assumption that the observations used to build the contingency table are themselves independent.

Python Code

```python
from scipy.stats import chi2_contingency

table = ...
stat, p, dof, expected = chi2_contingency(table)
```
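Filling in the placeholder with a made-up contingency table gives a runnable sketch; the counts and the 0.05 significance threshold below are illustrative assumptions, not from the original:

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x3 contingency table: counts of two groups across three categories
table = [[10, 20, 30],
         [6, 9, 17]]
stat, p, dof, expected = chi2_contingency(table)
print(f"stat={stat:.3f}, p={p:.3f}, dof={dof}")
if p > 0.05:
    print("Fail to reject H0: the variables appear independent")
else:
    print("Reject H0: the variables appear related")
```

Note that `dof` is `(rows - 1) * (cols - 1)`, here `(2 - 1) * (3 - 1) = 2`.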

### Student’s t-test

Tests whether the means of two independent samples are significantly different.

Observations in each sample are independent and identically distributed (iid). Observations in each sample are normally distributed. Observations in each sample have the same variance.

Python Code

```python
from scipy.stats import ttest_ind

data1, data2 = ...
stat, p = ttest_ind(data1, data2)
```
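For instance, with two small made-up samples whose means clearly differ (the values are illustrative, not real data):

```python
from scipy.stats import ttest_ind

# Illustrative samples; sample 1 centered near 5.5, sample 2 near 4.3
data1 = [5.1, 4.9, 6.2, 5.8, 5.5, 5.0, 5.7]
data2 = [4.2, 4.0, 4.8, 4.1, 4.5, 3.9, 4.4]
stat, p = ttest_ind(data1, data2)
print(f"stat={stat:.3f}, p={p:.4f}")
```

A small p-value here suggests the two sample means differ significantly.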

### Analysis of Variance Test (ANOVA)

ANOVA is another widely used test; it checks whether the means of two or more independent samples differ significantly. Like the t-test, it assumes the observations in each sample are normally distributed with equal variance.

Python Code

```python
from scipy.stats import f_oneway

data1, data2, ... = ...
stat, p = f_oneway(data1, data2, ...)
```
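A runnable sketch with three made-up groups whose means are well separated (all values are illustrative assumptions):

```python
from scipy.stats import f_oneway

# Three illustrative groups with clearly different means
data1 = [6.1, 5.8, 6.4, 6.0, 5.9]
data2 = [5.2, 5.5, 5.0, 5.3, 5.1]
data3 = [4.4, 4.6, 4.2, 4.5, 4.3]
stat, p = f_oneway(data1, data2, data3)
print(f"F={stat:.3f}, p={p:.4f}")
```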

### Shapiro-Wilk Test

This test is used to check whether the sample data has a Gaussian distribution.

Python Code

```python
from scipy.stats import shapiro

data = ...
stat, p = shapiro(data)
```
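With a small made-up sample (values are illustrative), the call runs end to end; a large p-value means the test fails to reject normality:

```python
from scipy.stats import shapiro

# Illustrative, roughly symmetric sample around 5.0
data = [4.8, 5.1, 5.3, 4.9, 5.0, 5.2, 4.7, 5.4, 5.1, 4.95]
stat, p = shapiro(data)
print(f"W={stat:.3f}, p={p:.3f}")
```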

### D’Agostino’s K^2 Test

Similar to the Shapiro-Wilk test, this too checks whether a data sample has a Gaussian distribution, using the sample's skewness and kurtosis.

Python Code

```python
from scipy.stats import normaltest

data = ...
stat, p = normaltest(data)
```
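A runnable sketch with made-up values; note that `normaltest` expects a reasonably sized sample (SciPy warns below 20 observations for the kurtosis component), so 20 illustrative points are used here:

```python
from scipy.stats import normaltest

# 20 illustrative values clustered around 5.0
data = [5.0, 4.8, 5.2, 5.1, 4.9, 5.3, 4.7, 5.0, 5.1, 4.95,
        5.05, 4.85, 5.15, 4.9, 5.1, 5.0, 4.8, 5.2, 5.05, 4.9]
stat, p = normaltest(data)
print(f"K^2={stat:.3f}, p={p:.3f}")
```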

### Pearson’s Correlation Coefficient

A statistical test that checks whether two samples are correlated, i.e. whether they have a linear relationship.

Python Code

```python
from scipy.stats import pearsonr

data1, data2 = ...
corr, p = pearsonr(data1, data2)
```
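With made-up samples where the second is roughly twice the first (an illustrative assumption), the coefficient comes out close to +1:

```python
from scipy.stats import pearsonr

data1 = [1, 2, 3, 4, 5, 6, 7]
data2 = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.2]  # roughly 2 * data1, with noise
corr, p = pearsonr(data1, data2)
print(f"corr={corr:.3f}, p={p:.4f}")
```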

### Spearman’s Rank Correlation

This test assumes the observations in each sample can be ranked; it checks whether the relationship between the two samples is monotonic.

Python Code

```python
from scipy.stats import spearmanr

data1, data2 = ...
corr, p = spearmanr(data1, data2)
```
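A made-up example that highlights the difference from Pearson: the relationship below is non-linear but perfectly monotonic, so the rank correlation is at its maximum:

```python
from scipy.stats import spearmanr

data1 = [1, 2, 3, 4, 5, 6, 7]
data2 = [1, 4, 9, 16, 25, 36, 49]  # monotonic but non-linear (squares)
corr, p = spearmanr(data1, data2)
print(f"corr={corr:.3f}, p={p:.4f}")
```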

### Mann-Whitney U Test

A non-parametric hypothesis test that checks whether the distributions of two independent samples are equal.

Python Code

```python
from scipy.stats import mannwhitneyu

data1, data2 = ...
stat, p = mannwhitneyu(data1, data2)
```
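A runnable sketch on made-up samples; `alternative="two-sided"` is passed explicitly since the default has varied across SciPy versions:

```python
from scipy.stats import mannwhitneyu

# Illustrative samples with little overlap
data1 = [5.1, 4.9, 6.2, 5.8, 5.5, 5.0, 5.7]
data2 = [4.2, 4.0, 4.8, 4.1, 4.5, 3.9, 4.4]
stat, p = mannwhitneyu(data1, data2, alternative="two-sided")
print(f"U={stat:.1f}, p={p:.4f}")
```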

### Kruskal-Wallis H Test

Like the Mann-Whitney U test, the Kruskal-Wallis test assumes only that the observations in each sample can be ranked and that the samples are independent; it checks whether the distributions of two or more independent samples are equal.

Python Code

```python
from scipy.stats import kruskal

data1, data2, ... = ...
stat, p = kruskal(data1, data2, ...)
```
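With three made-up, well-separated groups (the same illustrative values as in the ANOVA sketch, since Kruskal-Wallis is its non-parametric counterpart):

```python
from scipy.stats import kruskal

# Three illustrative groups with clearly different centers
data1 = [6.1, 5.8, 6.4, 6.0, 5.9]
data2 = [5.2, 5.5, 5.0, 5.3, 5.1]
data3 = [4.4, 4.6, 4.2, 4.5, 4.3]
stat, p = kruskal(data1, data2, data3)
print(f"H={stat:.3f}, p={p:.4f}")
```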

### Friedman Test

Friedman test checks whether the distributions of two or more paired samples are equal or not.

Python Code

```python
from scipy.stats import friedmanchisquare

data1, data2, ... = ...
stat, p = friedmanchisquare(data1, data2, ...)
```
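A runnable sketch with three made-up paired samples (e.g. three repeated measurements on the same six subjects); `friedmanchisquare` requires at least three samples of equal length:

```python
from scipy.stats import friedmanchisquare

# Illustrative paired measurements on the same 6 subjects,
# increasing consistently from data1 to data3
data1 = [4.0, 5.0, 3.5, 4.5, 5.5, 4.2]
data2 = [5.0, 6.1, 4.0, 5.2, 6.0, 5.1]
data3 = [6.2, 7.0, 5.1, 6.3, 7.2, 6.0]
stat, p = friedmanchisquare(data1, data2, data3)
print(f"stat={stat:.3f}, p={p:.4f}")
```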

### Conclusion

The probability of rejecting the null hypothesis is a function of five factors: whether the test is one- or two-tailed, the level of significance, the standard deviation, the amount of deviation from the null hypothesis, and the number of observations. That said, statistical tests are also subject to criticism. For instance, interpreting p-values under multiple comparisons is tricky, because p-values depend both on the data observed and on data that might have been observed but wasn't. A statistician, analyst, or data scientist should therefore remember that statistical significance does not imply practical significance, and correlation does not imply causation. Every test is only a means to an end, and that end is often vague.