Simple explanation of p-value

Qingchuan Lyu
Oct 23, 2020

I just wanted to explain the p-value in a very simple way.

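Idea: the p-value is the smallest significance level at which you would reject the null hypothesis after seeing the data. Put another way, it is the probability, assuming the null hypothesis is true, of getting a result at least as extreme as the one you observed. You reject the null hypothesis if and only if the p-value is smaller than a pre-determined threshold (usually 0.05).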
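Here is a minimal sketch of that decision rule in Python. The function name reject_null and the p-value 0.03 are made up for illustration only; they are not from any library or from real data.

```python
def reject_null(p_value, alpha=0.05):
    """Return True when the p-value is below the pre-determined threshold alpha."""
    return p_value < alpha


# Hypothetical p-value, chosen only to show how the rule behaves.
p_value = 0.03

# The same p-value leads to different decisions at different thresholds,
# which is why the threshold has to be fixed before looking at the data.
for alpha in (0.01, 0.05, 0.10):
    print(f"alpha={alpha:.2f}: reject the null? {reject_null(p_value, alpha)}")
```

With a p-value of 0.03 you would reject at the 0.05 and 0.10 thresholds, but not at 0.01.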

Example: you want to test whether the average height of boys at your elementary school is greater than 5', using a sample of 550 boys. In this case, you have the following null and alternative hypotheses:

H_0: h = 5'; H_1: h > 5'

Then, you collect data and get 6'5" as the average height of the boys (pretty tall!). But does that mean you can reject the null hypothesis? You draw a t distribution centered at 5 with 550 - 1 = 549 degrees of freedom:

t-distribution with avg=5 and var=1, df=549

In this case, you check a t-table and find that the shaded area, which is the area to the right of your sample mean, is 0.15. That means your p-value is 0.15! This is much greater than your pre-determined 0.05 threshold. Taking another look, you see that a tail probability of 0.05 corresponds to a much more extreme value on the x-axis. Therefore, your sample mean of 6'5" isn't tall enough to let you reject the null hypothesis of mean = 5'!
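If you don't have a t-table handy, the shaded area can be computed numerically. The sketch below is only an illustration under a few assumptions: it works on the standardized scale (a t distribution centered at 0, not at 5), the helper names t_pdf and upper_tail_p_value are my own, and the t statistic of 1.04 is a hypothetical value picked so that the tail area comes out close to the 0.15 from the example. The example above doesn't give the sample standard deviation, so the real t statistic can't be computed from it.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def upper_tail_p_value(t_stat, df, steps=2000):
    """Area under the t density to the right of t_stat.

    The density is symmetric around 0, so the upper tail equals
    0.5 minus the area between 0 and t_stat. That area is computed
    with Simpson's rule (steps must be even).
    """
    h = t_stat / steps
    total = t_pdf(0.0, df) + t_pdf(t_stat, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 0.5 - area_zero_to_t


df = 550 - 1    # degrees of freedom from the example
t_stat = 1.04   # hypothetical t statistic, NOT derived from the example's data
alpha = 0.05    # pre-determined threshold

p_value = upper_tail_p_value(t_stat, df)
print(f"p-value: {p_value:.3f}")
print("reject the null" if p_value < alpha else "fail to reject the null")
```

Since the p-value here is well above 0.05, the script reports that you fail to reject the null hypothesis, matching the conclusion above.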

To understand where 0.05 comes from and why it is pre-determined, please watch for my next posts!

Qingchuan Lyu

Data Engineering, Causal Inference & Predictive Analysis