Means and medians

Andy Wills

Median

data
[1]  1  2  2  3  3  3  4  4 50
  • The median is the middle number when the data are put in order (here, the fifth of the nine values).
median(data)
[1] 3
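
As a rough sketch of what median() is doing for this sample, the same answer can be found by sorting the values and picking out the middle one. The names sorted, n and middle are illustrative only, and the position formula assumes an odd sample size (for an even sample size, median() averages the two middle values).

```r
# Same nine values as shown on the slide
data <- c(1, 2, 2, 3, 3, 3, 4, 4, 50)

sorted <- sort(data)      # put the values in order
n <- length(sorted)       # sample size (9)
middle <- (n + 1) / 2     # position of the middle value for odd n (5)

sorted[middle]            # 3, the same as median(data)
```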

Mean

data
[1]  1  2  2  3  3  3  4  4 50
  • The mean is the sum of all the numbers (72), divided by the sample size (9), giving:
mean(data)
[1] 8
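
The arithmetic behind mean() can be checked step by step. This is a minimal sketch; total and n are illustrative names, not part of the original slides.

```r
# Same nine values as shown on the slide
data <- c(1, 2, 2, 3, 3, 3, 4, 4, 50)

total <- sum(data)    # sum of all the numbers: 72
n <- length(data)     # sample size: 9

total / n             # 8, the same as mean(data)
```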

Comparison

data
[1]  1  2  2  3  3  3  4  4 50
  • In this case, the mean (8) is bigger than nearly all the numbers. This means it's not very representative of its sample.

  • The mean tends to be unrepresentative when a few numbers are very high (or very low) compared to the rest.

  • In these cases, we say the distribution is skewed.
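
One way to see how much a single extreme value matters is to drop it and recompute both averages. This is only an illustration of the point above, not a recommendation to delete data; without_outlier is a made-up name.

```r
# Same nine values as shown on the slide
data <- c(1, 2, 2, 3, 3, 3, 4, 4, 50)

sum(data < mean(data))    # 8 of the 9 values lie below the mean

# Remove the one very high value (50) and compare
without_outlier <- data[data != 50]

mean(data)                # 8
mean(without_outlier)     # 2.75: the mean changes a lot
median(data)              # 3
median(without_outlier)   # 3: the median does not change
```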

Skewness

[Plot of the income data]

  • Our income data is skewed, so in this case the median gives a better indication of a typical salary than the mean.
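
The income data itself is not reproduced here, so the sketch below uses simulated values instead. Everything in it is an assumption made for illustration: the name income, the sample size, and the choice of a log-normal distribution (picked only because it produces a long right-hand tail).

```r
# Hypothetical, simulated incomes: NOT the income data from the slides
set.seed(1)                                   # make the simulation repeatable
income <- rlnorm(1000, meanlog = 10, sdlog = 1)

hist(income, breaks = 50)   # most values are low, with a long tail of high ones

mean(income)     # pulled upwards by the few very high incomes
median(income)   # stays closer to what most people earn
```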