Sampling Distribution of the Mean
The sampling distribution of the mean is the distribution of sample means: take many, many samples of the same size, find the mean of each sample, and look at how those many sample means are distributed. It is not the same as the population distribution.
Unrealistic case
• Know the true population parameters (population mean or population proportion)
• Take many, many samples
• The question is: Where do we think the statistics (the sample mean or sample proportion) will fall relative to the population parameters?
Realistic case
• Do not know the parameter
• Take one sample
• The question is: Where do we think the population parameters are relative to the statistics?
The unrealistic case provides the theory for the realistic case.
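The unrealistic case is easy to act out on a computer. The sketch below is only an illustration: the population (exponential with mean 1), the sample size of 40, and the 5000 repetitions are all assumed values, not part of the notes.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed population: exponential with rate 1, so the true mean mu is 1.
MU = 1.0
N = 40              # size of each sample (assumed)
NUM_SAMPLES = 5000  # "many, many" samples (assumed)


def sample_mean(n):
    """Draw one sample of size n from the population and return its mean."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])


# One sample mean per simulated sample.
sample_means = [sample_mean(N) for _ in range(NUM_SAMPLES)]

# Where do the statistics fall relative to the parameter?
center = statistics.fmean(sample_means)
share_within = sum(abs(m - MU) < 0.25 for m in sample_means) / NUM_SAMPLES

print(f"average of the sample means: {center:.3f} (population mean {MU})")
print(f"share of sample means within 0.25 of mu: {share_within:.1%}")
```

The sample means pile up around the population mean, which is the behavior we rely on when we have only one sample.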
THE MOST IMPORTANT OF ALL STATISTICAL THEOREMS
The Central Limit Theorem:
No matter what the population distribution looks like (as long as it has a finite standard deviation), as the sample size increases the distribution of the sample mean gets closer to a normal distribution.
Implication: If n is large we can use the normal distribution to find the probabilities of the sample mean falling in certain intervals.
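As a minimal sketch of this implication, the function below treats the sample mean as normal with mean μ and standard deviation σ/√n and returns the probability that it falls in an interval. The function name and the numbers (μ = 100, σ = 15, n = 36) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_between(low, high, mu, sigma, n):
    """Normal approximation to P(low < sample mean < high) for large n."""
    # The sample mean is centered at mu with standard deviation sigma / sqrt(n).
    sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)


# Hypothetical numbers: population mean 100, standard deviation 15, n = 36.
p = prob_sample_mean_between(97, 103, mu=100, sigma=15, n=36)
print(f"P(97 < sample mean < 103) is about {p:.2f}")
```

Here σ/√n = 15/6 = 2.5, so the interval runs from 1.2 standard deviations below the mean to 1.2 above it, giving a probability of about 0.77.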
When is n large?
It depends on how close the population distribution itself is to the normal distribution.
If the population distribution is normal, you can use the normal distribution for the sample mean for any sample size, including n=1.
If the population is not normal, it depends on how skewed it is: the more skewed it is, the larger n must be.
Rule of thumb: n ≥ 30 is large.
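One way to see how sample size interacts with skewness is to simulate sample means from a skewed population and measure how lopsided their distribution still is. This is a rough sketch under assumed settings: an exponential population, 10000 repetitions per sample size, and skewness measured as the average cubed z-score (0 for a perfectly symmetric shape).

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable


def skewness(values):
    """Skewness as the average cubed z-score (0 for a symmetric shape)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


def sample_means(n, num_samples=10000):
    """Means of num_samples samples of size n from a skewed population."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


# Skewness of the sample means for several sample sizes (assumed choices).
skew_by_n = {n: skewness(sample_means(n)) for n in (1, 5, 30, 100)}

for n, skew in skew_by_n.items():
    print(f"n = {n:3d}: skewness of the sample means = {skew:.2f}")
```

The skewness shrinks toward 0 as n grows, but for a population this skewed it has not vanished at n = 30, which is why the rule of thumb is only a rule of thumb.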
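Bottom line: For large n, sample means are approximately normally distributed with mean μ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, where σ is the population standard deviation and n is the sample size.
$\sigma_{\bar{x}}$ is the standard deviation of the sample means; it is also called the standard error.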
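The standard error, σ/√n (the population standard deviation divided by the square root of the sample size), can be checked against a simulation. The sketch below assumes a normal population with mean 50 and standard deviation 10; these values, the sample sizes, and the function names are illustrative only.

```python
import random
import statistics
from math import sqrt

random.seed(2)  # fixed seed so the sketch is repeatable

MU, SIGMA = 50.0, 10.0  # assumed population mean and standard deviation


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def simulated_sd_of_means(n, num_samples=5000):
    """Standard deviation of num_samples simulated sample means of size n."""
    means = [
        statistics.fmean([random.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return statistics.stdev(means)


# Compare the formula with the simulation for a few sample sizes.
results = {
    n: (standard_error(SIGMA, n), simulated_sd_of_means(n))
    for n in (4, 25, 100)
}

for n, (theory, simulated) in results.items():
    print(f"n = {n:3d}: sigma/sqrt(n) = {theory:.2f}, simulated = {simulated:.2f}")
```

The two columns should agree closely, and the standard error falls as n grows: quadrupling the sample size cuts it in half.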