Hypothesis Testing


Confidence intervals are used to estimate parameters with a margin of error. Hypothesis tests are used to evaluate the evidence regarding a prior idea about those parameters.

 

Hypotheses     Statements about population parameters. They must be specified before the data is collected (they are the reason the data is being collected). Hypotheses come in pairs...


Null hypothesis


-“no effect” or “no difference”, i.e. the difference that is observed is due to random error.

- usually the “current state” and the statement the researcher is trying to disprove.




- we look for evidence against the null hypothesis; it is “innocent until proven guilty”


-denoted H0

-always has an = sign



Alternative hypothesis


- the observed difference is “real”, i.e. it is larger than we would expect to see due to random error

- usually the main one in the sense that it is what the researcher wants to show (quality control is the exception where it is what the researcher wants to test, but hopes is not true)


- cannot be proven true with 100% certainty, but when we find evidence against the null this supports the alternative


- denoted Ha

- can be “one-sided” with a < or > sign, or “two-sided” with a ≠ sign


Test statistic   The test statistic is the sample statistic (e.g. the sample mean or sample percentage) converted to standard units. We are able to do this because the central limit theorem tells us that, for large enough samples, the sample statistics are approximately normally distributed around the true parameter.

 

The test statistic is found assuming the null hypothesis is true, i.e. assuming the value in the null hypothesis is the true parameter.
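As a minimal sketch of "converting to standard units", the following assumes a z statistic: the sample statistic minus the null value, divided by the standard error. The function names and the numbers are hypothetical, chosen only for illustration.

```python
import math

def z_statistic_mean(sample_mean, null_mean, sd, n):
    """Sample mean in standard units, assuming the null value is the true mean."""
    standard_error = sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error

def z_statistic_proportion(sample_prop, null_prop, n):
    """Sample proportion in standard units; the standard error uses the null value."""
    standard_error = math.sqrt(null_prop * (1 - null_prop) / n)
    return (sample_prop - null_prop) / standard_error

# Hypothetical data: H0 says the mean is 100; a sample of 36 has mean 103 and SD 9.
z = z_statistic_mean(103, 100, 9, 36)
print(z)  # 2.0, i.e. the sample mean is 2 standard errors above the null value
```

Note that the null value appears in both functions: the test statistic measures how far the sample is from what the null hypothesis predicts.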

 

P-value     The p-value is the probability of seeing the sample data, or data more extreme, if the null hypothesis is true. The p-value is calculated using the test statistic.
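A sketch of how the p-value could be found from a z test statistic, assuming the normal approximation holds. The function name `p_value` and the labels "greater", "less" and "two-sided" are illustrative choices, not notation from these notes. "More extreme" depends on the direction of the alternative hypothesis.

```python
from statistics import NormalDist

def p_value(z, alternative="two-sided"):
    """Probability of a test statistic at least as extreme as z if the null is true."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":      # Ha uses >
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # Ha uses <
        return std_normal.cdf(z)
    if alternative == "two-sided":    # Ha uses ≠, so both tails count
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

print(round(p_value(2.0, "greater"), 4))    # about 0.0228
print(round(p_value(2.0, "two-sided"), 4))  # about 0.0455
```

For the same test statistic, the two-sided p-value is twice the one-sided p-value because both tails are counted.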


 

Significance level       This is the point at which the researcher thinks the evidence against the null is convincing, i.e. the point at which the p-value is “small”. It is denoted α. In traditional statistics the researcher chooses this level in advance of seeing the data.


Hypothesis Testing Steps


1. State the null and alternative hypotheses.

2. Specify the significance level (this is done by the researcher; in a word problem it will be given).


*** Gather the data ***


3. Calculate the test statistic.

4. Find the P-value.

5. Make a decision about the null hypothesis and a conclusion about the alternative hypothesis.

 

If the p-value is small, i.e. less than (or equal to) α, then reject the null and conclude there is evidence in favor of the alternative.

 

If the p-value is large, i.e. greater than α, then do not reject the null and conclude there is not evidence in favor of the alternative.


Note: if the probability of seeing the data if the null hypothesis is true is small, this makes us believe that the alternative is more likely. However, we cannot measure this likelihood because the alternative covers a range of values. The likelihood changes at each one of these values.




P-value (i.e. probability of seeing the data if the null is true)    Decision                  Conclusion
Smaller than significance level                                      Reject the null           We have evidence in favor of the alternative.
Larger than significance level                                       Do not reject the null    We do not have evidence in favor of the alternative.
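The steps and decision rule above can be sketched end to end for one common case. This assumes a z test for a single mean; the function name `one_sample_z_test` and the data are hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, null_mean, sd, n, alpha=0.05,
                      alternative="two-sided"):
    """Return (test statistic, p-value, reject?) for a z test of one mean."""
    # Step 3: test statistic, computed assuming the null value is true.
    z = (sample_mean - null_mean) / (sd / math.sqrt(n))
    # Step 4: p-value, with "more extreme" set by the alternative.
    cdf = NormalDist().cdf
    if alternative == "greater":
        p = 1 - cdf(z)
    elif alternative == "less":
        p = cdf(z)
    else:
        p = 2 * (1 - cdf(abs(z)))
    # Step 5: reject the null when the p-value is less than or equal to alpha.
    return z, p, p <= alpha

# Steps 1 and 2 (hypothetical): H0: mean = 100, Ha: mean > 100, alpha = 0.05.
z, p, reject = one_sample_z_test(103, 100, 9, 36, alpha=0.05,
                                 alternative="greater")
print(z, round(p, 4), reject)  # 2.0 0.0228 True
```

With the same data and alpha = 0.01 the function returns False for the decision, so the null would not be rejected. This is why the significance level has to be fixed before looking at the data.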




More on choosing a significance level


Two types of error:


 

                    H0 is true          H0 is false
Reject H0           Type I error        Correct decision
Do not reject H0    Correct decision    Type II error

 

So,

                  Type I error is to reject H0 when it is true.

                  Type II error is to not reject H0 when it is false.


The probability of a Type I error is the significance level α; the researcher chooses this in advance.

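The probability of a Type II error depends on the probability of a Type I error. For a fixed sample size, as the probability of a Type I error decreases, the probability of a Type II error increases, and vice versa.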
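The link between the significance level and Type I error can be checked by simulation. The sketch below is illustrative only: it assumes normally distributed data with a known SD and a two-sided z test, and draws many samples from a population where the null hypothesis really is true. The function name and default settings are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

def simulated_type_i_rate(alpha=0.05, n=25, trials=2000, seed=0):
    """Fraction of samples that reject H0 even though H0 is true."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    null_mean, sd = 0.0, 1.0       # the population matches the null exactly
    cdf = NormalDist().cdf
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(null_mean, sd) for _ in range(n)]
        z = (fmean(sample) - null_mean) / (sd / math.sqrt(n))
        p = 2 * (1 - cdf(abs(z)))  # two-sided p-value
        if p <= alpha:
            rejections += 1        # a Type I error: rejecting a true null
    return rejections / trials

print(simulated_type_i_rate())  # should be close to alpha = 0.05
```

Every rejection counted here is a Type I error, so the fraction returned should be near the chosen alpha, up to simulation noise.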


Two ways to reduce Type II error:


1. Increase the significance level, which is the probability of a Type I error.

2. Increase the sample size. As the sample size increases, the estimates become more precise and the probability of error decreases. Assuming the probability of a Type I error stays the same, this increase in precision results in a decrease in the probability of a Type II error.


To calculate the probability of a Type II error, you need a specific alternative in mind.

            First, find the data value that corresponds to the significance level cutoff under the null hypothesis. Then find the probability of not rejecting the null, i.e. of seeing a sample statistic on the non-rejection side of that cutoff, if the specific alternative you have in mind is true.
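The two-step calculation can be sketched as follows. This assumes a one-sided test with Ha using >, a known SD, and the normal approximation; the function name `type_ii_error` and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def type_ii_error(null_mean, alt_mean, sd, n, alpha=0.05):
    """Probability of not rejecting H0 when the true mean is alt_mean (Ha: >)."""
    se = sd / math.sqrt(n)
    # Step 1: the sample mean at the significance level cutoff, assuming H0 is true.
    cutoff = NormalDist(null_mean, se).inv_cdf(1 - alpha)
    # Step 2: the chance the sample mean falls below that cutoff
    # (so H0 is not rejected) if the specific alternative is true.
    return NormalDist(alt_mean, se).cdf(cutoff)

# Hypothetical setting: H0: mean = 100, specific alternative mean = 103, SD 9.
print(round(type_ii_error(100, 103, 9, 36, alpha=0.05), 2))  # about 0.36

# The two ways to reduce Type II error: a larger alpha or a larger sample.
for alpha, n in [(0.05, 36), (0.10, 36), (0.05, 100)]:
    print(alpha, n, round(type_ii_error(100, 103, 9, n, alpha), 3))
```

In this sketch, raising alpha moves the cutoff closer to the null value, and raising n shrinks the standard error. Both make it less likely that a sample from the alternative lands on the non-rejection side.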