Introduction to hypothesis testing

With the confidence interval, we are able, for the first time, to say something about the population based on information in our sample. We use the confidence interval to estimate the population mean.

Often, we have ideas about the population mean before we collect data on our sample. We may have read other studies on the same topic, and they suggest that the population mean will be a particular value. Or, we may have an idea about the population mean based on the internal logic of the measurements we are using. Sometimes, we base our expectations on our experience of the particular social phenomena we are studying.

We need to formalize this testing of our expectations against our sample data. This process is called hypothesis testing. Each new statistic that we learn will be a tool for testing a different kind of hypothesis.

The Logic of Hypothesis Testing

At first, our approach to hypothesis testing will seem backwards. We will set up a certain hypothesis — the null hypothesis — and try to demonstrate that it is probably wrong, based on our sample data.

Why not just try to prove the hypothesis based on our expectations?

As it turns out, this is a difficult task. It is easier to use probability to show that the null hypothesis is probably wrong.

The null hypothesis always states that there is no effect. In contrast, the research hypothesis states that there is an effect; it expresses, for example, our expectation about the population mean.

The null and research hypotheses are always defined as logical opposites. They are mutually exclusive of one another — only one or the other can be true, not both.

Let’s consider a specific problem.

Say that we are interested in global social indicators and we compute the mean labor force participation for men. Knowing that inequality is gendered, we expect that labor force participation for women will be lower. (In explaining the hypothesis, we could cite literature that explains gender socialization regarding public life and household labor; men are more likely to work outside the home for wages, but women do much more domestic labor, which is usually uncompensated.)

In the language of hypothesis testing, we start with the null hypothesis (H0), which always states no effect: in this case, that the mean labor force participation for women is not different from the computed value for men. The research hypothesis (H1) is that there is a difference; that is, the mean for women is not equal to the mean for men.

Statement of null and research hypotheses. H0: mean (women) = mean (men). H1: mean (women) ≠ mean (men).

We can calculate a t-score and test these hypotheses. The data from our sample will lend support to one or the other. By convention, we state our decision from the point of view of the null hypothesis: we either reject the null hypothesis or fail to reject it.
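As a rough illustration of the t-score calculation, here is a minimal sketch in Python. The data values, the men's mean of 74.0, and the function name one_sample_t are all hypothetical, invented for this example; the sketch assumes a one-sample test in which the men's mean is treated as a fixed comparison value.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    se = sd / math.sqrt(n)          # estimated standard error of the mean
    return (mean - mu0) / se, n - 1

# Hypothetical data: labor force participation (%) for women in 10 countries.
women_lfp = [52.0, 48.5, 61.2, 45.0, 57.3, 50.1, 39.8, 63.4, 55.0, 47.7]
men_mean = 74.0   # hypothetical computed mean for men, treated as a fixed value

t_score, df = one_sample_t(women_lfp, men_mean)
print(f"t = {t_score:.2f} with {df} degrees of freedom")
```

The further the t-score is from zero, the less plausible it is that the sample came from a population whose mean equals the comparison value.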

Let’s think, for a moment, about the logic of hypothesis testing. When we make a decision about the null hypothesis, our decision is either correct or incorrect. If we were able to know reality directly, we could determine if the hypothesis is actually true or false.

Figure 1. The Logic of Inference
Table showing the kinds of errors we can make in hypothesis testing.
From Levin and Fox, Elementary Statistics in Social Research, 7th edition, 1997.

When we do hypothesis testing, we try to balance the risk of a type I error (rejecting a null hypothesis that is actually true) with the desire to correctly discover a real effect. Science, as a social practice, is conservative in this regard. We tend to favor a rather strict criterion, typically a 5% risk of a type I error (often described as 95% confidence). Thus, we are relatively more likely to miss real effects (a type II error) than to mistakenly claim that there is an effect when, in fact, there isn’t.
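To make the 5% criterion concrete, the following sketch simulates many samples from a population in which the null hypothesis is true and counts how often a t-test would wrongly reject it. Everything here is an assumption made for illustration: the population (normal, mean 50, standard deviation 10), the sample size of 10, the number of trials, and the helper name rejects_null. The critical value 2.262 is the standard two-tailed table value for 9 degrees of freedom at the 0.05 level.

```python
import math
import random
import statistics

T_CRITICAL = 2.262   # two-tailed critical t for alpha = 0.05, df = 9

def rejects_null(sample, mu0, t_critical):
    """True if a one-sample t-test would reject H0: population mean == mu0."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    t = (statistics.mean(sample) - mu0) / se
    return abs(t) > t_critical

rng = random.Random(42)   # fixed seed so the run is repeatable
trials = 4000
false_alarms = 0
for _ in range(trials):
    # The null hypothesis is true here: the population mean really is 50.
    sample = [rng.gauss(50.0, 10.0) for _ in range(10)]
    if rejects_null(sample, 50.0, T_CRITICAL):
        false_alarms += 1

type_i_rate = false_alarms / trials
print(f"Share of samples that wrongly rejected a true null: {type_i_rate:.3f}")
```

Under these assumptions the share of false rejections should come out close to 0.05, which is what setting the criterion at 5% means in practice.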

The steps of hypothesis testing can be summarized:
A) State the research and null hypotheses;
B) set an alpha level (i.e., the risk of a type I error that we are willing to accept; this is almost always 0.05);
C) compute the appropriate significance test; and,
D) interpret the results.
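The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data, the two comparison values, and the function name run_t_test are hypothetical, and the critical value 2.262 is the two-tailed table value for alpha = 0.05 with 9 degrees of freedom.

```python
import math
import statistics

def run_t_test(sample, mu0, t_critical):
    """Steps C and D: compute a one-sample t-score and state the decision."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    t = (statistics.mean(sample) - mu0) / se
    decision = "reject H0" if abs(t) > t_critical else "fail to reject H0"
    return t, decision

# Step A: H0 says the population mean equals mu0; H1 says it does not.
# Step B: alpha = 0.05, so the two-tailed critical t for df = 9 is 2.262.
T_CRITICAL = 2.262

# Hypothetical sample of 10 observations (sample mean is 52.0).
sample = [52.0, 48.5, 61.2, 45.0, 57.3, 50.1, 39.8, 63.4, 55.0, 47.7]

# Steps C and D, for two hypothetical comparison values.
t_far, decision_far = run_t_test(sample, 74.0, T_CRITICAL)
t_near, decision_near = run_t_test(sample, 53.0, T_CRITICAL)

print(f"mu0 = 74.0: t = {t_far:.2f}, {decision_far}")
print(f"mu0 = 53.0: t = {t_near:.2f}, {decision_near}")
```

With a comparison value far from the sample mean the decision is to reject the null hypothesis; with a comparison value close to the sample mean the data give no grounds to reject it.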

Let’s see what this looks like in a Python notebook: hypothesis testing.

Author: Timothy Shortell, Ph.D., Professor & Chair, Department of Sociology, Brooklyn College CUNY