The Normal Curve – SOCY 2112. Research Methods II

The normal curve is a tool for understanding how probability helps us make inferences from sample statistics to population parameters. We could build a line graph to reflect the relative frequencies of some set of outcomes. This would show us the empirical probabilities in the distribution. We can extend this notion by constructing a frequency distribution based on theoretical probabilities instead of empirical results.

For example, suppose we have ten marbles: six are black, three are red, and one is green. We can build a corresponding probability distribution. There is a 60% chance that a marble selected at random would be black, a 30% chance that it would be red, and a 10% chance that it would be green.
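As a minimal sketch, the marble distribution can be computed by dividing each count by the total. The variable names here are illustrative only; the counts are the ones given in the example.

```python
from fractions import Fraction

# Counts from the marble example in the text.
marbles = {"black": 6, "red": 3, "green": 1}
total = sum(marbles.values())

# Theoretical probability of each colour = count / total.
distribution = {colour: Fraction(count, total) for colour, count in marbles.items()}

for colour, p in distribution.items():
    print(f"{colour}: {float(p):.0%}")
# black: 60%
# red: 30%
# green: 10%
```

The probabilities sum to 1, which is what makes this a probability distribution rather than just a list of percentages.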

Now, imagine that we toss four coins. We can build a probability distribution that reflects the likelihood of each possible number of tails, from zero to four.

Probability distribution
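The four-coin distribution can be sketched with the binomial formula. This assumes each coin is fair and the tosses are independent, which the text implies but does not state; the names `n` and `p_tail` are chosen for illustration.

```python
from math import comb

n = 4          # number of coins
p_tail = 0.5   # assumption: each coin is fair

# P(k tails) = C(n, k) * p^k * (1 - p)^(n - k)
tails_distribution = {
    k: comb(n, k) * p_tail**k * (1 - p_tail) ** (n - k)
    for k in range(n + 1)
}

for k, p in tails_distribution.items():
    print(f"{k} tails: {p:.4f}")
# 0 tails: 0.0625
# 1 tails: 0.2500
# 2 tails: 0.3750
# 3 tails: 0.2500
# 4 tails: 0.0625
```

Two tails is the most likely outcome, and the distribution is symmetrical around it.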

Probability distributions can be viewed as empirical frequency distributions for an infinite number of cases. Practically speaking, this means that empirical frequency distributions for very large data files will tend to approximate the theoretical distribution more closely than frequency distributions for small data files do.
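One way to see this is to simulate tossing four fair coins many times and compare the observed relative frequencies with the theoretical probabilities. The sketch below is an illustration, not part of the original notes: the function name `empirical_tails`, the seed, and the trial counts are arbitrary choices.

```python
import random
from collections import Counter

def empirical_tails(num_trials, rng):
    """Relative frequency of 0..4 tails over num_trials tosses of four fair coins."""
    counts = Counter(
        sum(rng.random() < 0.5 for _ in range(4)) for _ in range(num_trials)
    )
    return {k: counts[k] / num_trials for k in range(5)}

# Theoretical probabilities for the number of tails in four fair coin tosses.
theoretical = {0: 0.0625, 1: 0.25, 2: 0.375, 3: 0.25, 4: 0.0625}

rng = random.Random(2112)  # fixed seed so the run is repeatable

for num_trials in (20, 2000, 200000):
    observed = empirical_tails(num_trials, rng)
    largest_gap = max(abs(observed[k] - theoretical[k]) for k in range(5))
    print(f"{num_trials:>7} trials: largest gap from theory = {largest_gap:.4f}")
```

The gap will typically shrink as the number of trials grows, though any single small run can land close to the theoretical values by chance.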

The Normal Curve
One very important probability distribution is the normal curve, sometimes called the bell-shaped curve. It plays a central role in the statistical decision-making process.

The normal curve with the population mean (mu) and population standard deviation (sigma) illustrated.

The normal curve has a number of important properties.
A. It is symmetrical.
B. It is unimodal.
C. The area under the curve represents proportion, or probability.

Since the area under the curve represents proportion, we can calculate the percent of cases to be expected between some given point and the mean, for example.

We can, in effect, mark off the proportions on a scale of standard deviations. About 34% of cases fall between the mean and one standard deviation above the mean.

The normal curve with one standard deviation above the mean illustrated.

Likewise, about 48% of cases fall between the mean and two standard deviations above the mean.

The normal curve with two standard deviations above the mean illustrated.

The area between the mean and three standard deviations above the mean is just about 50% of cases (49.87%).

The normal curve with three standard deviations above the mean illustrated.
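These areas can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` for the standard normal curve (mean 0, standard deviation 1); the helper name `area_mean_to` is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_mean_to(z):
    """Proportion of cases between the mean and z standard deviations above it."""
    # cdf(z) is the area at or below z; half of the area lies below the mean.
    return standard_normal.cdf(z) - 0.5

for z in (1, 2, 3):
    print(f"mean to +{z} SD: {area_mean_to(z):.2%}")
# mean to +1 SD: 34.13%
# mean to +2 SD: 47.72%
# mean to +3 SD: 49.87%
```

Because the curve is symmetrical, the same proportions apply to the area between the mean and one, two, or three standard deviations below it.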

Since the curve is symmetrical, nearly all of the area under the curve lies within about 6 standard deviations: three above the mean and three below. (Remember our use of R/6, the range divided by six, to judge the relative size of the standard deviation? This is the explanation for the denominator.)
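A quick numerical check of why six is a reasonable denominator, again using `statistics.NormalDist`. The lowest and highest scores below are hypothetical values chosen only to illustrate the R/6 rule of thumb.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Proportion of cases within three standard deviations of the mean, either side.
within_three_sd = z.cdf(3) - z.cdf(-3)
print(f"within +/-3 SD: {within_three_sd:.2%}")  # within +/-3 SD: 99.73%

# Hypothetical lowest and highest scores, for illustration only.
lowest, highest = 40, 160
rough_sd = (highest - lowest) / 6  # the R/6 rule of thumb
print(f"R/6 estimate of the standard deviation: {rough_sd}")  # 20.0
```

Since almost every case falls inside that six-standard-deviation span, the range of a roughly normal distribution is about six standard deviations wide. The estimate is only a rough guide.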

The proportions can be expressed as probabilities. For example, the area under the curve to the left of some point represents the probability of drawing a score at random from the distribution that is at or below that point.

The normal curve illustrating the probability of selecting a case one standard deviation above the mean or less.
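As a sketch, the probability of drawing a score at or below one standard deviation above the mean is the cumulative area up to that point. The population mean and standard deviation below are hypothetical; the resulting probability is the same for any normal distribution.

```python
from statistics import NormalDist

# Hypothetical population values, chosen only for illustration.
mu, sigma = 100, 15
scores = NormalDist(mu, sigma)

# Area under the curve at or below one standard deviation above the mean.
p_at_or_below = scores.cdf(mu + sigma)
print(f"P(score <= {mu + sigma}) = {p_at_or_below:.4f}")
# P(score <= 115) = 0.8413
```

This is the 50% of cases below the mean plus the roughly 34% between the mean and one standard deviation above it.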

Author: Timothy Shortell, Ph.D.

Timothy Shortell, Ph.D., Professor & Chair, Department of Sociology, Brooklyn College CUNY