Teach Time Encyclopedia - Learn About Our World

Friday, July 25, 2008

Regression toward the mean

In statistics, regression toward the mean is the principle that, when two related measurements are imperfectly correlated and the first is far from the mean, the second is expected to be closer to the mean than the first.

Table of contents
1 Example
2 History
3 Ubiquity
4 Mathematical derivation
5 Regression fallacies

Example

Consider students who take a midterm and a final exam. Students who got an extremely high score on the midterm will probably score well on the final too, but we expect their final score to be closer to the average (i.e., fewer standard deviations above the average) than their midterm score was. The reason is that some luck was probably involved in getting the exceptional midterm score, and this luck cannot be counted on for the final. The converse also holds: among those who get exceptionally high final exam scores, the average midterm score will be fewer standard deviations above average than the final exam score, since some of them scored high on the final thanks to luck that they did not have on the midterm. Similarly, unusually low scores regress toward the mean: students with exceptionally low scores on one exam tend to score closer to the average on the other.
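The exam example can be checked with a small simulation. The following is a minimal sketch, assuming that each score is the sum of a fixed "skill" and an independent dose of "luck", both normally distributed; the model, the function names and the 5% cutoff are illustrative assumptions, not part of the original example.

```python
import random
import statistics

def simulate_exams(n_students=20000, seed=0):
    """Return (midterm, final) pairs; each score is shared skill plus independent luck."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_students):
        skill = rng.gauss(0.0, 1.0)            # the same for both exams
        midterm = skill + rng.gauss(0.0, 1.0)  # luck on the midterm
        final = skill + rng.gauss(0.0, 1.0)    # independent luck on the final
        pairs.append((midterm, final))
    return pairs

def standardize(values):
    """Convert raw scores to standard deviations above or below the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

pairs = simulate_exams()
midterm_z = standardize([m for m, _ in pairs])
final_z = standardize([f for _, f in pairs])

# Select the students in the top 5% on the midterm.
cutoff = sorted(midterm_z)[int(0.95 * len(midterm_z))]
top = [(m, f) for m, f in zip(midterm_z, final_z) if m >= cutoff]

top_midterm_mean = statistics.fmean([m for m, _ in top])
top_final_mean = statistics.fmean([f for _, f in top])
print(f"top group, midterm: {top_midterm_mean:.2f} SD above average")
print(f"top group, final:   {top_final_mean:.2f} SD above average")
```

Under these assumptions the selected group should still be above average on the final, but by fewer standard deviations than on the midterm.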

History

The principle was described and explained by Francis Galton in the 1870s and 1880s. Initially, he investigated geniuses in various fields and noted that their children, while typically gifted, were almost invariably closer to the average than their exceptional parents. He later described the same effect more quantitatively by comparing fathers' heights to their sons' heights. Again, the sons of unusually tall or unusually short fathers were typically closer to the mean height than their fathers.

Ubiquity

It is important to realize that regression toward the mean is a ubiquitous statistical phenomenon and has nothing to do with biological inheritance. It is also unrelated to the progression of time: the fathers of exceptionally tall people also tend to be closer to the mean than their sons. The overall variability of height among fathers and sons is the same, so regression toward the mean does not imply that heights become less variable from one generation to the next.
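Both claims, that the effect runs in either direction and that the overall variability does not shrink, can be illustrated numerically. This is a rough sketch, assuming father and son heights follow a bivariate normal distribution with the same mean and standard deviation; the values 175 cm, 7 cm and a correlation of 0.5 are made-up illustrative figures, not measured data.

```python
import math
import random
import statistics

MEAN_CM, SD_CM, R = 175.0, 7.0, 0.5  # illustrative values, not measured data

def simulate_heights(n_pairs=50000, seed=1):
    """Return two lists of heights (fathers, sons) with correlation R."""
    rng = random.Random(seed)
    fathers, sons = [], []
    for _ in range(n_pairs):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        fathers.append(MEAN_CM + SD_CM * z1)
        # Same mean and SD as the fathers, correlation R with the father.
        sons.append(MEAN_CM + SD_CM * (R * z1 + math.sqrt(1 - R * R) * z2))
    return fathers, sons

fathers, sons = simulate_heights()
tall = MEAN_CM + 2 * SD_CM  # "exceptionally tall": two SDs above the mean

tall_fathers = [f for f in fathers if f >= tall]
sons_of_tall_fathers = [s for f, s in zip(fathers, sons) if f >= tall]
tall_sons = [s for s in sons if s >= tall]
fathers_of_tall_sons = [f for f, s in zip(fathers, sons) if s >= tall]

print(f"tall fathers: {statistics.fmean(tall_fathers):.1f} cm, "
      f"their sons: {statistics.fmean(sons_of_tall_fathers):.1f} cm")
print(f"tall sons: {statistics.fmean(tall_sons):.1f} cm, "
      f"their fathers: {statistics.fmean(fathers_of_tall_sons):.1f} cm")
print(f"SD of fathers: {statistics.pstdev(fathers):.2f} cm, "
      f"SD of sons: {statistics.pstdev(sons):.2f} cm")
```

In this sketch the sons of tall fathers and the fathers of tall sons are both closer to the mean than the group they were selected from, while the two standard deviations stay essentially equal.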

Mathematical derivation

If one assumes that the two variables X and Y follow a bivariate normal distribution with mean 0, common variance 1 and correlation coefficient r with |r| < 1, then the expected value of Y, given that X was measured to be x, is rx. For x ≠ 0 this is closer to the mean 0 than x is, since |r| < 1. If the variances of the two variables X and Y differ, the principle of regression toward the mean still holds, provided each variable is measured in "normalized units" of its own standard deviation.
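For reference, the general case can be written out using the standard formula for the conditional mean of a bivariate normal distribution. The symbols \mu_X, \mu_Y (means) and \sigma_X, \sigma_Y (standard deviations) are notation introduced here for illustration; only r appears in the text above.

```latex
\begin{align*}
\mathrm{E}[Y \mid X = x] &= \mu_Y + r\,\frac{\sigma_Y}{\sigma_X}\,(x - \mu_X), \\
\frac{\mathrm{E}[Y \mid X = x] - \mu_Y}{\sigma_Y} &= r\,\frac{x - \mu_X}{\sigma_X}.
\end{align*}
```

The second line is the first rearranged into standard-deviation units: the expected deviation of Y is r times the observed deviation of X. Setting both means to 0 and both standard deviations to 1 recovers the value rx given above.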

This derivation illustrates a general fact: regression toward the mean is more pronounced the weaker the correlation between the two variables, i.e. the smaller |r| is.
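The dependence on r can be seen in a quick numerical check. The sketch below, a Monte Carlo approximation under the standardized bivariate normal assumptions above, averages Y over samples whose X falls in a narrow window around 2 and compares the result with r times the average X in that window. The function name, the window width and the sample size are arbitrary choices made for illustration.

```python
import math
import random
import statistics

def conditional_means(r, x_target=2.0, half_width=0.2, n=100000, seed=2):
    """Return (mean of X, mean of Y) over samples with X near x_target."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Y has mean 0, variance 1 and correlation r with X.
        y = r * x + math.sqrt(1 - r * r) * rng.gauss(0.0, 1.0)
        if abs(x - x_target) <= half_width:
            xs.append(x)
            ys.append(y)
    return statistics.fmean(xs), statistics.fmean(ys)

results = {r: conditional_means(r) for r in (0.9, 0.5, 0.1)}
for r, (mean_x, mean_y) in results.items():
    print(f"r = {r}: mean X = {mean_x:.2f}, mean Y = {mean_y:.2f}, "
          f"r * mean X = {r * mean_x:.2f}")
```

The smaller r is, the closer the average Y in the selected group should lie to 0, even though the average X is about 2 in every case.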

Regression fallacies

Misunderstandings of the principle (known as "regression fallacies") have repeatedly led to mistaken claims in the scientific literature.

An extreme example is Horace Secrist's 1933 book The Triumph of Mediocrity in Business, in which the statistics professor collected mountains of data to prove that the profit rates of competitive businesses tend towards the average over time. In fact, there is no such effect; the variability of profit rates is almost constant over time. Secrist had only described the common regression toward the mean. One exasperated reviewer likened the book to "proving the multiplication table by arranging elephants in rows and columns, and then doing the same for numerous other kinds of animals".

A different regression fallacy occurs in the following example. We want to test whether a certain stress-reducing drug improves the reading skills of poor readers. Pupils are given a reading test. The lowest-scoring 10% are then given the drug and tested again, with a different test that also measures reading skill. We find that the average reading score of this group has improved significantly. This, however, shows nothing about the effectiveness of the drug: even without the drug, the principle of regression toward the mean would have predicted an improvement in the group's average score.
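The fallacy is easy to reproduce in a simulation in which the drug does nothing at all. The following is a minimal sketch, assuming each test score is a pupil's stable reading ability plus independent test-day noise; the score scale (mean 100, standard deviations of 10) and all names are hypothetical.

```python
import random
import statistics

def simulate_reading_tests(n_pupils=10000, seed=3):
    """Return (first_score, second_score) pairs with no treatment effect."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_pupils):
        ability = rng.gauss(100.0, 10.0)
        first = ability + rng.gauss(0.0, 10.0)   # noise on the first test
        second = ability + rng.gauss(0.0, 10.0)  # new noise, no drug effect added
        results.append((first, second))
    return results

results = simulate_reading_tests()

# Pick the lowest-scoring 10% on the first test, as in the flawed study design.
results.sort(key=lambda pair: pair[0])
lowest = results[: len(results) // 10]

first_mean = statistics.fmean([first for first, _ in lowest])
second_mean = statistics.fmean([second for _, second in lowest])
print(f"lowest 10%, first test:  {first_mean:.1f}")
print(f"lowest 10%, second test: {second_mean:.1f}")
print(f"apparent improvement:    {second_mean - first_mean:.1f}")
```

The selected group improves on the second test although nothing was done to it, which is why a study of this kind needs a control group drawn from the same low scorers and not given the drug.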

External Links

  • How Regression Got Its Name, chapter from the online book "Statistics and Data Analysis" by A. Abebe, J. Daniels, J. W. McKean and J. A. Kapenga

References

  • S. M. Stigler: Statistics on the Table, Harvard University Press 1999, chapter 9.



Brought to you by NoChildLeftBehind.com and the Beaches and Towns Network, LLC.