Teach Time Encyclopedia - Learn About Our World

Wednesday, December 03, 2008

Testing hypotheses suggested by the data

In statistics, hypotheses suggested by the data must be tested differently from hypotheses formed independently of the data.

How to do it wrong

For example, suppose 50 different researchers, unaware of each other's work, run clinical trials to test whether Vitamin X is efficacious in preventing cancer. Forty-nine of them find no significant difference between measurements on patients who took Vitamin X and those who took a placebo. The 50th study finds a difference so extreme that, if Vitamin X had no effect, a difference at least that extreme would be expected in only one study out of 50. When all 50 studies are pooled, one would conclude that no effect of Vitamin X was found. Until they learn of the other 49 studies, however, the investigators running the 50th study could reasonably consider it likely that they have found an effect.

Now suppose that the 50th study was conducted in Denmark. The data suggest the hypothesis that Vitamin X is more efficacious in Denmark than elsewhere. But Denmark merely happened to be the one study in 50 in which an extreme value of the test statistic occurred, and one expects about one such extreme case in every 50 studies even if no effect is present. It would therefore be fallacious to cite these same data as evidence in favor of the hypothesis they suggested.
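The arithmetic behind the Vitamin X example can be checked with a short Python sketch. It assumes that the 50 studies are independent and that, when there is no effect, each study's p-value is uniformly distributed, so each study has a 1-in-50 chance of looking "extreme". The function names and the number of simulated repetitions are illustrative choices, not part of the original example.

```python
import random


def prob_at_least_one_extreme(n_studies: int, alpha: float) -> float:
    """Chance that at least one of n independent null studies has p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_studies


def simulate_extreme_study(n_studies: int = 50, alpha: float = 1 / 50,
                           n_worlds: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability (assumes uniform null p-values)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_worlds):
        # One "world" = 50 independent studies of a vitamin with no effect.
        if any(rng.random() < alpha for _ in range(n_studies)):
            hits += 1
    return hits / n_worlds


if __name__ == "__main__":
    print("exact:    ", round(prob_at_least_one_extreme(50, 1 / 50), 3))
    print("simulated:", round(simulate_extreme_study(), 3))
```

Under these assumptions the exact probability is about 0.64, so a lone "one in 50" result somewhere among 50 null studies is more likely than not.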

The general problem

Testing a hypothesis on the same data that suggested it greatly inflates the probability of a type I error, because all but the data most favourable to the hypothesis are discarded. This is a risk not only in hypothesis testing but in all statistical inference, because it is often difficult to describe accurately the process that was followed in searching through and discarding data. It is a particular problem in statistical modelling, where many different models are rejected by trial and error before a result is published (see also overfitting). Likelihood and Bayesian approaches are no less at risk, owing to the difficulty of specifying the likelihood function without an exact description of the search-and-discard process.
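The inflation of the type I error rate can be illustrated with a small simulation. The sketch below assumes 10 candidate hypotheses (for example, subgroups or models), none of which has any real effect, and assumes each yields an independent standard normal test statistic. It compares testing one hypothesis fixed in advance with testing whichever hypothesis the data make look best. The setup, names and parameter values are illustrative assumptions.

```python
import random


def false_positive_rates(n_candidates: int = 10, n_trials: int = 20000,
                         z_crit: float = 1.96, seed: int = 1):
    """Return (pre-specified rate, data-suggested rate) of false positives."""
    rng = random.Random(seed)
    prespecified = 0
    data_suggested = 0
    for _ in range(n_trials):
        # Null z-statistics: no candidate has a real effect.
        z = [rng.gauss(0.0, 1.0) for _ in range(n_candidates)]
        # Hypothesis chosen before looking at the data: always candidate 0.
        if abs(z[0]) > z_crit:
            prespecified += 1
        # Hypothesis suggested by the data: the most extreme candidate.
        if max(abs(v) for v in z) > z_crit:
            data_suggested += 1
    return prespecified / n_trials, data_suggested / n_trials


if __name__ == "__main__":
    fixed, picked = false_positive_rates()
    print("pre-specified hypothesis: ", round(fixed, 3))
    print("data-suggested hypothesis:", round(picked, 3))
```

With these assumptions the pre-specified test is wrong about 5% of the time, while the data-suggested test is wrong roughly 40% of the time, even though the same nominal threshold is used in both cases.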

The error is particularly prevalent in data mining and machine learning. It also commonly occurs in academic publishing, where reports of positive results tend to be accepted and reports of negative results do not, producing the effect known as publication bias.
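A rough sketch of how publication bias distorts the record: the Python code below simulates many studies of an effect that is truly zero and "publishes" only those whose test statistic passes a significance threshold. The publication rule, the standard normal model for the test statistic and all names are simplifying assumptions made for illustration.

```python
import random


def publication_filter(n_studies: int = 20000, z_crit: float = 1.96,
                       seed: int = 2) -> dict:
    """Compare all null studies with the subset that passes a significance filter."""
    rng = random.Random(seed)
    # Every study estimates a true effect of zero; z is its test statistic.
    all_z = [rng.gauss(0.0, 1.0) for _ in range(n_studies)]
    # Assumed publication rule: only "significant" results are accepted.
    published = [z for z in all_z if abs(z) > z_crit]

    def mean_abs(values):
        return sum(abs(v) for v in values) / len(values)

    return {
        "share_published": len(published) / n_studies,
        "mean_abs_z_all": mean_abs(all_z),
        "mean_abs_z_published": mean_abs(published),
    }


if __name__ == "__main__":
    for name, value in publication_filter().items():
        print(name, round(value, 3))
```

A reader who sees only the published subset sees nothing but apparently strong effects, although every simulated study was drawn from the no-effect case.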

How to do it right

Strategies to avoid the problem include:

Henry Scheffé's simultaneous test of all contrasts in multiple comparisons problems is the best-known remedy in the case of analysis of variance. It is designed for testing hypotheses suggested by the data while avoiding the fallacy described above. See his "A Method for Judging All Contrasts in the Analysis of Variance", Biometrika, 40 (1953), pages 87-104.
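For orientation, the usual textbook form of Scheffé's criterion for a one-way analysis of variance is sketched below. The notation is an assumption introduced here, not quoted from the paper: there are k groups with sizes n_i and N observations in total, \bar{y}_i is the sample mean of group i, MSE is the within-group mean square, and F_{\alpha; k-1, N-k} is the upper \alpha quantile of the F distribution with k-1 and N-k degrees of freedom.

```latex
% A contrast is a linear combination of group means whose coefficients sum to zero.
\[
  \hat{\psi} = \sum_{i=1}^{k} c_i \bar{y}_i ,
  \qquad \sum_{i=1}^{k} c_i = 0 .
\]
% Scheffe's criterion: declare the contrast nonzero only if it exceeds
% a critical value inflated by the factor (k - 1).
\[
  |\hat{\psi}| \;>\;
  \sqrt{(k-1)\, F_{\alpha;\, k-1,\, N-k}}
  \;\sqrt{\mathrm{MSE} \sum_{i=1}^{k} \frac{c_i^{2}}{n_i}} .
\]
```

Because the critical value covers every possible contrast at once, the overall type I error rate stays at most \alpha even when the contrast is chosen after looking at the data. The price is a wider margin than a test of a single contrast fixed in advance would need.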



Brought to you by NoChildLeftBehind.com and the Beaches and Towns Network, LLC.