Is it time to up the statistical standard for scientific results?

Ars Technica » Scientific Method 2013-11-12

If you believe The Economist, science is in the midst of a crisis with most of its conclusions failing to stand the test of time. Research fraud is rising, but even studies that were performed properly sometimes can't be reproduced or appear to suffer from bias.

A new analysis suggests a very simple explanation for some of the problems: our statistics are weak. A statistician has figured out how to compare Bayesian statistics to those normally used in scientific tests of significance. By comparing the two, he finds that researchers are often accepting numbers that any good Bayesian would consider to be weak evidence.

What's in a p?

To understand the problem, we have to go into how scientists assess significance. Typically, a given experiment has an experimental condition, which produces one number, and a control condition, which produces a second. The two numbers will typically be different, but we need to know if that difference is significant. That's where statistics comes in. The typical test used in science involves determining how likely it is that a difference at least this large would arise by random chance alone. In most fields, if there's less than a five percent chance of that, then you can reject chance, and the results are considered significant. In statistical terms, this is called having a p value of less than 0.05.
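To make the idea of a p value concrete, here is a minimal sketch of one way to estimate it: a permutation test, which repeatedly shuffles the condition labels and counts how often chance alone produces a difference at least as large as the observed one. This is an illustration under stated assumptions, not necessarily the test any particular study uses; the function name, the measurements, and the number of shuffles are all made up for the example.

```python
import random


def permutation_p_value(experimental, control, n_permutations=10_000, seed=0):
    """Estimate a two-sided p value for the difference in group means.

    Hypothetical helper for illustration: shuffles the pooled measurements
    and counts how often a random split differs by at least as much as
    the observed split.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    n_exp = len(experimental)
    n_ctl = len(control)
    observed = abs(sum(experimental) / n_exp - sum(control) / n_ctl)

    pooled = list(experimental) + list(control)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_exp]) / n_exp - sum(pooled[n_exp:]) / n_ctl)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            hits += 1
    return hits / n_permutations


# Made-up measurements, purely for illustration.
experimental = [10.1, 10.4, 9.9, 10.6, 10.2, 10.3]
control = [8.0, 8.3, 7.9, 8.1, 8.2, 7.8]

p = permutation_p_value(experimental, control)
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```

A result below 0.05 here means only that fewer than five percent of the random relabelings produced a gap as large as the one observed, which is the conventional threshold described above.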
