Lecture: The Bootstrap (Advanced Data Analysis from an Elementary Point of View)

Three-Toed Sloth 2013-03-15

Summary:

The sampling distribution is the source of all knowledge regarding statistical uncertainty. Unfortunately, the true sampling distribution is inaccessible, since it is a function of exactly the quantities we are trying to infer. One exit from this vicious circle is the bootstrap principle: approximate the true sampling distribution by simulating from a good model of the process, and treating the simulated data just like the real data. The simplest form of this is parametric bootstrapping, i.e., simulating from the fitted model. Nonparametric bootstrapping means simulating by re-sampling, i.e., by treating the observed sample as a complete population and drawing new samples from it. Bootstrapped standard errors, biases, confidence intervals, p-values. Tricks for making the simulated distribution closer to the true sampling distribution (pivotal intervals, studentized intervals, the double bootstrap). Bootstrapping regression models: by parametric bootstrapping; by resampling residuals; by resampling cases. Many, many examples. When does the bootstrap fail?
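
As a minimal sketch of the resampling idea, the following R snippet (R being the language of the course examples) bootstraps the standard error, bias, and a basic pivotal confidence interval for a sample median. The data vector x, the estimator, and the number of replicates B are illustrative placeholders, not taken from the lecture's own examples.

    # Nonparametric bootstrap: treat the observed sample as the population
    # and draw new samples from it with replacement.
    x <- rnorm(100, mean = 5, sd = 2)     # stand-in for the observed data
    B <- 1000                             # number of bootstrap replicates
    boot.medians <- replicate(B, median(sample(x, length(x), replace = TRUE)))
    se.hat <- sd(boot.medians)                    # bootstrapped standard error
    bias.hat <- mean(boot.medians) - median(x)    # bootstrapped bias estimate
    # Basic (pivotal) 95% confidence interval for the median:
    ci <- 2 * median(x) - quantile(boot.medians, c(0.975, 0.025))

The same recipe applies to any statistic: replace median with the estimator of interest, and use the spread of the re-computed estimates as a stand-in for its true sampling variability.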

Note: Thanks to Prof. Christopher Genovese for delivering this lecture while I was enjoying the hospitality of the fen-folk.

Reading: Notes, chapter 6 (R for figures and examples; pareto.R; wealth.dat); Lecture slides; R for in-class examples; Cox and Donnelly, chapter 8

Advanced Data Analysis from an Elementary Point of View

Link:

http://bactra.org/weblog/1001.html

From feeds:

Statistics and Visualization » Three-Toed Sloth

Tags:

Date tagged:

03/15/2013, 12:54

Date published:

03/15/2013, 12:54