Lecture: Splines (Advanced Data Analysis from an Elementary Point of View)

Three-Toed Sloth 2013-03-15

Summary:

Kernel regression controls the amount of smoothing indirectly, through the bandwidth; why not control the irregularity of the smoothed curve directly? The spline smoothing problem is a penalized least squares problem: minimize the mean squared error, plus a penalty term proportional to the average curvature of the function over space. The solution is always a continuous piecewise-cubic polynomial, with continuous first and second derivatives. Altering the strength of the penalty moves along a bias-variance trade-off, from pure OLS at one extreme to pure interpolation at the other; changing the strength of the penalty is equivalent to minimizing the mean squared error under a constraint on the average curvature. To ensure consistency, the penalty/constraint should weaken as the data grows; the appropriate strength is selected by cross-validation. An example with data, including confidence bands. Writing splines as basis functions, and fitting as least squares on transformations of the data, plus a regularization term. A brief look at splines in multiple dimensions. Splines versus kernel regression.
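As a concrete illustration of the penalized least-squares problem and the cross-validated choice of penalty, here is a minimal sketch in R (the language of the notes and of the Faraway reading); the simulated data, the seed, and all variable names are illustrative assumptions, not taken from the lecture or the notes.

# Minimal sketch, not code from the notes: base R's smooth.spline()
# minimizes  sum_i (y_i - f(x_i))^2 + lambda * integral (f''(t))^2 dt
# and returns the piecewise-cubic solution.

set.seed(1)                                  # illustrative simulated data
x <- sort(runif(200, 0, 3))
y <- sin(2 * x) + x / 2 + rnorm(200, sd = 0.3)

# cv = TRUE picks lambda by leave-one-out cross-validation;
# the default (cv = FALSE) uses generalized cross-validation instead.
fit <- smooth.spline(x, y, cv = TRUE)
fit$lambda   # chosen penalty strength
fit$df       # effective degrees of freedom of the fit

# Evaluate the fitted curve on a grid and plot it over the data.
grid <- seq(0, 3, length.out = 300)
plot(x, y, col = "grey", pch = 16)
lines(predict(fit, x = grid), lwd = 2)

smooth.spline() does not itself produce confidence bands like those in the lecture's example; one common way to get them is to resample (bootstrap) the data and refit, as a sketch of the general idea rather than the procedure used in the notes.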

Reading: Notes, chapter 8

Optional reading: Faraway, section 11.2.

Advanced Data Analysis from an Elementary Point of View

Link:

http://bactra.org/weblog/1005.html

From feeds:

Statistics and Visualization » Three-Toed Sloth

Date tagged:

03/15/2013, 12:54

Date published:

03/15/2013, 12:54