Books to Read While the Algae Grow in Your Fur, December 2018

Three-Toed Sloth 2019-08-20

Summary:

Attention conservation notice: I have no taste. I also have no qualifications to discuss poetry or leftist political theory. I do know something about spatiotemporal data analysis, but you don't care about that.

Gidon Eshel, Spatiotemporal Data Analysis
I assigned this as a textbook in my fall class on data over space and time, because I needed something that covered spatiotemporal data analysis, especially principal components analysis, for students who could be taking linear regression at the same time, and that was cheap. This met all my requirements.
The book is divided into two parts. Part I is a review or crash course in linear algebra, building up to decomposing square matrices in terms of their eigenvalues and eigenvectors, and then the singular value decomposition of arbitrary matrices. (Some prior acquaintance with linear algebra will help, but not very much is needed.) Part II is about data analysis, covering some basic notions of time series and autocorrelation, linear regression models estimated by least squares, and "empirical orthogonal functions", i.e., principal components analysis, i.e., eigendecomposition of covariance or correlation matrices. As for "cheap", while the list price is (currently) an outrageous $105, it's on JSTOR, so The Kids had free access to the PDF through the university library.
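For readers who want Part II's punchline in code: the equivalence between eigendecomposing the sample covariance matrix and taking the SVD of the centered data matrix can be sketched as follows. This is my toy illustration, not an example from the book; all names and numbers are made up.

```python
# Sketch: "empirical orthogonal functions" two ways --- eigendecomposition
# of the sample covariance matrix vs. SVD of the centered data matrix.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))  # 200 obs., 5 variables

Xc = X - X.mean(axis=0)                  # center each column
S = Xc.T @ Xc / (Xc.shape[0] - 1)        # sample covariance matrix

# Route 1: eigendecomposition of the covariance matrix
evals, evecs = np.linalg.eigh(S)         # eigh returns ascending order
evals, evecs = evals[::-1], evecs[:, ::-1]

# Route 2: SVD of the centered data matrix, X_c = U s V^T
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
evals_svd = s**2 / (Xc.shape[0] - 1)     # squared singular values, rescaled

# The two routes agree: the EOFs are the right singular vectors,
# and the eigenvalues are the scaled squared singular values.
assert np.allclose(evals, evals_svd)
```

The SVD route is the numerically preferable one in practice, since it never forms the covariance matrix explicitly.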
In retrospect, there were strengths to the book, and some serious weaknesses --- some absolute, some just for my needs.
The most important strength is that Eshel writes like a human being, and not a bloodless textbook. His authorial persona is not (thankfully) much like mine, but it's a likeable and enthusiastic one. This is related to his trying really, really hard to explain everything as simply as possible, and with multitudes of very detailed worked examples. I will probably be assigning Part I of the book, on linear algebra, as refresher material to my undergrads for years.
He is also very good at constantly returning to physical insight to motivate data-analytic procedures. (The highlight of this, for me, was section 9.7 [pp. 185ff] on when and why an autonomous, linear, discrete-time AR(1) or VAR(1) model will arise from a forced, nonlinear, continuous-time dynamical system.) If this had existed when I was a physics undergrad, or starting grad school, I'd have loved it.
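The flavor of that section 9.7 argument can be seen numerically even in a simpler, purely linear special case (my illustration, not the book's, with made-up parameters): exactly discretizing a continuous-time linear stochastic process with relaxation rate $\lambda$ at time step $\Delta t$ yields an AR(1) whose coefficient is $e^{-\lambda \Delta t}$, which least squares then recovers from the simulated series.

```python
# Sketch: a continuous-time linear relaxation process, observed at
# discrete times, is exactly an AR(1); least squares recovers the
# coefficient phi = exp(-lambda * dt). Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(1)
lam, sigma, dt, n = 0.5, 1.0, 0.1, 200_000

phi = np.exp(-lam * dt)                               # exact AR(1) coefficient
noise_sd = sigma * np.sqrt((1 - phi**2) / (2 * lam))  # exact innovation s.d.

x = np.empty(n)
x[0] = 0.0
eps = rng.standard_normal(n) * noise_sd
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

# Least-squares estimate of the AR(1) coefficient
phi_hat = (x[:-1] @ x[1:]) / (x[:-1] @ x[:-1])
assert abs(phi_hat - phi) < 0.01
```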
Turning to the weaknesses, some of them are, as I said, merely ways in which he didn't write the book to meet my needs. His implied reader is very familiar with physics, and not just the formal, mathematical parts but also the culture (e.g., the delight in complicated compound units of measurement, saying "ensemble" when other disciplines say "distribution" or "population"). In fact, the implied reader is familiar with, or at least learning, climatology. But that reader has basically no experience with statistics, and only a little probability (so that, e.g., they're not familiar with rules for algebra with expectations and covariances*). Since my audience was undergraduate and masters-level statistics students, most of whom had only the haziest memories of high school physics, this was a mis-match.
Other weaknesses are, to my mind, a bit more serious, because they reflect more on the intrinsic content.
  • A trivial but real one: the book is printed in black and white, but many figures are (judging by the text) intended to be in color, and are scarcely comprehensible without it. (The first place this really struck me was p. 141 and Figure 9.4, but there were lots of others.) The electronic version is no better.
  • The climax of the book (chapter 11) is principal components analysis. This is really, truly important, so it deserves a lot of treatment. But it's not a very satisfying stopping place: what do you do with the principal components once you have them? What about the difference between principal components / empirical orthogonal functions and factor models? (In the book's terms, the former does a low-rank approximation to the sample covariance matrix $\mathbf{v} \approx \mathbf{w}^T \mathbf{w}$, while the latter treats it as low-rank-plus-diagonal-noise $\mathbf{v} \approx \mathbf{w}^T\mathbf{w} + \mathbf{d}$, an importantly different thing.) What about nonlinear methods of dimensionality reduction? My issue isn't so much that the book didn't do everything, as that it didn't give readers even hints of where to look.
  • There are places where the book's exposition is not very internally coherent. Chapter 8, on autocorrelation, introduces the topic with an example where $x(t) = s(t) + \epsilon(t)$
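The low-rank versus low-rank-plus-diagonal distinction in the second bullet can be made concrete with a toy computation (mine, not the book's; all numbers are illustrative): build a covariance matrix with exact factor structure $\mathbf{w}\mathbf{w}^T + \mathbf{d}$, and note that the best rank-$k$ PCA approximation to it cannot reproduce the diagonal noise term.

```python
# Sketch: PCA's rank-k approximation vs. a factor model's
# "low rank plus diagonal" structure, on a synthetic covariance matrix.
import numpy as np

rng = np.random.default_rng(2)
p, k = 6, 2
W = rng.standard_normal((p, k))            # factor loadings (illustrative)
D = np.diag(rng.uniform(0.5, 1.5, p))      # idiosyncratic noise variances

V = W @ W.T + D                            # low-rank-plus-diagonal covariance

# PCA / EOF route: best rank-k approximation to V itself
evals, evecs = np.linalg.eigh(V)
top = evecs[:, -k:] * np.sqrt(evals[-k:])  # scale top-k eigenvectors
V_pca = top @ top.T

# PCA's rank-k fit cannot absorb the diagonal noise, so it leaves a
# residual; the factor model reproduces V exactly by construction.
err_pca = np.linalg.norm(V - V_pca)
err_factor = np.linalg.norm(V - (W @ W.T + D))
assert err_factor < 1e-12 < err_pca
```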

Link:

http://bactra.org/weblog/algae-2018-12.html
