Books to Read While the Algae Grow in Your Fur, September 2024
Three-Toed Sloth 2024-10-02
Summary:
Attention conservation notice: I have no taste, and no qualifications to opine on world history, or even on random matrix theory. Also, most of my reading this month was done at odd hours and/or while bottle-feeding a baby, so I'm less reliable and more cranky than usual.
- Marc Potters and Jean-Philippe Bouchaud, A First Course in Random Matrix Theory: for Physicists, Engineers and Data Scientists, doi:10.1017/9781108768900
- I learned of random matrix theory in graduate school; because of my weird path, it was from May's Stability and Complexity in Model Ecosystems, which I read in 1995--1996. (I never studied nuclear physics and so didn't encounter Wigner's ideas about random Hamiltonians.) In the nearly thirty years since, I've been more or less aware that it exists as a subject, providing opaquely-named results about the distributions of eigenvalues of matrices randomly sampled from various distributions. It has, however, become clear to me that it's relevant to multiple projects I want to pursue, and since I don't have one student working on all of them, I decided to buckle down and learn some math. Fortunately, nowadays this means downloading a pile of textbooks; this is the first of the pile I've finished.
- The thing I feel most confident in saying about the book, given my confessed newbie-ness, is that Potters and Bouchaud are not kidding about their subtitle. This is very, very much physicists' math, which is to say the kind of thing mathematicians call "heuristic" when they're feeling magnanimous *. I am still OK with this, despite years of using and teaching probability theory at a rather different level of rigor/finickiness, but I can imagine heads exploding if those with the wrong background tried to learn from this book. (To be clear, I think more larval statisticians should learn to do physicists' math, because it is really good heuristically.)
- To say just a little about the content, the main tool in here is the "Stieltjes transform", which for an $N\times N$ matrix $\mathbf{A}$ with eigenvalues $\lambda_1, \ldots, \lambda_N$ is the complex-valued function \[ g^{\mathbf{A}}_N(z) = \frac{1}{N}\sum_{i=1}^{N}{\frac{1}{z-\lambda_i}} \] This can actually be seen as a moment-generating function, where the $k^{\mathrm{th}}$ "moment" is the normalized trace $\tau(\mathbf{A}^k) \equiv \frac{1}{N}\mathrm{tr}\,\mathbf{A}^k$, since for large $|z|$ \[ g^{\mathbf{A}}_N(z) = \frac{1}{z}\sum_{k=0}^{\infty}{\frac{\tau(\mathbf{A}^k)}{z^k}} \] (Somewhat unusually for a moment generating function, the dummy variable is $1/z$, not $z$, and one takes the limit of $|z| \rightarrow \infty$ instead of $\rightarrow 0$.)
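- (Being the sort of reader who only trusts a formula after poking it numerically, here is a minimal sketch, in Python, of the definition and the moment expansion. The variance-$1/N$ scaling of the entries, and all the variable names, are my choices for illustration, not the book's.)

```python
# Check that the empirical Stieltjes transform, computed three ways, agrees:
# from the eigenvalues directly, as the normalized trace of the resolvent
# (zI - A)^{-1}, and via a truncated "moment" series in 1/z.
import numpy as np

rng = np.random.default_rng(0)
N = 500
X = rng.standard_normal((N, N))
A = (X + X.T) / np.sqrt(2 * N)   # symmetric, entries of variance 1/N

z = 3.0 + 0.1j                   # test point safely off the spectrum

lam = np.linalg.eigvalsh(A)
g_eigs = np.mean(1.0 / (z - lam))
g_resolvent = np.trace(np.linalg.inv(z * np.eye(N) - A)) / N
# g_N(z) = (1/z) sum_k tau(A^k) / z^k, truncated at 20 terms
g_series = sum(np.trace(np.linalg.matrix_power(A, k)) / N / z**(k + 1)
               for k in range(20))

print(g_eigs, g_resolvent, g_series)   # all three agree to several digits
```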
- The hopes are that (i) $g_N$ will converge to a limiting function as $N\rightarrow\infty$, \[ g(z) = \int{\frac{\rho(d\lambda)}{z-\lambda}} \] and (ii) the limiting distribution $\rho$ of eigenvalues can be extracted from $g(z)$. The second hope is actually the less problematic one mathematically **. Hope (i), the existence of a limiting function, is just assumed here. At a very high level, Potters and Bouchaud's approach is to derive an expression for $g_N(z)$ in terms of $g_{N-1}(z)$, and then invoke assumption (i) to get a single self-consistent equation for the limiting $g(z)$. There are typically multiple solutions to these equations, but usually only one that makes sense, so the others are ignored ***. (A worked example of what this buys you follows below.)
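- (To make this concrete, here is the one worked example I can reproduce from memory; take the entry-scaling convention, variance $\sigma^2/N$, as my assumption rather than a quotation from the book.) For symmetric Gaussian matrices, the self-consistent equation comes out as \[ g(z) = \frac{1}{z - \sigma^2 g(z)} \quad\Rightarrow\quad \sigma^2 g(z)^2 - z g(z) + 1 = 0 \quad\Rightarrow\quad g(z) = \frac{z - \sqrt{z^2 - 4\sigma^2}}{2\sigma^2} \] taking the root with $g(z) \sim 1/z$ as $|z| \rightarrow \infty$ (that's the "solution that makes sense"). Hope (ii) is then delivered by the Stieltjes inversion formula: \[ \rho(\lambda) = -\frac{1}{\pi}\lim_{\epsilon\rightarrow 0^+}{\mathrm{Im}\, g(\lambda+i\epsilon)} = \frac{\sqrt{4\sigma^2-\lambda^2}}{2\pi\sigma^2}, \quad |\lambda| \leq 2\sigma \]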
- At this very high level, Potters and Bouchaud derive limiting distributions of eigenvalues, and in some cases eigenvectors, for a lot of distributions of matrices with random entries: symmetric matrices with IID Gaussian entries, Hermitian matrices with complex Gaussian entries, sample covariance matrices, etc. They also develop results for deterministic matrices perturbed by random noise, and a whole alternate set of derivations based on the replica trick from spin glass theory, which I do not feel up to explaining. These are then carefully applied to topics in estimating sample covariance matrices, especially in the high-dimensional limit where the number of variables grows with the number of observations. This in turn feeds into a final chapter on designing optimal portfolios when covariances have to be estimated by mortals, rather than being revealed by the Oracle.
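- (Again a toy numerical illustration, mine and not the book's: with $N$ variables and $T$ observations of genuinely uncorrelated standard-Gaussian data, the sample covariance eigenvalues do not concentrate at the true value 1, but spread over the Marchenko-Pastur interval $[(1-\sqrt{q})^2, (1+\sqrt{q})^2]$, $q = N/T$.)

```python
# High-dimensional sample covariance: even with true covariance = identity,
# the eigenvalues of the sample covariance spread out when N/T stays fixed.
import numpy as np

rng = np.random.default_rng(1)
N, T = 400, 1600                    # q = N/T = 0.25
X = rng.standard_normal((T, N))     # T observations of N uncorrelated variables
E = X.T @ X / T                     # sample covariance matrix
lam = np.linalg.eigvalsh(E)

q = N / T
lo, hi = (1 - np.sqrt(q))**2, (1 + np.sqrt(q))**2
print(lam.min(), lam.max())         # roughly 0.25 and 2.25, nowhere near 1
print(lo, hi)                       # Marchenko-Pastur edges: 0.25 and 2.25
```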
- My main dissatisfaction with the book is that I left it without any real feeling for why the eigenvalue density of symmetric Gaussian matrices with standard deviation $\sigma$ approaches $\rho(x) = \frac{\sqrt{4\sigma^2 - x^2}}{2\pi \sigma^2}$, but other ensembles have different limiting distributions.
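- (For whatever consolation it's worth, the formula itself checks out numerically; again the code, and the variance-$1/N$ convention setting $\sigma = 1$, are mine.)

```python
# Histogram the spectrum of one large symmetric Gaussian matrix and compare
# to the semicircle density rho(x) = sqrt(4 - x^2) / (2 pi), i.e. sigma = 1.
import numpy as np

rng = np.random.default_rng(2)
N = 2000
X = rng.standard_normal((N, N))
A = (X + X.T) / np.sqrt(2 * N)      # entries of variance 1/N
lam = np.linalg.eigvalsh(A)

hist, edges = np.histogram(lam, bins=20, range=(-2, 2), density=True)
centers = (edges[:-1] + edges[1:]) / 2
rho = np.sqrt(4 - centers**2) / (2 * np.pi)
print(np.max(np.abs(hist - rho)))   # small; mostly binning error near the edges
```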