Books to Read While the Algae Grow in Your Fur, September 2019

Three-Toed Sloth 2020-01-28

Summary:

Attention conservation notice: I have no taste, and no qualifications to discuss ancient history. Also, there's nothing like starting a new class to cut down on reading time.

Cathy O'Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
This is a popular, and polemical, book about the abuses of statistical modeling and optimization in contemporary American society. It's clear, accurate (*), impassioned, and has already (since its 2016 publication) had a bit of an impact on at least academic research on these topics.
Let me indulge in the academic vice of systematizing something that doesn't really need it by imposing a number of binary distinctions on the kinds of things O'Neil discusses. On the one hand, some of them are effective for their intended purpose, and others are not; work-scheduling algorithms that make money for companies by making workers' lives miserable are effective (for the companies), personality tests for job applicants are just rubbish. A cross-cutting distinction is between systems that rely on gathering extensive data and subjecting it to statistical modeling to make predictions and those which don't: the personality tests are (supposedly) relying on statistical predictions, but something like the US News and World Report ranking of colleges is just making stuff up with numbers. Finally, is the system's goal primarily to benefit those being subjected to it, or someone else? Pretty much every system O'Neil discusses is aimed at benefiting someone other than its subjects, but one could imagine a job-scheduling system which (say) tried to find hours for workers which fit other demands on their time while still making sure the coffee shop had enough workers to meet customer demand. (This might involve paying people more to take bad shifts.) O'Neil's ire is mostly about the fact that the systems don't benefit those subjected to or caught up in them. Some of her criticisms are about effectiveness, but that's not really her point. If (to use one of her examples) one could come up with a system which very accurately predicted who would commit further crimes if released pending trial, based purely on their neighborhood of residence, TV shows watched, etc., O'Neil would (I think) still insist it was unjust to treat some people more harshly than others, based not on their legal record but on the conduct of those with whom they share such morally-irrelevant characteristics.
Conversely, if ad targeting is actually very bad at predicting what ads people will respond to, it's not clear that we should judge it any more leniently.
In other words: in a lot of cases which (rightly) incense O'Neil, I can imagine replacing an elaborate statistical model with an astrologer, or a random number generator, and they'd still be outrageous. So the statistical models / big data / data mining / machine learning / "artificial intelligence" aren't really the issue; it's the (attempt at) exploitation and manipulation. Using computers is important here, because it makes exploitation and manipulation more scalable, but the use of statistical modeling is often a secondary concern, though the idea of accurate prediction may be important to the manipulation.
Disclaimer: O'Neil and I know each other slightly, and had an exchange about the distinction (if any) between "data science" and "statistics", back in the Late Bronze Age of blogging.
Addendum, January 2020: Having assigned (most of) the book to my data mining students last semester, I can say that it went over quite well, and I will be using it again when I teach the course.
*: There's a bit where she tries to explain feature selection for predictive modeling, where she glosses "taking a Bayesian approach" as ranking features by importance, which comes across confusedly, but I know she knows better, and I think attempting to explain this without math just resulted in some editing mush.
John W. Lee, The Persian Empire
Lively, modern class on the Achaemenids; good about not just taking the Greek viewpoint(s).

Books to Read While the Algae Grow in Your Fur; Enigmas of Chance; Writing for Antiquity

Link:

http://bactra.org/weblog/algae-2019-09.html

From feeds:

Statistics and Visualization » Three-Toed Sloth
