Probability paradox may be killing thousands
Statistical Modeling, Causal Inference, and Social Science 2016-02-25
Brian Kinghorn points to this news article by Christian Grothoff and J. M. Porup, “The NSA’s SKYNET program may be killing thousands of innocent people; ‘Ridiculously optimistic’ machine learning algorithm is ‘completely bullshit,’ says expert.” The article begins:
In 2014, the former director of both the CIA and NSA proclaimed that “we kill people based on metadata.” Now, a new examination of previously published Snowden documents suggests that many of those people may have been innocent.
Last year, The Intercept published documents detailing the NSA’s SKYNET programme. According to the documents, SKYNET engages in mass surveillance of Pakistan’s mobile phone network, and then uses a machine learning algorithm on the cellular network metadata of 55 million people to try and rate each person’s likelihood of being a terrorist.
The news article displays some leaked documents labeled Top Secret. I don’t know if it’s legal for me to copy them here, but one of them says, “0.18% False Alarm Rate at 50% Miss Rate.” Grothoff and Porup write:
A false positive rate of 0.18 percent across 55 million people would mean 99,000 innocents mislabelled as “terrorists” . . . The leaked NSA slide decks offer strong evidence that thousands of innocent people are being labelled as terrorists; what happens after that, we don’t know.
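The 99,000 figure is simple arithmetic, and a few lines of Python make it easy to check. This is only a sketch: the function name `expected_false_positives` is made up for illustration, and it follows the article in treating essentially all 55 million people as innocent.

```python
# Figures quoted in the news article and the leaked slide.
POPULATION = 55_000_000      # people whose phone metadata was scored
FALSE_ALARM_RATE = 0.0018    # "0.18% False Alarm Rate"


def expected_false_positives(n_innocent, false_alarm_rate):
    """Expected number of innocent people flagged by the classifier.

    Assumes (as the article does) that every one of the n_innocent
    people is flagged independently at the stated false alarm rate.
    """
    return n_innocent * false_alarm_rate


# Round to the nearest person before printing.
print(round(expected_false_positives(POPULATION, FALSE_ALARM_RATE)))
```

Subtracting the small number of actual targets from the population would change the result only slightly, which is why the article's round number is a reasonable approximation.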
Kinghorn writes:
I find this quite disturbing. I’m betting a lot can be chalked up to the Base Rate Fallacy. If Pr(being terrorist) < Pr(flagged by model), then Pr(terrorist | flagged) < Pr(flagged | terrorist).
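Kinghorn's inequality follows directly from Bayes' rule. Here is a short derivation, writing T for "is a terrorist" and F for "flagged by the model" (my shorthand, not notation from the slides), and assuming Pr(F | T) > 0:

```latex
\Pr(T \mid F) = \frac{\Pr(F \mid T)\,\Pr(T)}{\Pr(F)}
\quad\Longrightarrow\quad
\frac{\Pr(T \mid F)}{\Pr(F \mid T)} = \frac{\Pr(T)}{\Pr(F)} < 1
\quad \text{whenever } \Pr(T) < \Pr(F).
```

So the rarer terrorists are relative to flagged people, the further Pr(terrorist | flagged) falls below the detection rate Pr(flagged | terrorist), which the slide's 50% miss rate puts at 0.5.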