Remember how (this is a really brief history lesson) the NSA, CIA, FBI, and many others of those lettered agencies, with their ever-increasing budgets, super-sophisticated computers, and genius-brain mathematicians, statisticians, and computer scientists, were able, through thinking clever thoughts and applying unimaginably exquisite algorithms, to identify and thus prevent 9/11?
And how they took the knowledge they gained through that success, in combination with ever more data culled from private citizens, to stop the Boston bombing before it happened, to thwart the Fort Hood killings, to prevent Benghazi, to defeat the Wisconsin Sikh temple shooting, to discover who sent anthrax through the mail, and to similarly protect us from a few dozen other bloodlettings?
Neither do I remember.
These gross failures have an explanation; actually, two. The first falls under the “If only” rubric. If only the budgets of these bureaucracies weren’t so meager, we would have been safe. If only they were allowed a freer hand. If only they were able to gather more information. If only, endlessly.
The hard news about “If only” is that there is some truth in it. An algorithm designed to ferret out fiends from phone records won’t work if there are no phone records. Solution? Get some. And, with the logic pushed way past the breaking point and over the Cliffs of Insanity (free Princess Bride reference), if having some is good, having them all is better. So get them all, and damn the morality of the act. Lives are on the line. Think of the children.
This line of reasoning is convincing to politicians anxious to increase their grope of the body politic. This is why Dianne Feinstein trotted to the microphones yesterday to say that “I knew about these programs all along. Congress was briefed.” The implication is that since one budget-consuming branch of the government knew what another budget-consuming branch of the government did, there was nothing to worry about. Feel better?
The second explanation for the intelligence strikeout slump is more disheartening and more likely true. The methods the NSA and others are using don’t work.
Or they don’t work too well. Why? Because there is no task more fraught with error, misunderstanding, and misplaced certainty than predicting human behavior. Just when you think you have it pegged, it changes.
We can, for instance, forecast with reasonable accuracy and on most weekdays that the bulk of New York City residents will pile into cars, cabs, and trains round about 5 pm. But this is only most weekdays. Sometimes, for reasons foreseen by nobody, the regular pattern is disrupted and the forecast goes to Hades.
How much harder is it, then, to predict what time Mr. Smith heads home for the day? And then to ask our model to discern whether Smith will vary his route next Tuesday to toss an IED into the Army recruiting center (a real example)? You can ask, and the computer will answer, but it will be talking out its bus.
It won’t be long before some disingenuous politician or bureaucrat says, “These agencies may have been too zealous. But come on. If you have nothing to hide, you have nothing to worry about.”
Bovine spongiography. You have plenty to worry about. Like being falsely suspected or accused. Like having the full might and weight of the Federal Government bear down on you, as the IRS currently does to those it perceives as its enemies. Financial ruin, loss of reputation, careened career—and worse.
Look. Take the smartest guys you can find and have them cobble together a model which predicts whether each citizen is a “potential” terrorist or no. This model will spit out numbers in the form of probabilities. Some of these will be high enough to exceed the “reasonable, articulable suspicion” threshold (the quote is from Feinstein).
At that point even an editorial from the New York Times won’t be able to convince our great brains that they might have the wrong men. The allure of numbers printed out from a computer, physically real, is too strong, even for the people who programmed these computers and who know intimately their many and weep-worthy limitations.
The nature of these models is that there are bound to be many, many more false identifications of terrorists than true ones. The price we pay for these errors is a loss of privacy, and the placement of our secrets into the hands of government. What could go wrong? Everything.
Oh, and the algorithms will also claim some aren’t a threat when they really are. This we know from all-too-frequent experience. Yet the confidence of government that success is ever just around the corner never abates.
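To put numbers on this, here is a back-of-the-envelope sketch in Python. Every figure in it is invented for illustration, not drawn from any real system: assume 300 million people screened, 100 actual terrorists among them, and a model that is right 99% of the time in both directions, which is far more accurate than anything plausible.

```python
# Back-of-the-envelope Bayes calculation. Every number here is an
# assumption chosen for illustration, not a real figure.
population = 300_000_000    # people screened
terrorists = 100            # actual bad actors among them
sensitivity = 0.99          # P(flagged | terrorist)
false_positive_rate = 0.01  # P(flagged | innocent): a very generous 1%

true_positives = terrorists * sensitivity
false_negatives = terrorists - true_positives                  # threats missed
false_positives = (population - terrorists) * false_positive_rate

print(f"True positives:  {true_positives:,.0f}")   # 99
print(f"False negatives: {false_negatives:,.0f}")  # 1
print(f"False positives: {false_positives:,.0f}")  # 2,999,999

# P(actually a terrorist | flagged): the positive predictive value
ppv = true_positives / (true_positives + false_positives)
print(f"Chance a flagged person is a real threat: {ppv:.4%}")  # ~0.0033%
```

Three million innocents flagged for every hundred real threats, and a flag that is wrong more than 99.99% of the time. That is what “reasonable, articulable suspicion” by algorithm looks like.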
Update: Just saw this in today’s WSJ, “Thank You For Data Mining”:
The effectiveness of data-mining is inversely proportional to the size of the sample, so the NSA must sweep broadly to learn what is normal and refine the deviations.
This is false. The more people/records searched, the greater the number of false positives, the costlier the subsequent follow-up searches (each potential terrorist has to be investigated further), and the more lives eventually harmed. The follow-up searches are also not error-free. Blanket screening is rarely a good idea.
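The point is easy to check with the same invented numbers as above: widen the sweep and the false flags, and the hours spent chasing them, grow in lockstep, while the number of real threats available to catch stays fixed.

```python
# How blanket screening scales: false flags grow with the sweep,
# real catches do not. Rates and costs are invented for illustration.
terrorists = 100            # actual bad actors, fixed however wide the net
sensitivity = 0.99          # P(flagged | terrorist)
false_positive_rate = 0.01  # P(flagged | innocent)
hours_per_followup = 40     # assumed investigator-hours to clear one flag

for n_swept in (1_000_000, 10_000_000, 100_000_000, 300_000_000):
    false_flags = n_swept * false_positive_rate  # grows linearly with sweep
    real_catches = terrorists * sensitivity      # capped at 99 regardless
    print(f"{n_swept:>11,} records swept: {false_flags:>9,.0f} false flags, "
          f"{real_catches:.0f} real, "
          f"{false_flags * hours_per_followup:>13,.0f} follow-up hours")
```

Sweeping three hundred times more records buys three hundred times the false flags and follow-up hours, and not a single additional real catch.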
The many who are writing editorials today praising the data collection and computer screening have faith in, but no experience with, statistical modeling.
Update 2: The “inversely” was a mistake by the WSJ editors. They have since removed it from the online version; the mistake remains in the print version.