Computers Can Find Future Criminals? Why, Computers Can Do Anything!

There is a belief—a persistent and beguiling, yet false belief—that a formula exists that can predict anything. This formula will differ, it is thought, by the thing predicted, and it is certain that once the formula is fed into a computer, the future is ours to see.

You might have guessed that our experience with, at the least, quantum mechanics and chaos would have beaten this notion out of our heads. But you would have been wrong. (Philosophical note: I do not claim that the future is unforeseeable by any means, but I do claim that the standard, mathematical, mechanistic view of the universe is an unproductive route for augury.)

For a symptom of over-confidence, take this example provided by reader Michael Kubat:

Before his sentence, the judge in the case received an automatically generated risk score that determined Loomis was likely to commit violent crimes in the future.

There are no such things as risk scores; they don’t exist, at least not in the sense implied by this sentence, which is taken from the article “This Guy Trains Computers to Find Future Criminals” at Bloomberg. Just like you don’t have “a probability” of being stung by a bee or hit by lightning, you do not have a risk or chance or probability that you’ll break the law.

Why? All probability, all risk, all chance is conditional. In order to have “a” probability, probability itself would have to be unconditional. Put it this way. You have “a” height, which, though the units it’s expressed in are relative, is real and measurable (though even height is dependent on conditions; think of measuring yourself near a black hole, where your stature would be diminished). You have no probability in that sense: change the evidence assumed and the number changes with it.

Here’s another example. What’s “the” probability of drawing the Jack of Hearts? There isn’t one. There is one if you assume the conditions that there are fifty-two cards in a deck, just one of which is the Jack of Hearts, and that when drawn only one card will show. But if you change the conditions to there being twenty-two cards, etc., the probability changes (and dramatically).
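
The card example can be written out in a few lines. This is only a sketch: the function name `prob_jack_of_hearts` and its parameters are mine, invented for illustration, and the conditions (a single draw, every card equally likely to show) are assumptions spelled out in the code, not facts about any particular deck.

```python
from fractions import Fraction


def prob_jack_of_hearts(deck_size, jacks_in_deck=1):
    """Probability that a single drawn card is the Jack of Hearts.

    Assumed conditions (hypothetical, for illustration): the deck holds
    `deck_size` cards, `jacks_in_deck` of them are the Jack of Hearts,
    exactly one card is drawn, and each card is equally likely to show.
    """
    if deck_size <= 0 or not 0 <= jacks_in_deck <= deck_size:
        raise ValueError("conditions do not describe a possible deck")
    return Fraction(jacks_in_deck, deck_size)


# Same event, different conditions, different probabilities.
p_given_52 = prob_jack_of_hearts(52)
p_given_22 = prob_jack_of_hearts(22)
print(p_given_52, p_given_22)  # 1/52 1/22
```

Notice there is no way to call the function without supplying the conditions. Leave them out and there is nothing to compute.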

All this means that it’s possible to predict this Loomis will commit a crime, but only conditional on some ad hoc model. Bloomberg says:

Risk scores, generated by algorithms, are an increasingly common factor in sentencing. Computers crunch data—arrests, type of crime committed, and demographic information—and a risk rating is generated. The idea is to create a guide that’s less likely to be subject to unconscious biases, the mood of a judge, or other human shortcomings.

Algorithms are done on computers!

The “unconscious bias” so fretted about won’t be present in the judge who relies on an ad hoc model, but it will exist in the ad hoc model, or the creators of that model. What happens is that everybody thinks the algorithm is unbiased, but this is simply impossible. The biases are the conditions, and conditions must exist for any algorithm to exist.

Richard Berk from the University of Pennsylvania, “a shortish, bald guy” and statistician, is one of the folks pushing the false view of model prowess. “Berk wants to predict at the moment of birth whether people will commit a crime by their 18th birthday, based on factors such as environment and the history of a new child’s parents.”

Of course, he can make such predictions; anybody can. You can, based on patterns in the scatter from your Fruit Loops. Whether these predictions have any skill (a word I use in its technical sense) is another matter entirely. Bloomberg rightly says Berk’s models make people “uncomfortable”, which they should. The danger is that they’ll be assumed to be better than they are because they were made by Science on Bias-Free Computers using Machine Learning. Machines that can learn!
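
For readers who have not met the term, one common way to put a number on skill is to compare a model's score against a naive reference, such as quoting the same base rate for everybody. The sketch below uses the Brier score for this. That choice, the function names, and the ten outcomes are all my own inventions for illustration; nothing here is taken from Berk's work or the Bloomberg article.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between stated probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def skill_score(model_probs, reference_probs, outcomes):
    """1 is perfect, 0 is no better than the reference, negative is worse."""
    bs_model = brier_score(model_probs, outcomes)
    bs_reference = brier_score(reference_probs, outcomes)
    return 1 - bs_model / bs_reference


# Invented outcomes for ten hypothetical people (1 = offended, 0 = did not).
outcomes = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0]

# Naive reference: quote the observed base rate for everyone.
base_rate = sum(outcomes) / len(outcomes)
reference = [base_rate] * len(outcomes)

# A "model" that says 50 percent for everyone, and a clairvoyant one.
coin_flip_model = [0.5] * len(outcomes)
perfect_model = [float(y) for y in outcomes]

coin_flip_skill = skill_score(coin_flip_model, reference, outcomes)
perfect_skill = skill_score(perfect_model, reference, outcomes)
print(coin_flip_skill < 0, perfect_skill)  # True 1.0
```

The coin-flip model makes predictions, as anybody can, yet it does worse than simply quoting the base rate, so its skill is negative. A fair test would also score the model on data it has never seen; the base rate here is computed from the same outcomes only to keep the sketch short.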

Accuracy? “Berk says that in his own work, between 29 percent and 38 percent of predictions about whether someone is low-risk end up being wrong.” Is this from prediction of entirely new data, or from the model fit? My guess is the latter. Anyway, these dismal accuracies are not from at-birth predictions, but are contemporaneous. Predictably (get it?), Berk says “focusing on accuracy misses the point”. Yeah, sure it does. Here’s the frightening bit:

When it comes to crime, sometimes the best answers aren’t the most statistically precise ones. Just like weathermen err on the side of predicting rain because no one wants to get caught without an umbrella, court systems want technology that intentionally overpredicts the risk that any individual is a crime risk.

No no no no no no no no no! No. No. Rubbish. Rot. Nonsense.

What is always wanted is, given the conditions (i.e. the model), the actual probability of the event. No two people make the same decisions based on the weather report, and any shading of the probability one way or the other is, in effect, making the decision for somebody. Same goes for predicting pre-crime, where the decisions are (usually) more consequential. Accuracy always matters.
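
The umbrella argument can be made concrete with the textbook cost-loss setup: pay a cost to protect yourself, or risk a larger loss if the event happens. Everything in this sketch is assumed for illustration (the function names, the two people, their costs and losses, and the 20 percent and 60 percent figures); none of it comes from the article.

```python
def should_protect(p, cost, loss):
    """Protect when the expected loss from doing nothing exceeds the cost."""
    return p * loss > cost


def expected_expense(protect, p_true, cost, loss):
    """Average expense of a choice, judged against the honest probability."""
    return cost if protect else p_true * loss


p_honest = 0.2  # hypothetical probability given the model
p_shaded = 0.6  # the same forecast, deliberately overpredicted

# Two hypothetical people: (cost of protecting, loss if caught out).
people = {"cautious": (1, 10), "relaxed": (5, 10)}

for name, (cost, loss) in people.items():
    honest_choice = should_protect(p_honest, cost, loss)
    shaded_choice = should_protect(p_shaded, cost, loss)
    print(name, honest_choice, shaded_choice)

# What the shading costs the relaxed person, on average.
relaxed_cost, relaxed_loss = people["relaxed"]
extra_expense = (
    expected_expense(True, p_honest, relaxed_cost, relaxed_loss)
    - expected_expense(False, p_honest, relaxed_cost, relaxed_loss)
)
```

Given the honest number the two people choose differently. Given the shaded one they are herded into the same choice, and the relaxed person pays more on average than if left to decide from the real probability. The forecaster has made the decision for them.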

I have much more on these subjects in Uncertainty: The Soul of Modeling, Probability & Statistics.

Categories: Statistics

30 replies »

  1. Very cool stuff but we need a better name for it. We could call it something like “Future Crime” and we could hire special policemen who looked… I don’t know… something like Tom Cruise to go arrest people RIGHT as they were about to do the crimes (so there would be no question that they were actually criminals, of course). And instead of feeding and housing them, we could come up with some kind of suspended animation for the really bad ones, so that they were no danger to guards or each other. What could go wrong?

  2. Science fiction author Isaac Asimov provided some foundational (get it?) work on this subject that inspired some to trust in it.
    “psychologist Martin Seligman identifies the Foundation series as one of the most important influences in his professional life, because of the possibility of predictive sociology based on psychological principles.
    Paul Krugman, winner of the 2008 Nobel Memorial Prize in Economic Sciences, credits the Foundation series with turning his mind to economics, as the closest existing science to psychohistory.”
    https://en.wikipedia.org/wiki/Foundation_series

  3. Bloomberg: Risk scores, generated by algorithms, are an increasingly common factor in sentencing.

    Note the words “in sentencing”. You have a particular individual who HAS committed a crime and therefore does have a criminal record. I frankly don’t understand the objection to assigning a risk factor here. It’s not baseless. Just because some standardized assessment is made doesn’t mean the judge can’t use other information when arriving at a sentence. It IS unbiased in the sense that the assessment will be the same regardless of which judge is presiding.

    It’s also not really the same thing as the novel. You aren’t arresting and subsequently convicting anyone of a future crime. A very real problem arises, though, if the assessment is used to restrict those who have committed no crimes or exhibited criminal tendencies — like the No Fly list.

  4. It’s not assigning a risk factor that is bothersome. You can do that based on how you like the looks of the convicted. It’s the wild aspirations of the designers of the algorithm, their fantasy that they have eliminated bias rather than simply automated it, and their contempt for accuracy.

    Not to mention the shock when they learn that bias is based solely on undesired outcomes.

  5. This is basically saying the criminal is guilty of future crimes which he has not yet committed. So much for “innocent until proven guilty”. Now it’s “Commit a crime and be punished for future yet-to-be-committed crimes too”.

    DAV: This is exactly how we get people on the Sex Offenders registry because they were 18 years, 1 day old and had sex with the 15 year old they had been having sex with for a year. Until 18 years, 1 day, it was legal. Now, the person is marked for life. It denies reality. Algorithms only see what the program allows them to, in black and white. Crimes are rarely black and white.

  6. But the climate scientists claim they can write a computer program to predict the future.

  7. Sheri — Exactly. How do we put the blurriness of the example you stated into a machine? We can assign a number to it … RIGHT? UGH! That number is YOTA no matter how it is derived. That number is from the beginning biased.

    There are 12 year olds who understand. There are 40 year olds who don’t. There are a lot of 12 year olds that do not understand. There are a lot of 40 year olds that do. Defining what it is we are talking about understanding is the murky thing.

    Harriet Hall has this piece in Skeptic Magazine (http://www.skeptic.com/reading_room/uncertainty-in-medicine). The way I read it, she is saying the same thing that our bending host here is trying to say. (Please feel free to think I am crazy for thinking this, dear host.)

  8. DAV brings up an important idea, which is that of standardization. A ‘standardized’ test is simply one that is graded the same for all. Thus, in standardized tests, there does truly exist a ‘lack of bias’ in how the test is scored. The ‘algorithm’, as it were, for scoring the test is applied equally for all.

    The cognitive, and psychological, problem is that we are apt to conflate this one particular ‘lack of bias’ with an absolute and generalized lack of bias. But, as Matt correctly points out, bias is not and can not be eliminated by standardization/algorithm.

    It is unfortunate indeed that statisticians and Peer-Reviewed (TM) scientists commonly make this mistake.

    Separate from this error of conflation, of course, is the error of reifying ‘probability,’ so that real or possible things are misunderstood to ‘have’ it.

  9. Sheri,
    This is exactly how we get people on the Sex Offenders registry because they were 18 years, 1 day old and had sex with the 15 year old they had been having sex with for a year

    Especially if you watch TV shows, particularly those attempting social engineering. Unfortunately, that seems to be most of them nowadays. There are only 10 states which have set the age of consent at 18. I see Wyoming is one of them. From TV, a lot of people think the age of consent is 18 everywhere. Most states take the age differential into consideration and wouldn’t waste time on such a case. Even Wyoming does: “6-2-316. Sexual abuse of a minor in the third degree. … Being seventeen (17) years of age or older, the actor knowingly takes immodest, immoral or indecent liberties with a victim who is less than seventeen (17) years of age and the victim is at least four (4) years younger than the actor.” Strangely, they saw the need to explain what those number names mean.

    Not sure what this has to do with risk assessments though perhaps a lot to do with prosecutorial overzealousness bordering on misconduct.

  10. Heck, I remember when the age was 21. And in former ages it was 14 (for men) and 12 (for women).

    I heard an episode of DRAGNET on the radio the other day. (One of the stations is an old-time radio station.) It involved a child abduction and sexual molestation. Heavy hitting for those days. At the end, the narrator summed up the fates of the characters and said, “The child was returned to her parents.”

    The child was 19.

  11. Sheri,

    “This is exactly how we get people on the Sex Offenders registry because they were 18 years, 1 day old and had sex with the 15 year old they had been having sex with for a year. Until 18 years, 1 day, it was legal. Now, the person is marked for life. It denies reality.”

    Not quite.

    Two issues:

    1. The age of consent is not 18 in all states, in fact not even in a majority of states. In most states it is 17; in some it is 16. Someone over the age of consent can have sex with anyone also over the age of consent. Statutory rape only applies when one or both are under the age of consent.

    2. Romeo and Juliet laws creating exceptions to statutory rape laws for minor/minor sex or minor/adult where the age difference is small are a relatively recent phenomenon and not all states have one yet. The first was passed in 2007.

    If your hypothetical couple are in a state that doesn’t have one yet, their relationship was illegal the entire time.

    If they were in a state with a 17 or 16 age of consent, and no close in age exception, it was illegal the entire time.

  12. I forgot to add.

    The whole push to pass Romeo and Juliet laws was because of several cases that became high profile where a minor was convicted of statutory rape and put on a sex offender registry.

  13. DAV: You seem intent on missing the point of my comment by ignoring the spirit of the comment and going for the age of consent, etc. So, to clarify, whatever age of consent (AC) exists and whatever age difference (AD) is used, if the AC is x and Fred is x plus one day, and AD is four years and his sex partner WAS not four years younger than Fred until Fred passed the AC, Fred becomes a sex offender. Whether or not this is prosecuted is irrelevant. It’s the law. It can be prosecuted.
    There’s also drinking (under 21, minor in possession/birthday=okay to get drunk as you like), smoking, driving, etc. One day changes everything. We constantly pretend that a birthdate magically makes something okay. That’s because we need standardized rules that have no relationship to reality. That’s what this algorithm is—another mindless rule.

    YOS: Exactly.

  14. That’s what this algorithm is—another mindless rule.

    What you don’t want are situations where you have no way to predict what could land you in trouble.

    Take the IRS for example. I read a story a while back where a pilot in Maine was using his airplane as a business. He went to his local, friendly IRS office and they told him what they expected. He later moved to Virginia, was called in for an audit, and was told, “Nope! You owe back taxes”. When he pointed out what the Maine office had said the reply was “That was there. This is here”. You might expect that with different state agencies but not when dealing with an apparent single entity like the IRS, which seems to have next to impossible to discover jurisdictional areas where you will be treated differently depending on which you happen to be in.

    A similar situation occurs within state governments and not just in administration cases. In some places, certain laws are more aggressively prosecuted depending on where you live even within a given state. I’ve heard that there are four prosecutors who are responsible for the majority of death sentence cases within the last 30 years.

    So, yes, mindless or not, standardization is needed. The government is getting too large and its already written rules — not to mention the unwritten ones — are being applied capriciously. There’s supposedly a remedy called the appeals court but even there you takes your chances. Still, regardless of the outcome you are in legal Hell.

    The only real remedy is to stop making certain actions illegal. Until that happens, at least things like the idea of age differential in statutory rape cases, at the minimum, is recognizing the grey areas and attempting to resolve the close issues in a standardized way.

  15. Does not free will by itself refute the idea that the probability of a person doing crime in future can be calculated?

  16. Quantum mechanics and chaos do not mean that the future cannot be predicted.
    1) Physics only applies to inanimate things.
    2) QM and chaos mean that the answer would be probabilistic.

  17. Hold on a second. The computerized offender risk score algorithm will necessarily have to take race into account. If it didn’t, as we’ve learned from recent posts on this very blog, there will be a strong bias against certain racial groups, defeating the whole purpose of freeing the process from unconscious biases.

    To be completely free of human bias, the algorithm will have to be self-optimizing, continuously adjusting the weighting parameters to achieve risk scores that are race-neutral (erring on the side of disfavoring privileged groups, for purposes of maximizing accountability).

    An excellent candidate pool with the needed skills for developing these computer programs is readily available: those developers who have cut their teeth on climate modeling.

  18. …Accuracy? “Berk says that in his own work, between 29 percent and 38 percent of predictions about whether someone is low-risk end up being wrong.”…

    Interestingly, the very concept of ‘wrong’ is itself subject to bias and is conditional on an ad-hoc model.

    Where something is mathematically wrong, that is conditional on an ad-hoc model which we currently believe is true throughout this Universe, and therefore in a sense a priori. But for a ‘social’ prediction to be ‘wrong’ – this depends on a lot of personal interpretation about what happened and why.

    You can’t avoid bias when dealing with pretty much all human decision-making. The best you can do is understand this and make some reference to the basis of your decision so that the mechanics of it are a bit more evident…

  19. JohnK: This problem became evident when blacks claimed ALL standardized tests were biased toward whites and demanded changes.

    MattS: Quite interesting. I did not know there were Romeo and Juliet laws (wonder what Shakespeare would think of that usage of his characters…..). I noted that in Illinois: “In Illinois the age of consent is 17. Our law says that it’s impossible for a child under the age of 17 to knowingly consent to sex. Even if he or she voluntarily engages in sex, even if he or she brings up the subject and suggests sex, even if he or she initiates sex . . .. If two 15 or 16 year olds have sex they each may be charged with criminal sexual abuse of the other.” One would think, based on years of progressive and media insistence that you cannot stop teens from having sex, that any computer algorithm concerning this crime would predict near 100% chance of continued lawbreaking and recommend solitary confinement until the age of consent is reached. That or admit children can give consent and get rid of the law. (I suspect it’s rarely if ever used anyway.)

    DAV: A flat tax fixes the whole IRS problem: no algorithm needed. As for more vigorous prosecution in some states, I think that was what the founding fathers had in mind with state rights and keeping the feds from ruling the entire country. One could argue that countries have the same problem and create a world count. Progressives are drooling over that suggestion even as they read this.
    What makes you think the government rules will be applied fairly? Read up on Hillary and get back to me. You can’t make anything mandatory enough to be fairly applied in reality. I agree that many actions that are currently illegal should not be, but those that must be outlawed will still be unfairly applied no matter what.

    Milton H: Or the future need for the developers will be filled with more high-tech foreign workers Congress thinks the tech industry needs.

    Dodgy: I noticed that too. But remember, in social science, 25% appears to be enough to declare cause and the theory as being correct. Social science has a very low threshold for accuracy.

  20. ALL standardized tests were biased toward whites and demanded changes.

    Actually, they seem to have been biased in favor of Asians and Ashkenazi Jews. Those clever white test-makers.

  21. “That or admit children can give consent and get rid of the law. (I suspect it’s rarely if ever used anyway.)”

    Oh, it gets used a lot. A lot of it is older men 30+ on prepubescent kids. Which I think is proper to criminalize. The problem is that when the statutory laws were written, they didn’t even consider issues with late teens having sex with each other. And much of that may or may not have been an issue at the time because the age of consent has changed since.

    Prosecutions of minors for statutory rape have happened, probably mostly because of demands by irate parents of one of the kids getting in the face of police and the local prosecutor and demanding that they do something, never mind that their precious ‘innocent’ child was a full willing participant.

    Some degree of gender bias is in play too; it’s almost always the girl’s parents demanding that the boy be prosecuted for corrupting their daughter.

  22. The argument about age of consent and statutory rape laws seems beside the point. Isn’t the heart of the issue here that the model derives from population statistics, whose application to an individual, even one from that same population, constitutes a fallacy? Medical doctors make this fallacy all the time.

  23. YOS: I would go with that, though those protesting refused to believe it.

    K.Kilty: Yes, we got off topic to a degree. I was trying to make the point that rules that are inflexible can have undesirable consequences, but the issue seems to take on a life of its own. I couldn’t think of a different example and as a result, my point was completely obscured. I’ll try to come up with a less emotional example so we can stick to the subject.
    Good point on the doctors making the same error—population statistics may not apply to the individual.

  24. Maybe to further clarify, there are studies showing property crimes have the highest recidivism. If we use an algorithm, do we hold committers to be in need of the most incarceration or at least supervised probation? Do we rank crimes according to how much they are assumed to harm society and then prescribe sentencing? Who decides how much crimes harm society? Feminists? You males are in trouble if they do. Environmentalists? Behaviorists? As Briggs said, there are conditions involved: likelihood to re-offend, harm to society, how much a crime is hated by the public. Whoever creates the algorithms is going to weight these things. You have a programmer deciding or maybe a committee. Either one is scary.

  25. Romeo and Juliet laws date from 2007? That is news to me. I was aware of these as a teenager, myself, in the 70s. Link to the statutory definition of rape in the first degree for one state is here: http://codes.lp.findlaw.com/alcode/13A/6/4/13A-6-61, and note that the Legislative Act listed first is 1977 (and it is older than that, but I’d have to go dig up old dead-tree versions of the Code). The lesser degrees can be searched as desired.

  26. @K. Kilty, there is a reason why medical diagnoses, derived from a procedure called differential diagnosis, are called dia-gnoses. They are not unconditional knowledge. That population based statistics get misused is another issue and one that government driven ‘evidence’ based medicine may, under certain conditions, kill people, lots of people; and that there are people who labor under the delusion of Equality, who don’t mind that the ‘undesirables’, however defined and mostly post-hoc, do lose their material, wet chemistry, form of life.

  27. We used a Level of Supervision Inventory for many years to determine risk to reoffend. The inventors had a lengthy check list but it really came down to three predictors – age, gender, and occupation. Of course the fourth – race – was excluded from the ‘conditions’ because the inventory would be exposed for what it was – a biased means of generalizing about an individual with no degree of certainty about how they would turn out. There are many ways we delude ourselves into believing we can characterize and predict humans – my pet peeve – psychopathy.
