Courts To Use Risk Scores More Frequently. Analysis Found Scores Unreliable And Racially Biased
Wednesday, May 25, 2016
ProPublica investigated the use of risk assessment scores by the courts and justice system in the United States:
"... risk assessments — are increasingly common in courtrooms across the nation. They are used to inform decisions about who can be set free at every stage of the criminal justice system, from assigning bond amounts... to even more fundamental decisions about defendants’ freedom. In Arizona, Colorado, Delaware, Kentucky, Louisiana, Oklahoma, Virginia, Washington and Wisconsin, the results of such assessments are given to judges during criminal sentencing. Rating a defendant’s risk of future crime is often done in conjunction with an evaluation of a defendant’s rehabilitation needs. The Justice Department’s National Institute of Corrections now encourages the use of such combined assessments at every stage of the criminal justice process. And a landmark sentencing reform bill currently pending in Congress would mandate the use of such assessments in federal prisons."
Some important background:
"In 2014, then U.S. Attorney General Eric Holder warned that the risk scores might be injecting bias into the courts. He called for the U.S. Sentencing Commission to study their use... The sentencing commission did not, however, launch a study of risk scores. So ProPublica did, as part of a larger examination of the powerful, largely hidden effect of algorithms in American life. [ProPublica] obtained the risk scores assigned to more than 7,000 people arrested in Broward County, Florida, in 2013 and 2014 and checked to see how many were charged with new crimes over the next two years, the same benchmark used by the creators of the algorithm."
ProPublica analyzed data for Broward County in the State of Florida, and found the risk assessment scores to be unreliable:
"... in forecasting violent crime: Only 20 percent of the people predicted to commit violent crimes actually went on to do so. When a full range of crimes were taken into account — including misdemeanors such as driving with an expired license — the algorithm was somewhat more accurate than a coin flip. Of those deemed likely to re-offend, 61 percent were arrested for any subsequent crimes within two years."
ProPublica also found biases based upon race:
"In forecasting who would re-offend, the algorithm made mistakes with black and white defendants at roughly the same rate but in very different ways. The formula was particularly likely to falsely flag black defendants as future criminals, wrongly labeling them this way at almost twice the rate as white defendants. White defendants were mislabeled as low risk more often than black defendants."
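To see how an algorithm can make mistakes "at roughly the same rate but in very different ways," it may help to work through some arithmetic. The sketch below uses invented counts for two hypothetical groups, not ProPublica's data or Northpointe's method, and the class and group names are made up for illustration. Both groups have the same overall error rate, yet one is wrongly flagged as high risk far more often and the other is wrongly labeled low risk far more often.

```python
from dataclasses import dataclass


@dataclass
class Outcomes:
    """Counts for one group. All numbers used below are invented for illustration."""
    true_pos: int   # labeled high risk, did re-offend
    false_pos: int  # labeled high risk, did not re-offend
    true_neg: int   # labeled low risk, did not re-offend
    false_neg: int  # labeled low risk, did re-offend

    def error_rate(self) -> float:
        # Share of all people in the group who were labeled incorrectly.
        total = self.true_pos + self.false_pos + self.true_neg + self.false_neg
        return (self.false_pos + self.false_neg) / total

    def false_positive_rate(self) -> float:
        # Among people who did NOT re-offend, the share labeled high risk.
        return self.false_pos / (self.false_pos + self.true_neg)

    def false_negative_rate(self) -> float:
        # Among people who DID re-offend, the share labeled low risk.
        return self.false_neg / (self.false_neg + self.true_pos)


# Hypothetical groups of 1,000 people each (not real data).
group_a = Outcomes(true_pos=350, false_pos=250, true_neg=300, false_neg=100)
group_b = Outcomes(true_pos=200, false_pos=100, true_neg=450, false_neg=250)

for name, group in (("Group A", group_a), ("Group B", group_b)):
    print(
        f"{name}: overall error {group.error_rate():.0%}, "
        f"wrongly labeled high risk {group.false_positive_rate():.0%}, "
        f"wrongly labeled low risk {group.false_negative_rate():.0%}"
    )
```

In this made-up example the overall error rate is identical for both groups, so a single accuracy figure would hide the difference. Only splitting the mistakes into the two kinds shows which group bears which type of error.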
ProPublica re-checked the analysis. Same results. Northpointe, the for-profit company that produced the Broward County, Florida risk scores, disagreed:
"... it criticized ProPublica’s methodology and defended the accuracy of its test: “Northpointe does not agree that the results of your analysis, or the claims being made based upon that analysis, are correct or that they accurately reflect the outcomes from the application of the model.” Northpointe’s software is among the most widely used assessment tools in the country. The company does not publicly disclose the calculations used to arrive at defendants’ risk scores, so it is not possible for either defendants or the public to see what might be driving the disparity... Northpointe’s core product is a set of scores derived from 137 questions that are either answered by defendants or pulled from criminal records. Race is not one of the questions..."
Formed in 1989, Northpointe is a wholly owned subsidiary of the Volaris Group. Northpointe works with a variety of federal, state, and local justice agencies in the United States and Canada. The company's website states that it also works with policy makers.
Besides Northpointe, several companies provide risk assessment tools to courts and the judicial system. The National Center For State Courts (NCSC) provides a list of risk assessment tools (Adobe PDF).
All of this points to a larger problem. Risk scores still haven't been adequately studied, and the techniques behind them haven't been vetted:
"There have been few independent studies of these criminal risk assessments. In 2013, researchers Sarah Desmarais and Jay Singh examined 19 different risk methodologies used in the United States and found that “in most cases, validity had only been examined in one or two studies” and that “frequently, those investigations were completed by the same people who developed the instrument.” Their analysis of the research through 2012 found that the tools “were moderate at best in terms of predictive validity,”... there have been some attempts to explore racial disparities in risk scores. One 2016 study examined the validity of a risk assessment tool, not Northpointe’s, used to make probation decisions for about 35,000 federal convicts. The researchers, Jennifer Skeem at University of California, Berkeley, and Christopher T. Lowenkamp from the Administrative Office of the U.S. Courts, found that blacks did get a higher average score but concluded the differences were not attributable to bias."
I wonder if the biases found started in the data rather than in the algorithm. The algorithm may have been developed and tested using existing prison populations, which are known to be skewed and which reflect overly aggressive policing via school-to-prison pipelines and for-profit prisons in many states. Both the State of Florida and Broward County have histories with school-to-prison pipelines.
Plus, it seems crazy to make decisions about people's lives based upon scores without knowing how the scores were calculated, and without adequate research or vetting of the techniques. Transparency matters.