Today’s predictive technology introduces a new risk of prejudice on a massive, automated scale. Predictive analytics predicts what each person will do so that companies and government agencies can operate more effectively. In some cases, this technology does drive life-changing decisions. Large organizations like Hewlett-Packard inform human resource decisions by predicting whether each employee is likely to quit, and states such as Oregon and Pennsylvania analytically predict whether each convict will commit crime again (recidivism) in order to make sentencing and parole decisions.
Given this influence on individual lives, predictive analytics can introduce prejudice in two ways:
1. Prediction of minority status. Fueled with data, computers can automatically detect a person’s minority status. A new study from the University of Cambridge shows that race, age, and sexual orientation can be accurately inferred from one’s Facebook likes. This capacity to predict grants marketers and other researchers access to unvolunteered demographic information. Some of these people may be keen to manage and use this information appropriately, but have not necessarily been trained to do so.
2. Prediction with minority status. When utilizing predictive analytics, it is difficult to avoid incorporating minority status into the predictive model as one basis of prediction. Nowhere is this threat more apparent than in law enforcement, where computers have become respected advisers that have the attention of judges and parole boards.
While science promises to improve the effectiveness of law enforcement, an organization that formalizes and quantifies its decision making can inadvertently entrench existing prejudices against minorities. Why? Because prejudice is cyclical, a self-fulfilling prophecy, and this cycling could be intensified by the deployment of predictive analytics.
Crime prediction systems calculate a criminal’s probability of recidivism based on things like the individual’s age, gender, and neighborhood, as well as prior crimes, arrests, and incarcerations. No government-sponsored predictive model explicitly incorporates ethnicity or minority status.
However, ethnicity creeps into the model indirectly. Philadelphia has developed a recidivism prediction model that incorporates the offender’s ZIP code, which is known to correlate strongly with race.
Similarly, terrorist prediction models can factor in religion. Levitt and Dubner’s book SuperFreakonomics details a search for suspects among data held by a large U.K. bank (not a government-sponsored analysis). Informed in part by attributes of the 9/11 perpetrators, as well as other known terrorists, a fraud detection analyst at the bank pinpointed a very specific group of customers to forward to the authorities. This micro-segment was defined by factors such as the types of bank accounts opened, existence of wire transfers and other transactions, record of a mobile phone, status as a student who rents, and a lack of life insurance (since suicide nullifies the policy). But, to get the list of suspects down to a manageable size, the analyst filtered out people with non-Muslim names, as well as those who made ATM withdrawals on Friday afternoons, the absence of such withdrawals serving, admittedly, as a proxy for practicing Muslims.
But even if such factors are kept out of predictive models, it’s still difficult to avoid involving minority status. Bernard Harcourt, a professor of both political science and law at the University of Chicago, and author of Against Prediction: Profiling, Policing, and Punishing in an Actuarial Age, told The Atlantic that minority group members who are discriminated against by law enforcement, such as by way of profiling, are proportionately more likely to show a prior criminal record, since they may be screened more often. This artificially inflates the minority group’s incidence of criminal records. Rather than race being a predictor of prior offenses, prior offenses (prosecutions) are indicative of race. By factoring in prior offenses in order to predict future crimes, “you just inscribe the racial discrimination you have today into the future.” It’s a cyclical magnification of prejudice’s already self-fulfilling prophecy.
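The inflation Harcourt describes can be illustrated with a few lines of arithmetic. The sketch below is only a toy calculation under stated assumptions: the offense rate, the two screening rates, and the names `OFFENSE_RATE`, `SCREENING_RATE`, and `record_rate` are all hypothetical and do not come from any real model or data set. It assumes both groups offend at exactly the same rate and differ only in how often they are screened.

```python
# Toy illustration with invented numbers; nothing here is real data.
OFFENSE_RATE = 0.10  # assumed: identical true offense rate in both groups

# Assumed: probability that an offense is detected and recorded,
# i.e. how heavily each group is screened by law enforcement.
SCREENING_RATE = {"group_a": 0.20, "group_b": 0.40}


def record_rate(offense_rate: float, screening_rate: float) -> float:
    """Expected share of a group that ends up with a criminal record."""
    return offense_rate * screening_rate


rates = {
    group: record_rate(OFFENSE_RATE, screening)
    for group, screening in SCREENING_RATE.items()
}

# How many times more likely a group_b member is to carry a record.
ratio = rates["group_b"] / rates["group_a"]

for group, rate in rates.items():
    print(f"{group}: {rate:.1%} of members have a prior record")
print(f"group_b is {ratio:.1f}x as likely to show a prior record")
```

In this toy setting, a model that scores risk from prior records would rate the more heavily screened group as riskier even though the underlying behavior is identical by construction.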
Even Ellen Kurtz, the research director at Philadelphia’s Adult Probation and Parole Department and a champion of crime-predicting computer models, admits, “If you wanted to remove everything correlated with race, you couldn’t use anything. That’s the reality of life in America.”
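To see why dropping a sensitive field does not remove it from the analysis, consider a rough sketch of how well a proxy can recover it. Everything below is hypothetical: the ZIP codes `00001` and `00002`, the group labels, the resident counts, and the helper names are invented for illustration and do not describe Philadelphia’s model or any real population.

```python
from collections import Counter, defaultdict

# Hypothetical residents as (zip_code, group) pairs; invented for illustration.
RESIDENTS = (
    [("00001", "a")] * 90 + [("00001", "b")] * 10
    + [("00002", "a")] * 15 + [("00002", "b")] * 85
)


def majority_group_by_zip(residents):
    """Map each ZIP code to the group that is most common there."""
    counts = defaultdict(Counter)
    for zip_code, group in residents:
        counts[zip_code][group] += 1
    return {z: c.most_common(1)[0][0] for z, c in counts.items()}


def proxy_accuracy(residents):
    """Share of residents whose group is guessed correctly from ZIP alone."""
    guess = majority_group_by_zip(residents)
    hits = sum(1 for zip_code, group in residents if guess[zip_code] == group)
    return hits / len(residents)


accuracy = proxy_accuracy(RESIDENTS)
print(f"Group recovered from ZIP code alone for {accuracy:.1%} of residents")
```

With these made-up counts, ZIP code alone identifies the group for most residents, so a model given only ZIP code still carries much of the information that was nominally left out.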
There’s no clean way out. Although designed for the betterment of decision making, data mining inadvertently digs up some dirty injustices. In principle, the math getting us in trouble could also remedy the problem by quantifying prejudice. But that could be done only by introducing the very data element that so far remains outside the analysis, albeit inside the eye of every profiling police officer: race.
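One way such quantification might look is a per-group comparison of error rates, sketched below. This is an assumption about what an audit could measure, not a method described in this article, and the records, group labels, and the `false_positive_rate` helper are hypothetical. Note that the calculation cannot be run at all without the group column.

```python
# Hypothetical audit records: (group, flagged_high_risk, reoffended).
# All counts are invented for illustration.
RECORDS = (
    [("a", True, False)] * 10 + [("a", False, False)] * 40
    + [("a", True, True)] * 30 + [("a", False, True)] * 20
    + [("b", True, False)] * 25 + [("b", False, False)] * 25
    + [("b", True, True)] * 30 + [("b", False, True)] * 20
)


def false_positive_rate(records, group):
    """Among people in `group` who did not reoffend, the share flagged high risk."""
    non_reoffenders = [r for r in records if r[0] == group and not r[2]]
    if not non_reoffenders:
        raise ValueError(f"no non-reoffenders recorded for group {group!r}")
    flagged = sum(1 for r in non_reoffenders if r[1])
    return flagged / len(non_reoffenders)


fpr = {group: false_positive_rate(RECORDS, group) for group in ("a", "b")}
for group, rate in fpr.items():
    print(f"group {group}: {rate:.0%} of non-reoffenders were flagged high risk")
```

A gap between the two rates would be one possible signal of unequal treatment, though which measure is appropriate is itself a judgment call.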
Eric Siegel, Ph.D., is the founder of Predictive Analytics World (www.pawcon.com)—coming in 2013 to Toronto, San Francisco, Chicago, Washington D.C., Boston, Berlin, and London—and the author of Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die (February 2013, published by Wiley). For more information about predictive analytics, see the Predictive Analytics Guide (www.pawcon.com/guide).