President Donald Trump plans to collect a lot more data about crimes committed by immigrants. This will inevitably give him a weapon to use against them, thanks to a peculiarity of crime statistics: If you look for something, you’ll almost always find more of it.
Trump recently started two initiatives focused on crime. He has promised to create a new office in the Department of Homeland Security, Victims of Immigration Crime Engagement, to collect data on the transgressions of immigrants. And in his revised executive order suspending visas and refugee admissions from certain countries, he called for a public database on “honor killings,” defined as gender-based violence against women by foreign nationals.
It’s hard to get to the truth about crime. One could even argue that we don’t really have crime data at all. Rather, we have information on arrests and reports, neither of which is a great proxy for actual crimes. A lot of criminal activity (drug use, small-time theft, trespassing, turnstile jumping) never gets recorded unless a police officer happens to be present. Most rapes go unreported, and as many as a third of all homicides are never solved.
The incompleteness of the data means that what we decide to collect can have a big impact on what we see. If we spend a lot of time and energy finding and documenting crimes committed by a certain subpopulation, we’ll naturally increase its prominence. This wouldn’t mean that such people are more criminal. They’re simply getting a different level of scrutiny.
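To make the arithmetic concrete, here is a minimal sketch with entirely made-up numbers. It assumes two groups of the same size that offend at the same true rate, and that differ only in how likely an offense is to be noticed and recorded. The group labels, rates and detection probabilities are hypothetical, not estimates of anything real.

```python
def recorded_offenses(population, true_rate, detection_prob):
    """Expected number of offenses that end up in the records."""
    return population * true_rate * detection_prob


# All figures below are hypothetical and chosen only for illustration.
groups = {
    "lightly scrutinized": {
        "population": 100_000,
        "true_rate": 0.05,        # same underlying behavior in both groups
        "detection_prob": 0.10,   # 1 in 10 offenses gets recorded
    },
    "heavily scrutinized": {
        "population": 100_000,
        "true_rate": 0.05,
        "detection_prob": 0.30,   # 3 in 10 offenses gets recorded
    },
}

# Recorded offenses per 1,000 people: the figure that would show up in a report.
recorded_per_1000 = {
    name: 1000 * recorded_offenses(**g) / g["population"]
    for name, g in groups.items()
}

for name, rate in recorded_per_1000.items():
    print(f"{name}: {rate:.1f} recorded offenses per 1,000 people")
```

Under these assumptions the heavily scrutinized group shows three times the recorded rate, although the true rates are identical by construction. The only thing that changed is how hard anyone looked.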
Consider how police departments have focused on nuisance crimes in poor and minority neighborhoods — part of a broader strategy known as broken-windows policing. Blacks ended up getting arrested for smoking marijuana a lot more often than whites — even though people of both races actually use the stuff at about the same rate. Similarly, the Chicago Police Accountability Task Force found that black drivers were much more likely than white drivers to be stopped on suspicion of carrying contraband, even though they were less likely to be actually carrying illegal goods.
Despite the obvious flaws in arrest data, we still use them in designing policies. Police departments send more officers to areas where they make the most arrests. Judges consider previous arrests in deciding how harshly to sentence. Computer algorithms use the data to predict where crimes will occur (“predictive policing”), how much bail to demand and whether to free prisoners on parole (“recidivism risk”).
All those decisions are as biased as the data on which they are based, an ongoing problem for poor people and minorities, who find themselves increasingly surveilled and incarcerated.
Perhaps you’ve already heard the statistic that immigrants are involved in less crime than native-born Americans (despite concerns over gang activity). If we start overscrutinizing immigrants from Muslim-majority countries, the numbers might well change to their detriment, giving the Trump administration the fodder it needs to engage in yet more profiling that purports to ensure the nation’s security.
To be fair, and to be scientific about it, we should choose another subpopulation for equal focus, so we can measure the effects of our added attention. I suggest starting with politicians.
Cathy O’Neil is a mathematician who founded ORCAA, an algorithmic auditing company, and is the author of “Weapons of Math Destruction.” She wrote this for Bloomberg View.