New study on COVID-19, race and prison populations shows the power of ‘big data’ in the fight for equality

We need more Black researchers with the training to unmask society’s ills

Preparing for the world’s fair held in Paris in 1900, scholar W.E.B. Du Bois and colleagues created a series of beautiful stylized graphs that used data to illustrate the state of Black America. More than a century later, this project is still inspiring Black scholars to use data as an instrument in the fight for equality.

Unfortunately, Black people remain underrepresented in modern fields relevant to the era of big data such as data analytics, statistics, mathematics and computer science. Yet the tools these fields use are increasingly necessary to understand the complexity of the modern world, from medical research to economic policy to public health. Now is the time for Black and brown communities to lean into data science as one of our best weapons in the quest to uncover and conquer the legacy of racism in American institutions.

In a new study published Wednesday in the journal Nature, I and colleagues at Yale, Northeastern, Harvard and the Santa Fe Institute examined how the coronavirus pandemic influenced racial dynamics in the U.S. prison system. To do this, members of the team (led by Brennan Klein) assembled an enormous data set on the population dynamics of prisons in all 50 states and Washington over the last 20 years. This data revealed a shocking pattern: While the pandemic led to the largest overall drop in the prison population in American history (a decline of more than 17%), the proportion of incarcerated Black people increased in nearly every state (and roughly 1% nationally). Our research team — comprising network scientists, mathematical biologists, physicians, historians, and others — set out to understand why.

While admissions to state and federal prisons dropped significantly during lockdown, the differences between racial groups were minimal, with white individuals experiencing a smaller decrease in prison admissions than Black people in the pandemic era. Even more surprising, the pattern of who got out due to targeted release (because of COVID-19 illness risk) does not appear to be disproportionate enough in favor of white prisoners to be a main explanation for the post-lockdown increase in the proportion of Black and brown incarcerated individuals.

How, then, do we explain the results? We believe that differences in sentencing patterns between Black and white individuals are one of the main culprits. Here’s what we think happened: With the pandemic lockdowns affecting the operations of the criminal court system like almost everything else in the country, prison admissions declined significantly across the board, leading to a net loss of individuals from the system due to people being released. There were two paths to release: those arising from specific policies aimed at reducing exposure to the coronavirus, and standard releases for individuals who completed their sentences. Because Black people are sentenced for longer on average, they were overrepresented among those who remained in the prison system during the pandemic lockdown period. Besides issues of fairness in sentencing, this is a public health concern, because it means they were at an increased risk of infection from the coronavirus.
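The arithmetic behind this mechanism can be illustrated with a toy calculation. The sketch below is not the study's model and uses no real data: the function name `simulate_share`, the group labels "A" and "B", and every number are invented for illustration. It assumes each group starts at a steady state, that a fixed fraction of each group is released every month (one over its average sentence length), and that admissions fall by the same fraction for both groups.

```python
def simulate_share(months, admissions, mean_sentence, drop):
    """Return group B's share of the prison population, month by month.

    admissions    -- hypothetical monthly admissions per group before lockdown
    mean_sentence -- hypothetical average sentence length per group, in months
    drop          -- fraction by which admissions fall for every group alike
    """
    # Start at a steady state, where population = admissions * mean sentence.
    pop = {g: admissions[g] * mean_sentence[g] for g in admissions}
    shares = []
    for _ in range(months):
        shares.append(pop["B"] / sum(pop.values()))
        for g in pop:
            # Assumption: a constant fraction of each group finishes its sentence.
            releases = pop[g] / mean_sentence[g]
            pop[g] += admissions[g] * (1 - drop) - releases
    return shares


# Invented numbers: group B is sentenced for longer on average than group A.
toy_admissions = {"A": 100, "B": 50}
toy_sentences = {"A": 24, "B": 36}

shares = simulate_share(12, toy_admissions, toy_sentences, drop=0.5)
print(f"Group B share at start: {shares[0]:.3f}")
print(f"Group B share after a year: {shares[-1]:.3f}")
```

In this toy setup the admissions cut is identical for both groups, yet group B's share of those still incarcerated rises over the year, simply because its members leave more slowly. With `drop=0` the share stays flat.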

Why is this finding so surprising? Because the race disparity that appeared after lockdown was not the result of an obvious difference in how Black and white individuals were admitted. That is, even when many of the forces driving who arrives at and leaves prison appear relatively race-neutral, there are other pernicious forces (in this case, the racial difference in sentencing) that contribute to a prison disparity. And we were only able to identify how this happened using the tools of big data.

In some ways, the fact that a subgroup of Black people (this one in federal and state prisons) was negatively affected by the pandemic is unsurprising: Disparities in the impact of COVID-19 have been one of the central themes of the pandemic. And the larger idea that the forces that stratify society manifest in the impact of disease goes well beyond COVID-19.

The challenge in understanding these disparities and their causes involves scarcity in two related, critical areas: (1) quality data and (2) data scientists who can interpret it.

Big data on social phenomena is often generated by government institutions, companies, think tanks, academic departments and other bodies that track the status of populations for any number of reasons. And they often fail to capture information about the most disenfranchised populations. For example, while crime and policing are the focus of a lot of attention, it can be challenging to find high-quality data on individuals who are incarcerated.

One of the keys to fixing this problem — the existence of what I call “data deserts” for poor and Black and brown populations — is for more scholars, activists, and thinkers from affected communities (and allies) to use data science as a weapon for social progress. Building a strong workforce of Black data scientists will not only lead to more powerful analysis, but, as we showed, to the generation of better data. This positive feedback loop will drive rapid societal change.

This is not a new idea: It was a motivation for many of Du Bois’ intellectual works. One of his masterpieces, The Philadelphia Negro (1899), highlighted Black life in Philadelphia using data as a tool. And thankfully, we need not go back a century to find courageous thinkers who use data science and related quantitative methods to ask questions highly relevant to communities of color. Contemporary scientists such as Ruha Benjamin, Elaine Nsoesie, Lorin Crawford, and Keolu Fox use data-driven approaches to research the role of racism in algorithms, public health, and human genomics. Collectives such as Black Women in Computational Biology provide support for up-and-coming scholars.

This is inspiring because Black scientists are especially underrepresented in quantitative and computational subfields, even relative to other science, technology, engineering and math fields. I believe that these skills are essential to “lift the veil” on how forces such as racism and classism shape our society, creating unequal outcomes in prison, policing, education, health care, and many other areas. For me, this moment is an exciting one: We have a chance to use these new methods and amplify new voices in the next fight for civil rights.

Du Bois once spoke about his adventures: “Crucified on the vast wheel of time, I flew round and round with the Zeitgeist, waving my pen and lifting faint voices to explain, expound and exhort; to see, foresee and prophesy, to the few who could or would listen.”

In the past, scholars like Du Bois and countless others relied on the pen, the sermon, and the picket sign to speak truth to power. It is my great hope that the work of data scientists will demonstrate that we can do the same using equations and computer code.

C. Brandon Ogbunu, a New York City native, is a computational biologist at Yale University. His popular writing takes place at the intersection between sports, data science, and culture.