Unmasking bias in healthcare data: A conversation with Michelle Birkett
Northwestern expert explores how biased data systems perpetuate health inequities and the pathways toward more equitable solutions

Northwestern faculty member Michelle Birkett

Northwestern’s Michelle Birkett, PhD, is a leading researcher dedicated to addressing the health disparities faced by marginalized populations, particularly sexual and gender minorities (SGM) and racial minorities. As director of the Center for Computational and Social Sciences in Health (COMPASS) at Northwestern University’s Institute for Artificial Intelligence in Medicine, Dr. Birkett’s research focuses on uncovering how bias embedded in healthcare data systems reinforces and perpetuates existing health inequities. Her work challenges the assumption that data is neutral and highlights the critical need to examine how missing or incomplete data on minority populations can obscure health outcomes and hinder efforts toward equity.

Dr. Birkett’s research explores the social and environmental factors—such as racism and homophobia—that drive health disparities. She emphasizes the importance of capturing data not only on individuals but also on the broader societal structures that influence health. Through her work, she advocates for transdisciplinary approaches and collaboration with community members to address biases in data collection and analysis, ensuring that research leads to actionable insights that promote health equity.

We recently spoke with Birkett about her work.

Question: How does bias in big data, particularly within healthcare, reinforce existing disparities in SGM and racial minority health outcomes?

Michelle Birkett: Data is often assumed to be free of bias, but data systems only mirror the existing social world. Furthermore, humans determine what data is captured, assembled, and analyzed – and each of these decisions shapes the underlying structure and limits the questions that can be asked. For example, one common problem is that data on race, sexual orientation, or gender identity is frequently missing. So, while we know that minority populations and those at the intersection of multiple marginalized identities experience massive disparities across a broad range of health outcomes, if data isn’t captured on these identities, then the health inequities of those populations are rendered invisible. Furthermore, many datasets focus only on individuals and their behaviors versus the upstream drivers of health inequities. So, while identity is important to measure, we must go beyond individual identity to begin capturing data on marginalized people’s experiences, which shape their health. Race itself isn’t a ‘risk factor,’ but racism is. And the ways in which racism, homophobia, or other forms of bias structure society are numerous.

Question: Can you say more about how social and contextual factors such as the neighborhood environment or societal biases (e.g., racism, homophobia) shape health outcomes, and what role these mechanisms play in addressing or perpetuating health inequities?

Birkett: The health disparities we measure at the population level are only projections of the social and environmental experiences that influence the health and wellbeing of minority individuals. So it is not racial or sexual and gender minority individuals themselves – it is peer, family, and societal reactions to marginalized people that shape their health and expose them to risk. This might look like a lack of social support or increased rejection by family members. This might also look like having access to neighborhoods with lower air quality, living in homes with poorer water quality, or attending the poorest schools with the lowest resources. My work is focused on HIV. The individual-level risk factors typically associated with the disease (e.g., high-risk sexual behaviors, more sexual partners, illicit drug use) are not predictive of the immense racial disparities. However, there is growing evidence that the people and places marginalized individuals have access to differ from [those available to] majority populations. These differences in social and contextual environments pool risk and provide fewer resources.

Question: What strategies can researchers use to mitigate this bias in their work?

Birkett: One of the most important things researchers can do is to acknowledge their own limitations and blind spots – and to seek collaborations with individuals who hold complementary expertise. In particular, researchers must find meaningful ways to bring in those with lived experience and community members whose voices might otherwise be missing. We also need transdisciplinary collaboration that joins social scientists and health researchers with in-depth knowledge of the populations and the systems under investigation, and investigators who are at the forefront of innovative data collection and analytic techniques. Our work is better when it is done together.

By Matt Golosinski