Breakthrough research from Stanford Law School is shedding new light on how large language models (LLMs) like ChatGPT exhibit racial biases, yielding troubling disparities in their outputs based solely on whether a name is perceived as belonging to a white or Black individual.
In the study, titled “What’s in a Name? Auditing Large Language Models for Race and Gender Bias,” Stanford Law Professor Julian Nyarko and his co-authors employed an audit design commonly used to measure bias in domains like housing and employment. They constructed a series of prompts involving different scenarios, such as buying a used car or predicting outcomes of chess matches and elections. The only variable they altered in each prompt was the name of the individual, using names highly associated with Black or white races based on prior research.
What Nyarko’s team found was striking. Across an array of contexts, the LLMs consistently advantaged names perceived as white while disadvantaging those perceived as Black, especially Black women. According to the Stanford study, large language models exhibited particularly stark biases against names associated with Black women, yielding the least favorable outcomes across various contexts. This suggests that the intersectionality of race and gender compounded the disparities, with Black women facing a “double jeopardy” of discrimination.
For instance, when asked how much one should offer to buy a used bicycle from someone named “Jamal Washington” versus “Logan Becker,” the model suggested paying Jamal, the Black individual, only half as much as Logan, the white individual. Similar racial disparities emerged in prompts related to hiring, elections, sports, and more.
As Nyarko explains, these biases arise from how LLMs are trained on massive datasets that reflect real-world inequities. If a model is trained partly on Craigslist posts where Black sellers tend to list used cars at lower prices than white sellers, it will learn that association and reproduce it in its own outputs. In effect, the models assimilate stereotypical connections between race and socioeconomic status, intellectual ability, popularity, and other attributes.
More context in the prompts sometimes lessened but didn’t eliminate the biases. For the car buying scenario, specifying that the vehicle was a “2015 Toyota Corolla” narrowed the racial gap in suggested purchase prices but didn’t close it entirely. Notably, the one factor that did eliminate bias was providing a quantitative price anchor, like the Kelley Blue Book value. With that numerical grounding, the LLM gave consistent responses regardless of implied race.
These results illuminate a subtle but insidious form of algorithmic unfairness that emerges even when race isn’t explicitly mentioned. Unlike cases of overt discrimination that LLMs can be programmed to avoid, these biases operate under the surface, skewing outputs based on proxies like names. Crucially, users receiving biased results are typically unaware of the tilted playing field.
So what does this mean for the millions of Black individuals increasingly relying on LLMs for tasks like writing assistance, analysis, question-answering, and more? Put simply, it means potentially receiving inferior service for no reason other than their name. Whether asking for essay-writing tips, coding help, or career advice, Black users are at risk of getting skewed outputs that put them at a disadvantage.
Of course, bias in LLMs isn’t a new revelation, and responsible AI developers have implemented various measures to mitigate unfairness. Many LLMs now have safeguards to avoid generating overtly prejudiced content. Some can even be prompted to flag their own potential biases. But as this study highlights, that’s not enough. Models still amplify real-world inequities in subtle ways, and relying only on reactive, piece-by-piece fixes leaves many issues unaddressed.
What’s needed is a fundamental shift in how the AI community approaches this problem. Proactively auditing models for disparities using techniques like Nyarko’s should be an industry standard, not an afterthought. When biases are found, addressing them head-on with algorithmic solutions and more inclusive training data should be a top priority.
There are also deeper questions to grapple with around using attributes like names as demographic proxies. From a narrow technical perspective, factoring in socioeconomic status when giving financial advice, for instance, could yield more tailored guidance that improves user satisfaction. But that doesn’t mean using names – inaccurate proxies laden with historical baggage – is acceptable. These determinations demand thoughtful human judgment, not implicit algorithms.
For now, Black users should be aware that mainstream LLMs come with an asterisk. Even when not mentioning race directly, simply including one’s name risks activating latent associations and skewing the quality of service. Demanding transparency around model audits and pushing developers to uphold fairness as a core principle can help. But progress will require a collective effort.
The rise of accessible LLMs presents transformative opportunities to democratize knowledge and empower marginalized communities. But realizing that potential means confronting uncomfortable truths about how technologies can perpetuate society’s biases and working diligently to chart a more just path forward. What’s in a name? For LLMs and the people using them, the answer is still too much. Recognizing that is the first step to building artificial intelligence that works equally well for everyone.