African American English (AAE) influences LLMs towards discrimination

Bias has always been a problem in AI, but a new study shows that it’s covertly integrated into language models with potentially catastrophic consequences.

In what has already been heralded as a landmark study, a team of researchers including Valentin Hofmann, Pratyusha Ria Kalluri, Dan Jurafsky, and Sharese King documented how large language models (LLMs) discriminate against speakers of African American English (AAE).

In short, the study tests whether dialect and word choice change an LLM’s behavior, with a focus on bias and discrimination.

We know that LLM outputs are highly sensitive to the input. Even small deviations in spelling and style can influence outputs.

But does this mean certain inputs – e.g., those typed in AAE – produce biased outputs? If so, what are the possible consequences?

To answer these questions, the researchers analyzed the prejudices held by 12 LLMs against AAE, revealing biases that match or exceed those typically held by humans. The study is available on arXiv.

The researchers then applied their findings to societal domains such as employment and criminal justice, where AI decision-making is becoming more common.

Hofmann described the study methodology on X: “We analyze dialect prejudice in LLMs using Matched Guise Probing: we embed African American English and Standardized American English (SAE) texts in prompts that ask for properties of the speakers who have uttered the texts, and compare the model predictions for the two types of input.”

— Valentin Hofmann (@vjhofmann) March 4, 2024

This method allows the team to directly compare the responses of LLMs to AAE versus SAE inputs, unmasking the covert biases that would otherwise remain obscured.
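To make the comparison concrete, here is a minimal sketch of the idea in Python. It is not the authors' code: the prompt template, the example sentence pair, the trait words, and the `toy_log_prob` scorer are all hypothetical stand-ins, and the numbers in the toy scorer are made up for illustration rather than measured from any model. In practice, `log_prob` would be replaced by a function that asks a real LLM how likely a trait word is after a given prompt.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical prompt template; the study's actual prompts may differ.
TEMPLATE = 'A person who says "{text}" is'


def association_scores(
    pairs: List[Tuple[str, str]],
    traits: List[str],
    log_prob: Callable[[str, str], float],
) -> Dict[str, float]:
    """For each trait, average log p(trait | AAE prompt) - log p(trait | SAE prompt).

    A positive score means the trait is more strongly associated with the
    AAE version of the text; a negative score means the SAE version.
    """
    scores: Dict[str, float] = {}
    for trait in traits:
        diffs = []
        for aae_text, sae_text in pairs:
            aae_prompt = TEMPLATE.format(text=aae_text)
            sae_prompt = TEMPLATE.format(text=sae_text)
            diffs.append(log_prob(aae_prompt, trait) - log_prob(sae_prompt, trait))
        scores[trait] = sum(diffs) / len(diffs)
    return scores


def toy_log_prob(prompt: str, continuation: str) -> float:
    """Stand-in scorer with made-up probabilities (NOT output from any model)."""
    has_marker = "finna" in prompt
    table = {
        "intelligent": 0.10 if has_marker else 0.20,
        "lazy": 0.20 if has_marker else 0.10,
    }
    return math.log(table.get(continuation, 0.05))


# Illustrative matched pair: same meaning, different dialect (assumed example).
pairs = [("I'm finna go to the store", "I'm going to go to the store")]
traits = ["intelligent", "lazy"]

scores = association_scores(pairs, traits, toy_log_prob)
for trait, score in scores.items():
    print(f"{trait}: {score:+.3f}")
```

With the toy scorer, "intelligent" comes out negative and "lazy" positive only because the stand-in was written that way. The point of the sketch is the structure: the model is never told the speaker's race, and any gap between the two scores comes from the dialect of the embedded text alone.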

The study’s findings are unsettling, to say the least.

Hofmann notes, “We find that the covert, raciolinguistic stereotypes about speakers of African American English embodied by LLMs are more negative than any human stereotypes about African Americans ever experimentally recorded, although closest to the ones from before the civil rights movement.”

This suggests that the biases present in LLMs are not merely reflections of contemporary stereotypes but are more aligned with prejudices that many believed society had moved beyond.

One of the most concerning aspects of the study is the specific linguistic triggers of bias.

Hofmann elaborates, “What is it specifically about African American English texts that evokes dialect prejudice in LLMs? We show that the covert stereotypes are directly linked to individual linguistic features of African American English, such as the use of ‘finna’ as a future marker.”

This indicates that the prejudice is not just against the use of AAE in general but is tied to the distinct linguistic elements that characterize the dialect.

The potential for harm

The potential for harm from such biases is immense. Previous studies have already demonstrated how AI systems tend to fail women, darker-skinned individuals, and other marginalized groups.

Until recently, AI systems were routinely trained on unrepresentative datasets. Some, like MIT’s Tiny Images, created in 2008, were later withdrawn due to sexism and racism.

One influential 2018 study, Gender Shades, evaluated three commercial gender classification systems and found error rates of up to 34.7% for darker-skinned women, compared with at most 0.8% for lighter-skinned men.

The impacts are stark, with healthcare models exhibiting high rates of skin cancer misdiagnosis among those with darker skin tones and prejudiced predictive policing models disproportionately targeting Black people.

AI is already in growing use across the public sector, from crime and policing to welfare and the economy. Addressing fundamental bias in sophisticated AI systems is critical if this is to continue.

Building on this research, Hofmann’s team investigated how LLM bias could play out in several hypothetical scenarios.

Hofmann shared, “Focusing on the areas of employment and criminality, we find that the potential for harm is massive.”

Specifically, LLMs were found to assign less prestigious jobs and suggest harsher criminal judgments against speakers of AAE.

Hofmann wrote on X: “First, our experiments show that LLMs assign significantly less prestigious jobs to speakers of African American English compared to speakers of Standardized American English, even though they are not overtly told that the speakers are African American.”

Hofmann warns, “Our results point to two risks: that users mistake decreasing levels of overt prejudice for a sign that racism in LLMs has been solved when LLMs are in fact reaching increasing levels of covert prejudice.”

Hofmann continued on X: “Second, when LLMs are asked to pass judgment on defendants who committed murder, they choose the death penalty more often when the defendants speak African American English rather than Standardized American English, again without being overtly told that they are African American.”

The study also finds that removing these biases is technically challenging.

The authors write, “We show that existing methods for alleviating racial bias in language models such as human feedback training do not mitigate the dialect prejudice, but can exacerbate the discrepancy between covert and overt stereotypes, by teaching language models to superficially conceal the racism that they maintain on a deeper level.”

It’s plausible that these biases extend to other dialects or cultural-linguistic variations. More research is needed to understand how LLM performance varies with linguistic input and cultural patterns of use.

The study concludes with a call to action for the AI research community and society at large. Addressing these biases is paramount as AI systems become increasingly embedded across society.

To date, however, the systematically embedded bias of some AI systems remains a problem that developers seem willing to overlook in their race for AI supremacy.
