MIT Engineers Uncover Bias in Large Language Models

MIT scientists have discovered why large language models favor the beginnings and endings of texts, and they propose a theoretical framework that explains this bias and points to design changes that could mitigate it, potentially improving AI reliability across multiple fields.

MIT researchers have uncovered a critical flaw in large language models (LLMs) that biases them towards information at the beginning and end of documents while overlooking the middle. This tendency, known as “position bias,” was identified through an innovative theoretical framework aimed at enhancing the reliability and accuracy of these models.

In practical terms, this bias might mean that a virtual assistant sifting through a lengthy legal document could miss crucial information if it’s buried in the middle. The bias results from how these models process input data, which could lead to inconsistencies in various applications.

“These models are black boxes, so as an LLM user, you probably don’t know that position bias can cause your model to be inconsistent. You just feed it your documents in whatever order you want and expect it to work. But by understanding the underlying mechanism of these black-box models better, we can improve them by addressing these limitations,” first author Xinyi Wu, a graduate student in the MIT Institute for Data, Systems and Society (IDSS) and the Laboratory for Information and Decision Systems (LIDS), said in a news release.

Decoding the Bias

LLMs, such as GPT-4 and Claude, rely on a neural network architecture known as a transformer. Transformers use a mechanism called attention, which lets the model build context by weighing how strongly each word in a sequence relates to the others.
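As a rough illustration of that mechanism (not the researchers' code), a minimal single-head self-attention computation in NumPy might look like the sketch below; the projection matrices and shapes are assumptions chosen for clarity.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Minimal single-head self-attention over a sequence of token embeddings.

    X:          (seq_len, d_model) token embeddings
    Wq, Wk, Wv: (d_model, d_head) projection matrices
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # project tokens to queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # how strongly each token relates to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                            # each output mixes information from related tokens
```

Each row of the attention weights says how much one word "looks at" every other word, which is what lets the model pull in context from across the sequence.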

However, the researchers found that design choices in how these models handle data, such as the use of attention masks and positional encodings, contribute to position bias.

The study employed a graph-based framework to analyze these design elements. Through this analysis, the team discovered that causal masking, which restricts each word to attending only to the words that precede it, inherently biases the model towards the beginning of the input even when those initial words are less important.
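To make the masking idea concrete, here is an illustrative sketch (again, not the paper's implementation) of the same attention computation with a causal mask: entries above the diagonal are blocked before the softmax, so a token can never attend to tokens that come after it. Because the earliest tokens remain visible to every later position, attention mass can keep accumulating on them layer after layer, which is one intuition for the bias.

```python
import numpy as np

def causal_self_attention(X, Wq, Wk, Wv):
    """Single-head attention with a causal mask: token i may only attend to tokens 0..i."""
    seq_len = X.shape[0]
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)  # True above the diagonal
    scores = np.where(mask, -np.inf, scores)                      # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```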

“Graphs are a flexible language to describe the dependent relationship among words within the attention mechanism and trace them across multiple layers,” added Wu.

These biases can be detrimental, especially in applications outside natural language generation, like information retrieval or ranking.

“While it is often true that earlier words and later words in a sentence are more important, if an LLM is used on a task that is not natural language generation, like ranking or information retrieval, these biases can be extremely harmful,” Wu added.

Experimental Insights and Future Directions

To further explore this phenomenon, the researchers conducted experiments by varying the position of the correct answer in text sequences.

They observed a “lost-in-the-middle” trend, where the model performed best when the target information was at the beginning or end of a sequence, faltering in the middle.
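A simplified version of this kind of probe could slide a known fact through a long context and record retrieval accuracy at each position. The sketch below is hypothetical: the `model.generate(prompt)` interface and the helper's parameters are assumptions, not the researchers' protocol.

```python
def position_sweep(model, filler_sentences, fact, question, expected, positions=10):
    """Hypothetical sketch: move a key fact through a long context, track accuracy by position.

    `model` is assumed to expose a simple `generate(prompt) -> str` interface.
    """
    accuracy = []
    for i in range(positions):
        insert_at = int(i / (positions - 1) * len(filler_sentences))
        context = filler_sentences[:insert_at] + [fact] + filler_sentences[insert_at:]
        prompt = " ".join(context) + "\n\nQuestion: " + question
        answer = model.generate(prompt)
        accuracy.append(expected.lower() in answer.lower())
    return accuracy  # a U-shaped curve here reflects the "lost-in-the-middle" pattern
```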

Their findings suggest that tweaking the design, such as using alternate masking techniques or minimizing extra layers in the attention mechanism, can mitigate this bias and enhance model accuracy.

“By doing a combination of theory and experiments, we were able to look at the consequences of model design choices that weren’t clear at the time. If you want to use a model in high-stakes applications, you must know when it will work, when it won’t, and why,” added co-senior author Ali Jadbabaie, a professor and head of the Department of Civil and Environmental Engineering, a core faculty member of IDSS and a principal investigator in LIDS. 

Next, the team aims to delve deeper into the effects of positional encodings and explore how position bias might be advantageous in certain applications.

Impact and Significance

This breakthrough has wide-ranging implications. Improved LLMs could result in more reliable chatbots, fairer medical AI systems and more attentive code assistants.

By addressing position bias, this research helps make LLMs more robust and reliable across various domains.

The research will be presented at the International Conference on Machine Learning.

Source: Massachusetts Institute of Technology