Challenges of Bias in AI Models
In today’s hospitals and clinics, a dermatologist may use an artificial intelligence model for classifying skin lesions to assess if the lesion is at risk of developing into a cancer or if it is benign. But if the model is biased toward certain skin tones, it could fail to identify a high-risk patient.
Perhaps one of the best known and most persistent challenges that AI research continues to reckon with is bias. Bias is often discussed in relation to training data, but model architecture can also contain and amplify bias, negatively influencing model performance in real-world settings. In high-stakes medical scenarios, the very real consequences of poor performance have made bias into a quintessential safety issue.
Novel Debiasing Approach: WRING
A new paper from researchers at MIT, Worcester Polytechnic Institute, and Google, accepted to the 2026 International Conference on Learning Representations (ICLR), proposes a novel debiasing approach called “Weighted Rotational DebiasING” (WRING) that can be applied to vision language models (VLMs) such as OpenCLIP, an open-source implementation of OpenAI’s CLIP.
VLMs are multimodal models that can interpret different data modalities, such as video, images, and text, simultaneously. Debiasing approaches for VLMs already exist, but the most commonly used one, known as “projection debiasing,” leads to what has been termed the “Whac-A-Mole dilemma,” an empirical observation formally introduced to AI research in 2023: mitigating one bias can amplify others.
Drawbacks of Projection Debiasing
Projection debiasing is a post-processing approach that removes undesirable, biased information from model embeddings by “projecting out” the subspace that encodes it from the representation space, thereby cutting out the bias. But this approach has its drawbacks.
“When you do that, you inadvertently squish everything around,” says Walter Gerych, the paper’s first author, who conducted this research last year as a postdoc at MIT. “All the other relationships that the model learns change when you do that.”
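The “squish” Gerych describes can be seen in a toy example. The sketch below assumes a single, known bias direction and uses hypothetical 3-D vectors rather than real VLM embeddings; the helper names `project_out` and `cosine` are illustrative and do not come from the paper.

```python
import numpy as np

def project_out(embeddings, bias_direction):
    """Remove the component of each row that lies along bias_direction."""
    v = bias_direction / np.linalg.norm(bias_direction)
    return embeddings - np.outer(embeddings @ v, v)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Assumption: axis 0 plays the role of the bias direction in this toy space.
bias_direction = np.array([1.0, 0.0, 0.0])

# Hypothetical embeddings, not taken from any real model.
embeddings = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
])

debiased = project_out(embeddings, bias_direction)

before = cosine(embeddings[0], embeddings[1])
after = cosine(debiased[0], debiased[1])
print(f"similarity before: {before:.2f}, after: {after:.2f}")
```

In this toy case the two vectors start with a cosine similarity of 0.5 and end at 0.0 once the bias axis is projected out, even though the relationship between them was not the target of the debiasing.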
Introducing WRING
WRING works by moving certain coordinates within the high-dimensional space of a model — the ones that appear to be responsible for bias — to a different angle, so the model can no longer distinguish between different groups within a certain concept. This changes the representation within a specific space while leaving the model’s other relationships intact. And like projection debiasing, WRING is a post-processing approach, which means it can be applied “on the fly” to a pre-trained VLM.
“People already spent a lot of resources, a lot of money, training these huge models, and we don’t really want to go in and modify something during training because then you have to start from scratch,” Gerych explains. “[WRING is] very efficient. It doesn’t require more training of the model and it’s minimally invasive.”
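WRING's actual procedure, including how it picks the coordinates to move and how it weights them, is defined in the paper and is not reproduced here. As rough geometric intuition only, the sketch below shows a generic rotation confined to one 2-D plane of a toy 3-D space: coordinates inside the plane move to a new angle, while everything orthogonal to the plane, and the vector's length, stays the same. The function name `plane_rotation`, the choice of plane, and the angle are all assumptions made for illustration.

```python
import numpy as np

def plane_rotation(u, v, theta):
    """Rotate by theta within the plane spanned by orthonormal u and v.

    Vectors orthogonal to both u and v are left unchanged.
    """
    d = u.shape[0]
    return (
        np.eye(d)
        + (np.cos(theta) - 1.0) * (np.outer(u, u) + np.outer(v, v))
        + np.sin(theta) * (np.outer(v, u) - np.outer(u, v))
    )

# Assumption: axis 0 is the bias direction, and axis 1 is where we rotate it to.
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
R = plane_rotation(u, v, np.pi / 2)

# Hypothetical embedding with a bias component (axis 0) and other content (axis 2).
x = np.array([1.0, 0.0, 2.0])
rotated = R @ x

print(np.round(rotated, 6))                        # coordinates after rotation
print(np.linalg.norm(x), np.linalg.norm(rotated))  # lengths before and after
```

Unlike projection, which deletes a direction and shrinks the vector, this kind of rotation keeps the vector's length and leaves the out-of-plane coordinate (axis 2 here) untouched.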
Results and Future Directions
In their results, the researchers found that WRING significantly reduced bias for a target concept without increasing bias in other areas. But for now, the approach is somewhat limited to Contrastive Language-Image Pre-training (CLIP) models, a type of VLM that connects images to language for search or classification.
“Extending this for ChatGPT-style, generative language models, is the reasonable next step for us,” says Gerych.
This work was supported, in part, by a National Science Foundation CAREER Award, AI2050 Award Early Career Fellowship, Sloan Research Fellow Award, the Gordon and Betty Moore Foundation Award, and MIT-Google Computing Innovation Award.

