Md. Faiyaz Abdullah Sayeedi, Md. Mahbub Alam, Subhey Sadi Rahman, Md. Adnanul Islam, Jannatul Ferdous Deepti, Tasnim Mohiuddin, Md Mofijul Islam, Swakkhar Shatabda
This study introduces Translation Tangles, a framework to evaluate translation quality and fairness in multilingual LLMs, highlighting performance and bias issues across languages and domains.
Large Language Models (LLMs) have significantly improved machine translation, offering fluent translations across many languages and domains. However, these models often perform inconsistently, particularly on less common languages and specialized topics, and they can perpetuate biases present in their training data, raising fairness concerns. To address these issues, the researchers developed Translation Tangles, a framework for assessing translation quality and fairness, and created a bias-annotated dataset to support improving the performance and equity of LLM-based translation.