Yuxuan Liu, Yuntian Shi, Kun Wang, Haoting Shen, Kun Yang
CSR-Bench is a benchmark for evaluating the cross-modal safety and reliability of multimodal large language models, revealing systematic alignment gaps and trade-offs between over-rejection and safety.
Researchers have developed a new benchmark, CSR-Bench, to test how well multimodal large language models (MLLMs) handle text and images jointly. These models often fail to grasp the combined intent of the two modalities, leading to problems such as bias, hallucination, and safety failures. Evaluating 16 MLLMs, the study found that models struggle to align text and image inputs, often defaulting to text dominance and exhibiting safety weaknesses. The study also highlights a trade-off: models tuned to reject fewer inputs (reducing over-rejection) may compromise on safe and fair behavior.
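CSR-Bench's actual harness and metrics are not reproduced in this summary, but the over-rejection vs. safety trade-off it measures can be illustrated with a minimal evaluation loop. The sketch below is an assumption-laden illustration, not the benchmark's API: the names `CrossModalCase`, `classify_response`, and `evaluate`, the keyword-based refusal detector, and the two metric definitions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class CrossModalCase:
    """One benchmark item: an image/text pair plus a ground-truth label.
    Hypothetical structure; CSR-Bench's real item schema is not shown here."""
    image_path: str
    prompt: str
    is_harmful: bool  # True if the joint intent of image + text is unsafe

def classify_response(response: str) -> str:
    """Toy response classifier returning 'refusal', 'unsafe', or 'safe'.
    A real benchmark would likely use a trained judge model, not keywords."""
    lowered = response.lower()
    if any(kw in lowered for kw in ("i can't", "i cannot", "i won't")):
        return "refusal"
    if "unsafe-content-marker" in lowered:  # placeholder heuristic
        return "unsafe"
    return "safe"

def evaluate(model: Callable[[str, str], str],
             cases: Iterable[CrossModalCase]) -> dict:
    """Score a model on the two sides of the trade-off:
    - over_rejection: fraction of benign image/text pairs refused
    - unsafe_compliance: fraction of harmful pairs answered without refusal
    """
    benign_refused = benign_total = 0
    harmful_complied = harmful_total = 0
    for case in cases:
        verdict = classify_response(model(case.image_path, case.prompt))
        if case.is_harmful:
            harmful_total += 1
            harmful_complied += verdict != "refusal"
        else:
            benign_total += 1
            benign_refused += verdict == "refusal"
    return {
        "over_rejection": benign_refused / max(benign_total, 1),
        "unsafe_compliance": harmful_complied / max(harmful_total, 1),
    }
```

Under this framing, plotting `over_rejection` against `unsafe_compliance` across the 16 evaluated models would surface the trade-off the paper describes: pushing the first metric down tends to push the second up.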