Guanzong Wu, Zihao Zhu, Siwei Lyu, Baoyuan Wu
The paper introduces a novel framework using Toxicity Association Graphs to detect covert toxicity in multimodal data, offering a new metric for measuring hidden toxicity and outperforming existing methods in interpretability and detection accuracy.
Detecting hidden toxicity in content that combines different types of media, such as text and images, is challenging because harmful meanings can emerge only when these elements are viewed together. This research introduces a new method that uses graphs to model how seemingly harmless elements can combine to create toxic meanings. The researchers created a new metric to measure how well hidden this toxicity is and developed a dataset to test their approach. Their method not only detects hidden toxicity more accurately than existing techniques but also provides clear explanations of how it reaches its conclusions.
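To make the graph idea concrete, here is a minimal toy sketch. It is not the paper's algorithm or its metric. It assumes a hand-written association graph in which each edge links an element to a concept it evokes, and it flags a text-image pair only when both elements lead to the same toxic concept. All names (`GRAPH`, `TOXIC`, `distances`, `joint_toxic_links`) and all node labels are hypothetical placeholders, and the hop count is only an illustrative stand-in for how indirect the association is.

```python
from collections import deque

# Hypothetical association graph: element or concept -> concepts it evokes.
# Every node label is a made-up placeholder, not data from the paper.
GRAPH = {
    "text:phrase_a": ["concept_x"],
    "image:object_b": ["concept_y"],
    "image:object_c": ["concept_w"],
    "concept_x": ["toxic_concept_z"],
    "concept_y": ["toxic_concept_z"],
}
# Hypothetical set of concepts labeled as toxic.
TOXIC = {"toxic_concept_z"}


def distances(graph, start):
    """Breadth-first search: map each node reachable from start to its hop count."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def joint_toxic_links(graph, text_node, image_node, toxic):
    """Return {toxic concept: total hops} for toxic concepts that BOTH elements reach."""
    d_text = distances(graph, text_node)
    d_image = distances(graph, image_node)
    shared = d_text.keys() & d_image.keys() & toxic
    return {concept: d_text[concept] + d_image[concept] for concept in shared}


flagged = joint_toxic_links(GRAPH, "text:phrase_a", "image:object_b", TOXIC)
benign = joint_toxic_links(GRAPH, "text:phrase_a", "image:object_c", TOXIC)
```

In this toy graph the first pair converges on a shared toxic concept and the second does not, so only the first is flagged. The paths found by the search double as a simple explanation of why a pair was flagged, which loosely mirrors the interpretability the summary describes.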