Zhongxiang Sun, Yi Zhan, Chenglei Shen, Weijie Yu, Xiao Zhang, Ming He, Jun Xu
This paper identifies the problem of personalized language models producing factually incorrect answers as a side effect of personalization, and proposes a method that maintains factual accuracy while preserving personalized behavior.
Personalized language models tailor responses to individual users to improve their experience. However, this personalization can induce hallucinations, in which the model answers according to a user's history rather than according to the facts, potentially spreading false information and misleading users. To address this, the researchers develop Factuality-Preserving Personalized Steering (FPPS), a method that restores factual accuracy without sacrificing personalization. They also introduce a new benchmark that evaluates both factual and personalized responses, and show that FPPS improves factual reliability while keeping personalization intact.
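The summary does not describe how FPPS operates internally. As a rough illustration of what "steering" typically means in this setting, the sketch below shows a generic activation-steering scheme: a "factuality" direction is estimated from hidden activations and added to a hidden state at generation time. This is a minimal, hypothetical example, not the paper's actual algorithm; all names (hidden_dim, factual_acts, personalized_acts, alpha) are assumptions introduced here for illustration.

```python
# Generic activation-steering sketch (hypothetical; not the paper's FPPS method).
import torch

hidden_dim = 768

# Stand-ins for hidden activations collected from factually correct answers
# and from personalization-induced (hallucinated) answers, respectively.
factual_acts = torch.randn(100, hidden_dim)
personalized_acts = torch.randn(100, hidden_dim)

# Steering direction: from "personalized but wrong" toward "factual".
steer = factual_acts.mean(dim=0) - personalized_acts.mean(dim=0)
steer = steer / steer.norm()

def steer_hidden_state(h: torch.Tensor, alpha: float = 2.0) -> torch.Tensor:
    """Shift a hidden state toward the factual direction while leaving the
    rest of the (personalized) representation unchanged."""
    return h + alpha * steer

# Example: steering a single token's hidden state during decoding.
h = torch.randn(hidden_dim)
h_steered = steer_hidden_state(h)
```

In practice such a vector would be injected into a model's residual stream (e.g., via a forward hook) during generation; the scaling factor alpha trades off factual correction against how much of the personalized representation is preserved.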