Not All Negative Samples Are Equal: LLMs Learn Better from Plausible Reasoning
Zixiang Di, Jinyi Han et al.
TLDR: Plausible Negative Samples (PNS), high-quality incorrect responses generated for training, improve the reasoning capabilities of Large Language Models.